🎙️ @azure/ai-voicelive (JavaScript/TypeScript)


Build Real-Time Voice AI Apps with Azure

This skill provides comprehensive documentation and code examples for building real-time voice AI applications using the Azure AI Voice Live SDK. It enables developers to create voice assistants, conversational AI, and speech-to-speech applications in JavaScript and TypeScript.

Supports: Claude Code (CC), Codex
📊 Quality score: 70 (Adequate)
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "@azure/ai-voicelive (JavaScript/TypeScript)". How do I create a VoiceLiveClient with DefaultAzureCredential?

Expected outcome:

import { DefaultAzureCredential } from '@azure/identity';
import { VoiceLiveClient } from '@azure/ai-voicelive';

const credential = new DefaultAzureCredential();
const endpoint = process.env.AZURE_VOICELIVE_ENDPOINT!;
const client = new VoiceLiveClient(endpoint, credential);

Using "@azure/ai-voicelive (JavaScript/TypeScript)". What voice options are available?

Expected outcome:

Azure Standard voices (en-US-AvaNeural, etc.), Azure Custom voices with endpoint ID, Azure Personal voices for speaker cloning, and OpenAI voices (alloy, echo, shimmer).
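The voice families above can be modeled as a discriminated union. This is a minimal sketch for illustration; the property names (`type`, `name`, `endpointId`) are assumptions, not the SDK's published types:

```typescript
// Local stand-in types for the voice families described above.
type AzureStandardVoice = { type: 'azure-standard'; name: string };
type AzureCustomVoice = { type: 'azure-custom'; name: string; endpointId: string };
type OpenAIVoice = { type: 'openai'; name: 'alloy' | 'echo' | 'shimmer' };
type VoiceConfig = AzureStandardVoice | AzureCustomVoice | OpenAIVoice;

// Pick a standard Azure neural voice; a custom voice would also carry its endpoint ID.
const voice: VoiceConfig = { type: 'azure-standard', name: 'en-US-AvaNeural' };
console.log(voice.name); // en-US-AvaNeural
```

Note that only the custom-voice variant requires an endpoint ID; the others are selected by name alone.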

Using "@azure/ai-voicelive (JavaScript/TypeScript)". How do I handle errors?

Expected outcome:

Use the onError handler in your subscription to catch VoiceLiveConnectionError, VoiceLiveAuthenticationError, and VoiceLiveProtocolError types.
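A hedged sketch of routing those errors inside an onError handler: the error class names come from this skill's description, and the local classifier below keys on `error.name` as a stand-in for `instanceof` checks against the real SDK classes.

```typescript
// Classify a VoiceLive error so each category gets its own recovery path.
function classifyVoiceLiveError(err: Error): 'connection' | 'auth' | 'protocol' | 'unknown' {
  switch (err.name) {
    case 'VoiceLiveConnectionError': return 'connection';     // e.g. retry with backoff
    case 'VoiceLiveAuthenticationError': return 'auth';       // refresh credentials
    case 'VoiceLiveProtocolError': return 'protocol';         // log and surface to caller
    default: return 'unknown';
  }
}

const err = new Error('socket closed');
err.name = 'VoiceLiveConnectionError';
console.log(classifyVoiceLiveError(err)); // connection
```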

Security Audit

Safe
v1 • 2/24/2026

This is a documentation-only skill containing guidance for using the Azure AI Voice Live SDK. No executable code was detected. The skill provides usage examples for a legitimate Azure service. No security concerns identified.

Files scanned: 0 · Lines analyzed: 0 · Findings: 0 · Total audits: 1
No security issues found
Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 100
  • Content: 87
  • Community: 32
  • Security: 100
  • Spec Compliance: 83

What You Can Build

Build Voice Assistants

Create interactive voice assistants that can understand speech, respond with AI-generated audio, and handle multi-turn conversations.

Real-Time Transcription

Implement live speech-to-text transcription with low latency for customer service, accessibility, or documentation applications.

Conversational Chatbots

Build voice-enabled chatbots that can have natural spoken conversations with users using GPT models.

Try These Prompts

Basic Voice Client Setup
Show me how to set up a basic VoiceLiveClient using Microsoft Entra ID authentication in TypeScript.
Session Configuration
Configure a voice session with text and audio modalities, custom instructions, and Azure Semantic VAD turn detection.
Event Handling
Implement event handlers for streaming audio delta, text delta, and transcription events using the subscription pattern.
Function Calling
Set up function calling tools in the session configuration and handle function call events to integrate external APIs.
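For the function-calling prompt above, a tool definition might look like the following sketch. The field names follow the common realtime-API shape (`name`, `description`, JSON Schema `parameters`) and are assumptions, not verified SDK types; `get_weather` is a hypothetical tool.

```typescript
// A hypothetical tool the model can call, declared in the session configuration.
const weatherTool = {
  type: 'function',
  name: 'get_weather',
  description: 'Look up current weather for a city',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
} as const;

console.log(weatherTool.name); // get_weather
```

When a function-call event arrives, your handler would run the matching function and send its result back to the session.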

Best Practices

  • Always use DefaultAzureCredential instead of hardcoding API keys for secure authentication
  • Use Azure Semantic VAD for better turn detection than basic server VAD
  • Clean up subscriptions by calling subscription.close() when done to prevent memory leaks
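The cleanup practice above can be sketched with a try/finally wrapper; `Subscription` here is a minimal local interface standing in for the SDK's subscription object.

```typescript
// Minimal stand-in for the SDK's subscription handle.
interface Subscription { close(): void }

// Ensure close() runs even if the work throws, preventing leaked connections.
function withSubscription(sub: Subscription, work: () => void): void {
  try {
    work();
  } finally {
    sub.close();
  }
}

let closed = false;
withSubscription({ close: () => { closed = true; } }, () => {});
console.log(closed); // true
```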

Avoid

  • Hardcoding API keys directly in source code instead of using environment variables or Entra ID
  • Not handling connection, authentication, and protocol errors separately
  • Setting only the audio modality without text, which breaks many conversational features
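To avoid the audio-only pitfall, request both modalities in the session configuration. This is a sketch; the `modalities` and `instructions` field names are assumptions modeled on common realtime-API configs.

```typescript
// Request both text and audio so conversational features keep working.
const sessionConfig = {
  modalities: ['text', 'audio'] as const,
  instructions: 'You are a friendly voice assistant.',
};

console.log(sessionConfig.modalities.join(',')); // text,audio
```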

Frequently Asked Questions

What authentication methods are supported?
Microsoft Entra ID (recommended) and API key authentication using AzureKeyCredential.
What environments are supported?
Node.js LTS (20+) and modern browsers (Chrome, Firefox, Safari, Edge) with a bundler.
What audio formats are supported?
PCM16 at 24kHz (default), PCM16-8kHz, PCM16-16kHz, G711 ulaw, and G711 alaw.
How does turn detection work?
Server VAD uses voice activity detection. Azure Semantic VAD uses AI to understand conversational context for better turn-taking.
Can I use custom voices?
Yes, Azure Custom voices with endpoint ID and Azure Personal voices for speaker profile cloning are supported.
What models are supported?
GPT-4o-realtime-preview, GPT-4o-mini-realtime-preview, and phi4-mm-realtime for cost-effective applications.
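Since PCM16 at 24kHz is the default format, browser apps typically need to convert Web Audio's Float32 samples (range -1..1) into 16-bit integers before sending audio. This conversion is standard and independent of the SDK:

```typescript
// Convert Float32 audio samples (-1..1) to 16-bit signed PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range samples
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;        // scale to int16 range
  }
  return out;
}

console.log(floatTo16BitPCM(new Float32Array([0, 1, -1]))); // values: 0, 32767, -32768
```

Resampling to 24 kHz (or one of the other listed rates) would still be needed separately when the capture rate differs.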

Developer Details

File structure

📄 SKILL.md