
Azure AI VoiceLive SDK for Java


Build Real-Time Voice Apps with Azure AI

Developers need a way to integrate real-time bidirectional voice conversations into Java applications. This skill provides complete code examples and best practices for implementing voice AI using Azure AI VoiceLive SDK with WebSocket streaming.

Supports: Claude Code (CC)
Overall quality score: 69 (Adequate)
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle it on and start using it

Test it

Using "Azure AI VoiceLive SDK for Java". Create a VoiceLive client with DefaultAzureCredential

Expected outcome:

```java
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildAsyncClient();
```

Using "Azure AI VoiceLive SDK for Java". Configure turn detection for natural conversation

Expected outcome:

```java
ServerVadTurnDetection turnDetection = new ServerVadTurnDetection()
    .setThreshold(0.5)
    .setPrefixPaddingMs(300)
    .setSilenceDurationMs(500)
    .setInterruptResponse(true)
    .setAutoTruncate(true);
```
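To build intuition for what the threshold and silence-duration settings control, here is a purely illustrative, self-contained energy-based voice activity sketch (the class and method names are hypothetical; this is not the SDK's server-side implementation):

```java
public class VadSketch {
    // Returns true once RMS energy stays below `threshold` for at least
    // silenceDurationMs, i.e. the speaker's turn is considered ended.
    static boolean turnEnded(short[] samples, int sampleRate,
                             double threshold, int silenceDurationMs) {
        int frameSize = sampleRate / 100;          // 10 ms frames
        int framesNeeded = silenceDurationMs / 10; // consecutive silent frames
        int silentRun = 0;
        for (int start = 0; start + frameSize <= samples.length; start += frameSize) {
            double sumSq = 0;
            for (int i = start; i < start + frameSize; i++) {
                double s = samples[i] / 32768.0;   // normalize to [-1, 1]
                sumSq += s * s;
            }
            double rms = Math.sqrt(sumSq / frameSize);
            silentRun = (rms < threshold) ? silentRun + 1 : 0;
            if (silentRun >= framesNeeded) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        short[] silence = new short[24_000];       // 1 s of silence at 24 kHz
        System.out.println(turnEnded(silence, 24_000, 0.5, 500)); // prints true
    }
}
```

A lower threshold makes the detector more sensitive to quiet speech; a longer silence duration tolerates longer pauses before ending the turn.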

Security Audit

Safe
v1 • 2/24/2026

Prompt-only documentation skill containing code examples for the Azure AI VoiceLive SDK. Static analysis scanned 0 files and found 0 potential security issues. Risk score: 0/100. No suspicious patterns detected. The skill provides legitimate documentation for a Microsoft Azure service and contains no executable code.

  • Files scanned: 0
  • Lines analyzed: 0
  • Findings: 0
  • Total audits: 1
No security issues found
Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 100
  • Content: 87
  • Community: 31
  • Security: 100
  • Spec Compliance: 74

What You Can Build

Customer Service Voice Bot

Build an interactive voice assistant for customer support that handles inquiries in real-time using natural speech

Accessibility Tool Development

Create voice-enabled applications for users who prefer voice interaction over text-based interfaces

IoT Voice Control Interface

Implement voice control for IoT devices with low-latency bidirectional communication

Try These Prompts

Basic Voice Client Setup
Show me how to set up a basic VoiceLiveAsyncClient in Java with API key authentication using the Azure AI VoiceLive SDK.
Configure Voice Session
How do I configure VoiceLiveSessionOptions with turn detection, voice selection, and audio format settings for a natural conversation flow?
Handle Voice Events
Write Java code to handle voice events including speech start/stop detection, audio delta streaming, and error handling in the VoiceLive session.
Implement Function Calling
Show me how to integrate function calling with VoiceLive to enable the AI assistant to execute real actions like weather lookups during conversation.

Best Practices

  • Use DefaultAzureCredential instead of API keys for production deployments to leverage Azure managed identities
  • Configure ServerVadTurnDetection with appropriate threshold and silence duration to match your use case for natural conversation flow
  • Always implement proper error handling and reconnection logic for production voice applications
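The reconnection advice above is commonly implemented with capped exponential backoff. A minimal sketch, assuming you rebuild the VoiceLive session after each wait (the class and method names here are illustrative, not SDK APIs):

```java
public class ReconnectBackoff {
    // Capped exponential backoff: 1 s, 2 s, 4 s, ... up to maxDelayMs.
    static long delayForAttempt(int attempt, long baseDelayMs, long maxDelayMs) {
        long delay = baseDelayMs << Math.min(attempt, 20); // clamp shift to avoid overflow
        return Math.min(delay, maxDelayMs);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 5; attempt++) {
            long delay = delayForAttempt(attempt, 1_000, 30_000);
            System.out.println("attempt " + attempt + " waits " + delay + " ms");
            // In a real client you would Thread.sleep(delay) here,
            // then rebuild the WebSocket session before retrying.
        }
    }
}
```

Adding random jitter to each delay is also a common refinement to avoid thundering-herd reconnects.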

Avoid

  • Do not hardcode API keys in source code - use environment variables or Azure Key Vault instead
  • Avoid blocking calls in reactive streams - use non-blocking patterns throughout
  • Do not skip audio format validation - ensure input matches 24kHz 16-bit PCM requirements
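The 24 kHz 16-bit PCM requirement from the last bullet can be expressed and checked with the standard javax.sound.sampled API before streaming. The format values come from the VoiceLive requirements stated above; the class and helper names are illustrative:

```java
import javax.sound.sampled.AudioFormat;

public class VoiceLiveAudioFormat {
    // 24 kHz sample rate, 16-bit, mono, signed PCM, little-endian (bigEndian = false)
    static AudioFormat required() {
        return new AudioFormat(24_000f, 16, 1, true, false);
    }

    // Validate that a capture line's format matches the VoiceLive input requirements.
    static boolean matches(AudioFormat f) {
        return f.getSampleRate() == 24_000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && !f.isBigEndian();
    }

    public static void main(String[] args) {
        System.out.println(matches(required())); // prints true
    }
}
```

Running this check against your microphone line's actual format catches mismatches before they surface as garbled audio in the session.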

Frequently Asked Questions

What audio format does Azure AI VoiceLive require?
Azure AI VoiceLive requires 24kHz sample rate, 16-bit PCM, mono channel, signed PCM little-endian format.
How do I authenticate with Azure AI VoiceLive?
You can use AzureKeyCredential with API key, or DefaultAzureCredential for managed identity support in production.
What voices are available for Azure AI VoiceLive?
The SDK supports OpenAI voices (ALLOY, ASH, BALLAD, CORAL, ECHO, SAGE, SHIMMER, VERSE) and Azure voices including Standard, Custom, and Personal voices.
Can I use function calling with VoiceLive?
Yes, you can define functions using VoiceLiveFunctionDefinition and pass them via setTools() in VoiceLiveSessionOptions.
How does turn detection work in VoiceLive?
ServerVadTurnDetection uses voice activity detection to automatically detect when the user starts and stops speaking, with configurable threshold and silence duration.
What is the difference between TEXT and AUDIO modalities?
The TEXT modality sends and receives text; the AUDIO modality sends and receives audio. You can combine both using Arrays.asList(InteractionModality.TEXT, InteractionModality.AUDIO).
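Since audio deltas arrive as little-endian signed 16-bit PCM bytes (per the format in the first FAQ answer), decoding them into samples in Java is a ByteBuffer one-liner. The class and helper names below are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmDecode {
    // Decode little-endian signed 16-bit PCM bytes into 16-bit samples.
    static short[] toSamples(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        ByteBuffer.wrap(pcm)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        // Sample value 1 is bytes {0x01, 0x00} in little-endian; -1 is {0xFF, 0xFF}
        byte[] pcm = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF};
        short[] s = toSamples(pcm);
        System.out.println(s[0] + " " + s[1]); // prints 1 -1
    }
}
```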

Developer Details

File structure

📄 SKILL.md