Skill: Azure AI VoiceLive SDK for Java

Secure

Build real-time voice applications with Azure AI

Developers need a way to integrate real-time, two-way voice conversations into Java applications. This skill provides complete code examples and best practices for building voice AI with the Azure AI VoiceLive SDK and WebSocket streaming.

Supported: Claude Code (CC)
🥉 72 Bronze
1

Download the skill ZIP

2

Upload it in Claude

Go to Settings → Capabilities → Skills → Upload skill

3

Enable it and start using it
Test it

Using "Azure AI VoiceLive SDK for Java". Create a VoiceLive client with DefaultAzureCredential

Expected result:

VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildAsyncClient();

Using "Azure AI VoiceLive SDK for Java". Configure turn detection for natural conversation

Expected result:

ServerVadTurnDetection turnDetection = new ServerVadTurnDetection()
    .setThreshold(0.5)
    .setPrefixPaddingMs(300)
    .setSilenceDurationMs(500)
    .setInterruptResponse(true)
    .setAutoTruncate(true);

Security Audit

Secure
v1 • 2/24/2026

Prompt-only documentation skill containing code examples for the Azure AI VoiceLive SDK. Static analysis scanned 0 files and found 0 potential security issues. Risk score: 0/100. No suspicious patterns detected. The skill provides legitimate documentation for a Microsoft Azure service and contains no executable code.

0
Files scanned
0
Lines analyzed
0
Findings
1
Total audits
No security issues found
Auditor: claude

Quality Scores

Architecture: 38
Maintainability: 100
Content: 87
Community: 50
Security: 100
Spec compliance: 74

What You Can Build

Customer service voice bots

Build interactive voice assistants for customer support that handle inquiries in real time with natural speech

Accessibility tooling

Create voice-enabled applications for users who prefer speech interaction over text interfaces

IoT voice control interfaces

Implement low-latency, two-way voice control for IoT devices

Try These Prompts

Basic Voice Client Setup
Show me how to set up a basic VoiceLiveAsyncClient in Java with API key authentication using the Azure AI VoiceLive SDK.
Configure Voice Session
How do I configure VoiceLiveSessionOptions with turn detection, voice selection, and audio format settings for a natural conversation flow?
Handle Voice Events
Write Java code to handle voice events including speech start/stop detection, audio delta streaming, and error handling in the VoiceLive session.
Implement Function Calling
Show me how to integrate function calling with VoiceLive to enable the AI assistant to execute real actions like weather lookups during conversation.
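The first prompt above asks for API-key authentication. A minimal sketch of what an answer might look like, assuming the builder accepts an `AzureKeyCredential` via `credential(...)` and using `AZURE_VOICELIVE_API_KEY` as an illustrative variable name (only `VoiceLiveClientBuilder`, `buildAsyncClient`, and `AZURE_VOICELIVE_ENDPOINT` appear elsewhere on this page; the rest is an assumption, not confirmed API):

```java
import com.azure.ai.voicelive.VoiceLiveAsyncClient;
import com.azure.ai.voicelive.VoiceLiveClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

final class VoiceLiveSetup {
    static VoiceLiveAsyncClient createClient() {
        // Read endpoint and key from the environment -- never hardcode them.
        // AZURE_VOICELIVE_API_KEY is an illustrative name, not a documented one.
        String endpoint = System.getenv("AZURE_VOICELIVE_ENDPOINT");
        String apiKey = System.getenv("AZURE_VOICELIVE_API_KEY");

        return new VoiceLiveClientBuilder()
                .endpoint(endpoint)
                .credential(new AzureKeyCredential(apiKey)) // assumed overload
                .buildAsyncClient();
    }
}
```

For production, swap the `AzureKeyCredential` for `DefaultAzureCredentialBuilder().build()` as shown in the expected-result snippet earlier on this page.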

Best Practices

  • Use DefaultAzureCredential instead of API keys for production deployments to leverage Azure managed identities
  • Configure ServerVadTurnDetection with appropriate threshold and silence duration to match your use case for natural conversation flow
  • Always implement proper error handling and reconnection logic for production voice applications
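The last practice above calls for reconnection logic. One common, SDK-independent building block is an exponential backoff schedule; the `Backoff` class below is a hypothetical helper sketch, not part of the SDK:

```java
import java.time.Duration;

/** Exponential backoff schedule for WebSocket reconnection attempts.
 *  Hypothetical helper -- not part of the Azure AI VoiceLive SDK. */
final class Backoff {
    private final long baseMillis;
    private final long maxMillis;

    Backoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Delay before the given retry attempt (0-based): base * 2^attempt, capped at max. */
    Duration delayFor(int attempt) {
        long delay = baseMillis << Math.min(attempt, 30); // cap the shift to avoid overflow
        return Duration.ofMillis(Math.min(delay, maxMillis));
    }
}
```

Usage: `new Backoff(250, 10_000).delayFor(3)` yields a 2-second delay; attempts beyond the cap all wait 10 seconds, so a flapping connection cannot hammer the service.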

Avoid

  • Do not hardcode API keys in source code - use environment variables or Azure Key Vault instead
  • Avoid blocking calls in reactive streams - use non-blocking patterns throughout
  • Do not skip audio format validation - ensure input matches 24kHz 16-bit PCM requirements
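The audio-format rule above can be enforced with the standard `javax.sound.sampled` API before any bytes are streamed; the `AudioFormatCheck` helper below is an illustrative sketch, not part of the SDK:

```java
import javax.sound.sampled.AudioFormat;

/** Validates a capture format against the documented VoiceLive input requirements:
 *  24 kHz, 16-bit, mono, signed PCM, little-endian.
 *  Illustrative helper -- not part of the Azure AI VoiceLive SDK. */
final class AudioFormatCheck {
    static final AudioFormat REQUIRED = new AudioFormat(
            AudioFormat.Encoding.PCM_SIGNED,
            24_000f, // sample rate (Hz)
            16,      // bits per sample
            1,       // channels (mono)
            2,       // frame size in bytes (16-bit mono)
            24_000f, // frame rate
            false);  // little-endian

    static boolean matches(AudioFormat candidate) {
        return candidate.matches(REQUIRED);
    }
}
```

Running this check when opening the microphone line fails fast on mismatched hardware defaults (e.g. 44.1 kHz stereo) instead of producing garbled audio mid-session.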

Frequently Asked Questions

What audio format does Azure AI VoiceLive require?
Azure AI VoiceLive requires 24kHz sample rate, 16-bit PCM, mono channel, signed PCM little-endian format.
How do I authenticate with Azure AI VoiceLive?
You can use AzureKeyCredential with API key, or DefaultAzureCredential for managed identity support in production.
What voices are available for Azure AI VoiceLive?
The SDK supports OpenAI voices (ALLOY, ASH, BALLAD, CORAL, ECHO, SAGE, SHIMMER, VERSE) and Azure voices including Standard, Custom, and Personal voices.
Can I use function calling with VoiceLive?
Yes, you can define functions using VoiceLiveFunctionDefinition and pass them via setTools() in VoiceLiveSessionOptions.
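The answer above might look like the following sketch, assuming `VoiceLiveFunctionDefinition` takes the function name in its constructor and exposes `setDescription`/`setParameters` setters, and that parameters are passed as a JSON-schema `BinaryData` (those signatures and the `models` package path are assumptions, not confirmed API; only `VoiceLiveFunctionDefinition`, `setTools()`, and `VoiceLiveSessionOptions` are named on this page):

```java
import java.util.List;

import com.azure.ai.voicelive.models.VoiceLiveFunctionDefinition; // assumed package
import com.azure.ai.voicelive.models.VoiceLiveSessionOptions;     // assumed package
import com.azure.core.util.BinaryData;

final class FunctionCallingSketch {
    static VoiceLiveSessionOptions withWeatherTool(VoiceLiveSessionOptions options) {
        // JSON Schema describing the function's parameters (assumed convention).
        BinaryData parameters = BinaryData.fromString(
                "{\"type\":\"object\",\"properties\":{"
              + "\"city\":{\"type\":\"string\"}},\"required\":[\"city\"]}");

        VoiceLiveFunctionDefinition weather =
                new VoiceLiveFunctionDefinition("get_weather") // assumed constructor
                        .setDescription("Look up current weather for a city") // assumed setter
                        .setParameters(parameters);                           // assumed setter

        return options.setTools(List.of(weather));
    }
}
```

At runtime your event handler would detect a function-call event, execute the real lookup, and send the result back into the session so the assistant can speak it.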
How does turn detection work in VoiceLive?
ServerVadTurnDetection uses voice activity detection to automatically detect when the user starts and stops speaking, with configurable threshold and silence duration.
What is the difference between TEXT and AUDIO modalities?
TEXT modality sends/receives text, AUDIO modality sends/receives audio. You can combine both using Arrays.asList(InteractionModality.TEXT, InteractionModality.AUDIO).

Developer Details

File Structure

📄 SKILL.md