ai-avatar-video
Create AI Avatar and Talking Head Videos
Also available from: doany-ai, qu-skills, inference-sh-skills, infsh-skills, agentspace-so, inference-sh, skills-shell, runcomfy-com
Creating professional AI avatar videos traditionally requires complex video editing or expensive SaaS platforms. This skill provides a unified interface to generate talking head videos from images, audio, or text scripts using the inference.sh CLI.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "ai-avatar-video". Portrait image of a professional + script: 'Welcome to our quarterly review...'
Expected outcome:
A downloadable video showing the portrait image with realistic lip movements synchronized to the generated speech audio.
Using "ai-avatar-video". Portrait image + existing audio file of a speech
Expected outcome:
A talking head video where the person in the image appears to deliver the speech with natural facial movements and accurate lip synchronization.
Using "ai-avatar-video". Original training video + translated Spanish audio
Expected outcome:
A version of the training video in which the original presenter appears to speak the translated Spanish audio with accurate lip sync.
Security Audit
Safe
Documentation skill for AI video generation via inference.sh CLI. All static findings are false positives. The external_commands (29) are example CLI commands in code blocks demonstrating belt tool usage. The network URLs (20) reference the inference.sh service API endpoints and documentation. The weak_crypto flag (1) is a false positive triggered by YAML frontmatter text mentioning 'algorithm'. No malicious code, command injection, or data exfiltration patterns present.
High Risk Issues (1)
Medium Risk Issues (1)
Low Risk Issues (1)
What You Can Build
Product Demo Videos
Create engaging product demonstrations with an AI presenter. Upload a professional portrait and script your talking points - the avatar delivers your message with natural lip synchronization.
Training Content Localization
Translate training videos into multiple languages. Transcribe the original, translate the script, generate new audio in the target language, and lip-sync that audio to your presenter for consistent global training materials.
Social Media Content Creation
Produce consistent avatar content for social channels. Generate talking head videos from portrait images with AI-generated voices, reducing video production costs and turnaround time.
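The localization workflow described under Training Content Localization is a four-stage chain: transcribe, translate, synthesize audio, then lip-sync. The sketch below only illustrates how those stages feed into each other. It does not call the inference.sh CLI; `LocalizationSteps`, `localize_video`, and the four callables are hypothetical names, and each callable stands in for whatever tool or model you use for that stage.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LocalizationSteps:
    """Hypothetical container for the four stages; each is supplied by the caller."""
    transcribe: Callable[[str], str]      # video path -> transcript text
    translate: Callable[[str, str], str]  # (text, target language) -> translated text
    synthesize: Callable[[str], str]      # translated text -> audio path
    lipsync: Callable[[str, str], str]    # (video path, audio path) -> output video path


def localize_video(video: str, language: str, steps: LocalizationSteps) -> str:
    """Run the stages in order, passing each result to the next stage."""
    transcript = steps.transcribe(video)
    translated = steps.translate(transcript, language)
    audio = steps.synthesize(translated)
    # The original video supplies the visuals; the new audio drives the lip sync.
    return steps.lipsync(video, audio)
```

Passing the stages in as callables keeps the ordering logic separate from any particular model or command, so one stage can be swapped without touching the others.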
Try These Prompts
Generate an avatar video using a portrait image with a text script and AI voice
Create a talking head video that syncs an existing portrait to a provided audio file
Transcribe, translate, and create a lip-synced avatar version of a video in a target language
Generate a portrait image first, then create an avatar video from that portrait with TTS
Best Practices
- Use high-quality, front-facing portrait photos with clear visibility of the face and good lighting for best results
- Generate audio with clear speech and minimal background noise before creating avatar videos
- Use P-Video-Avatar for the best balance of speed, cost, and quality - it includes built-in TTS and 1080p output
Avoid
- Do not use low-quality or heavily filtered portrait images - avatar lip sync quality depends on input image clarity
- Do not use audio with significant background noise - this degrades lip sync accuracy
- Do not skip the audio generation step when using models without built-in TTS (OmniHuman, PixVerse)
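The last point above can be expressed as a small planning check. This is a minimal sketch that assumes only what this page states: P-Video-Avatar includes built-in TTS, while OmniHuman and PixVerse do not. The `BUILT_IN_TTS` table, the `plan_steps` function, and the step names are hypothetical and are not part of the inference.sh CLI.

```python
# Capability table taken from the guidance on this page (not from an API).
BUILT_IN_TTS = {
    "P-Video-Avatar": True,
    "OmniHuman": False,
    "PixVerse": False,
}


def plan_steps(model: str, has_audio: bool) -> list[str]:
    """Return the ordered steps needed to produce an avatar video.

    has_audio is True when the caller already has a speech audio file.
    """
    if model not in BUILT_IN_TTS:
        raise ValueError(f"unknown model: {model}")
    steps = []
    # Models without built-in TTS need speech audio generated first
    # when only a text script is available.
    if not has_audio and not BUILT_IN_TTS[model]:
        steps.append("generate_audio")
    steps.append("generate_avatar_video")
    return steps
```

For example, `plan_steps("OmniHuman", has_audio=False)` includes the audio step, while the same call with `"P-Video-Avatar"` goes straight to video generation.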