image-to-video

Low Risk · 🌐 Network access · ⚙️ External commands

Transform static images into AI-animated videos

Also available from: inference-sh-9, doany-ai, agentspace-so

Creating videos from images requires selecting the right AI model for your use case. This skill intelligently routes your request to HappyHorse for portraits, Wan 2.7 for lip-sync, or Seedance for multi-modal animations.
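The routing rule above can be sketched in a few lines. This is only an illustration of the decision described in the text, not the skill's actual logic: the `Request` fields and the `pick_model` function are hypothetical names, and the exact conditions (for example, treating any reference video as a multi-modal request) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of what the user attached to a request."""
    image_count: int = 1   # still images supplied
    video_refs: int = 0    # reference video clips supplied
    audio_refs: int = 0    # audio files supplied


def pick_model(req: Request) -> str:
    """Return a model family name following the routing rule in the text."""
    if req.image_count < 1:
        raise ValueError("image-to-video needs at least one image")
    # A reference video, several images, or several audio tracks: multi-modal.
    if req.video_refs > 0 or req.image_count > 1 or req.audio_refs > 1:
        return "Seedance"
    # One image plus one audio track: lip-sync to the voiceover.
    if req.audio_refs == 1:
        return "Wan 2.7"
    # A single image with no other references: plain animation.
    return "HappyHorse"


print(pick_model(Request()))                            # HappyHorse
print(pick_model(Request(audio_refs=1)))                # Wan 2.7
print(pick_model(Request(video_refs=1, audio_refs=1)))  # Seedance
```

Checking the multi-modal case first keeps a request with both a scene clip and a voice sample from being misrouted to the lip-sync model.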

Supports: Claude, Codex, Code (CC)
📊 70 Adequate
1. Download the skill ZIP.
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill.
3. Toggle the skill on and start using it.

Test it

Using "image-to-video". Animate this portrait photo with a slow dolly-in camera movement

Expected outcome:

A 5-second video clip of the portrait with smooth camera push-in, subtle facial expression changes, identity-preserved features, and soft natural lighting evolution.

Using "image-to-video". Create a talking-head video synced to this voiceover [attached audio file]

Expected outcome:

A 12-second vertical video of a spokesperson with precise lip-sync to the provided audio, medium close-up framing, and professional studio lighting.

Using "image-to-video". Make a brand video using this character image, this scene clip, and this voice sample

Expected outcome:

An 8-second video where the character from the image performs actions from the scene reference while speaking with the voice from the audio reference, maintaining visual consistency.

Security Audit

Low Risk
v1 • 5/31/2026

All 117 static findings are false positives. The skill is a markdown documentation file explaining how to use the RunComfy CLI for AI video generation. External command detections are example bash commands in fenced code blocks, not actual shell executions. Path traversal and filesystem detections are placeholder tokens and legitimate config paths. Network detections are service URLs, not data exfiltration. Security measures are properly documented with token storage at mode 0600, HTTPS transmission, and download size limits.

Files scanned: 1
Lines analyzed: 206
Findings: 7
Total audits: 1
Medium Risk Issues (1)
API Token Storage on Filesystem
The RunComfy CLI stores API tokens in ~/.config/runcomfy/token.json with mode 0600 permissions. While this is a legitimate pattern for CLI tools, it does involve filesystem access for credential storage.
Low Risk Issues (4)
Hardcoded Service URLs
Documentation contains hardcoded URLs to RunComfy service endpoints (runcomfy.com, model-api.runcomfy.net). These are legitimate service references, not data exfiltration.
False Positive: External Command Detection in Documentation
The 91 'external_commands' detections are all in fenced bash code blocks (```bash) within the markdown file. These are example CLI commands in documentation, not actual shell executions.
False Positive: Path Traversal Detection on Placeholder Tokens
Path traversal sequences were detected in <absolute/path> and similar placeholder tokens. These are documentation placeholders showing users where to supply their own paths, not actual path traversal vulnerabilities.
False Positive: Weak Cryptographic Algorithm Detection
The static scanner flagged 'weak cryptographic algorithm' on patterns like '0600' (file permissions) and '1.0' (version numbers). These are false positives triggered by numeric patterns in documentation.
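For readers unfamiliar with the token-storage pattern named in the medium-risk finding, here is a minimal sketch of writing a credentials file with mode 0600 (owner read/write only). It is not the RunComfy CLI's code: the `write_token` helper and the `{"token": ...}` JSON layout are assumptions, and the demo writes to a temporary directory rather than to ~/.config/runcomfy/token.json. The permission bits behave as described on POSIX systems.

```python
import json
import os
import stat
import tempfile


def write_token(path: str, token: str) -> None:
    """Write a token file that only the owning user can read or write."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Passing the mode to os.open applies it at creation time, so a new
    # file never exists with broader permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump({"token": token}, fh)  # hypothetical file layout
    # Re-apply the mode in case the file already existed with other bits.
    os.chmod(path, 0o600)


# Demo in a temporary directory instead of the real config location.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "runcomfy", "token.json")
write_token(demo_path, "example-token")
print(oct(stat.S_IMODE(os.stat(demo_path).st_mode)))
```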

Risk Factors

🌐 Network access (12)
⚙️ External commands (91)
Audited by: claude

Quality Score

Architecture: 38
Maintainability: 100
Content: 87
Community: 50
Security: 77
Spec Compliance: 91

What You Can Build

Animate product photos for e-commerce

Bring product images to life with smooth camera movements and subtle animations. HappyHorse preserves product geometry and packaging while adding cinematic motion.

Create talking-head videos with custom voiceovers

Generate spokesperson videos with lip-sync to your pre-recorded audio tracks. Perfect for localized marketing content with voice talent in any language.

Produce brand-consistent narrative videos

Combine a character image, a scene from a reference video, and a voice from reference audio into a unified clip. Maintains identity consistency across brand assets.

Try These Prompts

Simple portrait animation
Animate this portrait image with gentle camera movement around the subject's face, subtle breathing motion, and soft natural lighting.
Product showcase animation
Create a 360-style reveal animation for this product shot with smooth orbital camera movement, maintaining packaging and branding integrity.
Voiceover lip-sync video
Generate a spokesperson clip for 9:16 format that syncs lip movements to this audio track: [attach audio], with medium close-up framing and warm lighting.
Multi-modal brand video
Create a branded narrative clip using [subject image], [scene video reference], and [voice audio reference]. Subject from image 1 walks through the scene from video 1 with voice from audio 1.

Best Practices

  • Lead prompts with motion verbs: drift, orbit, dolly in, reveal, blink. Front-load what is moving in the scene.
  • Match duration to audio length when using lip-sync. The video will be silent past the audio duration if the clip is longer.
  • Use explicit preservation keywords: identity-stable, packaging unchanged, background geometry stable when you need specific elements to remain consistent.
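The three practices above can be folded into a small helper. This is a sketch under stated assumptions: `build_prompt` and `clip_duration` are hypothetical names, the comma-separated layout is one reasonable way to order a prompt, and the 15-second cap is taken from the maximum duration stated in the FAQ on this page.

```python
def build_prompt(motion: str, details: list[str], preserve: list[str]) -> str:
    """Put the motion clause first, then details, then preservation keywords."""
    parts = [motion] + details + preserve
    return ", ".join(p.strip() for p in parts if p.strip())


def clip_duration(audio_seconds: float, max_seconds: float = 15.0) -> float:
    """Match the clip length to the audio, capped at the model maximum."""
    return min(audio_seconds, max_seconds)


prompt = build_prompt(
    "slow dolly in on the subject",
    ["subtle breathing motion", "soft natural lighting"],
    ["identity-stable", "background geometry stable"],
)
print(prompt)
print(clip_duration(12.0))  # 12.0
print(clip_duration(20.0))  # 15.0
```

Leading with the motion clause follows the first practice, and sizing the clip from the audio length avoids a silent tail after the voiceover ends.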

Avoid

  • Do not restate the image content in prompts; the AI model sees the image. Focus tokens on what should change or animate.
  • Do not request HappyHorse animation plus Wan-style lip-sync in a single call; these require separate model calls.
  • Do not mix radically different aesthetics between reference image and reference video; output will drift from intended style.

Frequently Asked Questions

Which model should I use for portrait animation?
HappyHorse 1.0 I2V is recommended for portrait animation as it ranks #1 on the Artificial Analysis Arena and excels at facial fidelity and identity preservation.
How do I add custom voiceover to a video?
Use Wan 2.7 with the audio_url parameter. Provide a WAV or MP3 file that is between 3 and 30 seconds long and under 15MB. The model will drive lip-sync to match your audio.
Can I combine an image, reference video, and audio in one video?
Yes. Use Seedance 2.0 Pro, which accepts up to 9 image references, 3 video references (2-15s each), and 3 audio references for multi-modal composition.
What is the maximum video duration?
All models have a 15-second maximum duration. For longer content, you will need to generate multiple clips and stitch them together.
How do I ensure my brand logo stays consistent?
Use preservation keywords in your prompt such as branding unchanged, logo intact, or packaging stable. HappyHorse and Seedance both respond well to explicit preservation goals.
What image formats are supported?
JPEG, JPG, PNG, and WEBP formats are supported. Images must be at least 300 pixels in dimension, under 10MB, and have an aspect ratio between 1:2.5 and 2.5:1.
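The input limits quoted in the answers above can be checked locally before uploading anything. The sketch below is illustrative only: the function names are hypothetical, "MB" is assumed to mean 1024 × 1024 bytes, and "at least 300 pixels" is assumed to apply to the shorter side. The aspect-ratio test uses integer arithmetic (2.5 = 5/2) to avoid floating-point edge cases.

```python
import math

IMAGE_FORMATS = {"jpeg", "jpg", "png", "webp"}
AUDIO_FORMATS = {"wav", "mp3"}
MB = 1024 * 1024  # assumption: the page's "MB" means mebibytes


def check_image(ext: str, width: int, height: int, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    if ext.lower().lstrip(".") not in IMAGE_FORMATS:
        problems.append("unsupported image format")
    if min(width, height) < 300:  # assumption: limit applies to the shorter side
        problems.append("image smaller than 300 pixels")
    if size_bytes >= 10 * MB:
        problems.append("image is 10MB or larger")
    # width/height must lie between 1:2.5 and 2.5:1, i.e. between 2/5 and 5/2.
    if not (5 * width >= 2 * height and 2 * width <= 5 * height):
        problems.append("aspect ratio outside 1:2.5 to 2.5:1")
    return problems


def check_audio(ext: str, duration_s: float, size_bytes: int) -> list[str]:
    """Return a list of problems for a lip-sync audio file."""
    problems = []
    if ext.lower().lstrip(".") not in AUDIO_FORMATS:
        problems.append("unsupported audio format")
    if not 3 <= duration_s <= 30:
        problems.append("audio duration outside 3-30 seconds")
    if size_bytes >= 15 * MB:
        problems.append("audio is 15MB or larger")
    return problems


def clips_needed(total_seconds: float, max_clip_seconds: float = 15.0) -> int:
    """Number of clips to generate and stitch for content longer than one clip."""
    return max(1, math.ceil(total_seconds / max_clip_seconds))


print(check_image("png", 1024, 1024, 2 * MB))  # []
print(check_image("gif", 200, 1000, 2 * MB))   # three problems reported
print(check_audio("mp3", 12.0, 1 * MB))        # []
print(clips_needed(40))                        # 3
```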

Developer Details

File structure

📄 SKILL.md