modal ☁️

Run Python code in the cloud

Safe · ⚙️ External commands · 🌐 Network access · 📁 Filesystem access · 🔑 Env variables

Also available from: davila7

Modal is a serverless platform for running Python code in the cloud. It provides instant access to GPUs, automatic scaling, and pay-per-use billing. Deploy ML models, run batch processing jobs, and serve APIs without managing infrastructure.

Supports: Claude Code (CC), Codex
🥉 73 Bronze
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "modal". Deploy a Python function that summarizes text using a HuggingFace model on GPU

Expected outcome:

  • ✓ Created Modal app with L40S GPU access
  • ✓ Built container image with transformers and torch
  • ✓ Deployed web endpoint for text summarization
  • ✓ Endpoint available at https://your-app.modal.run

Using "modal". Run a batch job to process 1000 images in parallel

Expected outcome:

  • ✓ Created worker function with 4 CPU cores and 8GB memory
  • ✓ Configured parallel processing across 50 containers
  • ✓ Processed 1000 images in ~8 minutes
  • ✓ Results saved to Modal Volume at /data/output/

Using "modal". Schedule daily model retraining at midnight

Expected outcome:

  • ✓ Created scheduled function with cron expression '0 0 * * *'
  • ✓ Configured GPU (A100) for training computations
  • ✓ Set up secret management for API credentials
  • ✓ Training logs available in Modal dashboard

Security Audit

Safe
v4 • 1/17/2026

This is a documentation-only skill for Modal, a legitimate serverless cloud computing platform. All 572 static findings are FALSE POSITIVES. The scanner misinterprets Markdown documentation code examples as executable code. Patterns flagged include CLI commands in documentation (modal run, modal deploy), environment variable documentation, and legitimate Modal API patterns. No malicious code, credential exfiltration, or actual security vulnerabilities exist. This skill contains only documentation files teaching users how to properly use the Modal platform.

Files scanned: 14
Lines analyzed: 6,111
Findings: 4
Total audits: 4
Audited by: claude

Quality Score

Architecture: 45
Maintainability: 100
Content: 87
Community: 29
Security: 100
Spec Compliance: 91

What You Can Build

Deploy ML models for inference

Deploy trained models (LLMs, image classifiers) to production with GPU acceleration and auto-scaling for variable traffic.

Run batch processing jobs

Process large datasets in parallel across multiple containers, handling thousands of files or data rows simultaneously.

Execute GPU compute tasks

Run computationally intensive research tasks on H100 or A100 GPUs. Schedule training jobs and long-running computations.

Try These Prompts

Basic GPU Deployment
Create a Modal app that runs a Python function on an L40S GPU. The function should load a HuggingFace model and return predictions. Use an appropriate container image with torch and transformers installed.
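A minimal sketch of what this prompt might produce, using Modal's Python SDK. The app name, pinned versions, and the facebook/bart-large-cnn model are illustrative assumptions, not part of the skill:

```python
import modal

app = modal.App("summarizer-demo")  # app name is illustrative

# Container image with the ML dependencies; pin versions for your model.
image = modal.Image.debian_slim(python_version="3.12").pip_install(
    "torch", "transformers"
)

@app.function(gpu="L40S", image=image)
def summarize(text: str) -> str:
    # Import heavy dependencies inside the function body so they only
    # load in the remote container, not on your local machine.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return summarizer(text, max_length=60, min_length=10)[0]["summary_text"]

@app.local_entrypoint()
def main():
    # .remote() runs the function in Modal's cloud on the L40S GPU.
    print(summarize.remote("...long article text goes here..."))
```

Run it with `modal run app.py`; Modal builds the image, provisions the GPU, and streams logs back to your terminal.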
Batch Processing
Set up a Modal function that processes CSV files in parallel. The function should read files from an S3 bucket, apply transformations, and save results. Use CPU parallelism with multiple cores.
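A hedged sketch of the parallel fan-out pattern this prompt describes. The bucket paths are placeholders, and the transformation body is a stub (real code would pull files via an S3 client baked into the image):

```python
import modal

app = modal.App("csv-batch")  # app name is illustrative

@app.function(cpu=4, memory=8192)
def transform(path: str) -> int:
    # Stub: a real implementation would download the CSV, apply
    # transformations, and upload results back to object storage.
    return len(path)

@app.local_entrypoint()
def main():
    paths = [f"s3://my-bucket/file-{i}.csv" for i in range(1000)]
    # .map() fans the inputs out across many containers in parallel,
    # rather than looping over .remote() calls sequentially.
    results = list(transform.map(paths))
    print(f"processed {len(results)} files")
```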
Scheduled Jobs
Create a Modal scheduled function that runs daily at 2 AM. The function should refresh cached data from an API and update model weights stored in a Modal Volume.
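This kind of schedule can be sketched with modal.Cron and a Modal Volume; the volume name is an assumption and the refresh body is left as a stub:

```python
import modal

app = modal.App("daily-refresh")  # app name is illustrative
volume = modal.Volume.from_name("model-cache", create_if_missing=True)

# Standard cron syntax: "0 2 * * *" fires daily at 02:00 UTC.
@app.function(schedule=modal.Cron("0 2 * * *"), volumes={"/cache": volume})
def refresh():
    # Fetch fresh data from the API and update weights under /cache here.
    ...
    volume.commit()  # persist writes so other containers see them
```

Deploy once with `modal deploy app.py` and Modal runs the function on schedule without any local process staying alive.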
Web API
Build a Modal web endpoint that accepts POST requests with input data. The endpoint should run inference using a deployed model and return predictions. Include proper error handling and authentication.
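A minimal sketch of a Modal web endpoint. The decorator is `fastapi_endpoint` in current SDKs (older releases called it `web_endpoint`); the "model" here is a placeholder, and real authentication and error handling would be added on top:

```python
import modal

app = modal.App("inference-api")  # app name is illustrative

@app.function()
@modal.fastapi_endpoint(method="POST")
def predict(item: dict) -> dict:
    data = item.get("input")
    if data is None:
        # Basic input validation; a real API would return proper HTTP errors.
        return {"error": "missing 'input' field"}
    # Placeholder inference: replace with a real model call.
    return {"prediction": str(data)[::-1]}
```

After `modal deploy`, Modal prints a stable `*.modal.run` URL that accepts JSON POST bodies.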

Best Practices

  • Pin all Python package versions in image definitions to ensure reproducible builds and deployments
  • Use separate Modal Secrets for different environments (dev, staging, production) to prevent credential leakage
  • Configure appropriate min_containers to reduce cold start latency for latency-sensitive endpoints
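The practices above can be sketched in one function definition. The secret name "prod-api-keys" and the env var it injects are assumptions for illustration:

```python
import modal

app = modal.App("prod-service")  # app name is illustrative

# Pinned versions keep image builds reproducible across deploys.
image = modal.Image.debian_slim(python_version="3.12").pip_install(
    "requests==2.32.3"
)

@app.function(
    image=image,
    # One Secret per environment; "prod-api-keys" is a hypothetical name.
    secrets=[modal.Secret.from_name("prod-api-keys")],
    # Keep one container warm to cut cold-start latency.
    min_containers=1,
)
def handler() -> bool:
    import os
    # The Secret is injected as environment variables at runtime.
    return "API_KEY" in os.environ
```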

Avoid

  • Hardcoding API keys or credentials directly in function code instead of using Modal Secrets
  • Importing heavy dependencies at module scope instead of inside function bodies, slowing down container startup
  • Using sequential loops for batch processing instead of .map() for parallel execution across containers
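The last two anti-patterns have the same cure, sketched here: keep module scope light and let `.map()` do the fan-out (the workload itself is a toy placeholder):

```python
import modal

app = modal.App("patterns")  # app name is illustrative

@app.function()
def process(x: int) -> int:
    # Import inside the body: module scope stays light, so container
    # startup doesn't pay for dependencies it may not need.
    import math
    return int(math.sqrt(x))

@app.local_entrypoint()
def main():
    # Parallel fan-out across containers; a `for` loop calling
    # process.remote(i) one at a time would run sequentially instead.
    results = list(process.map(range(100)))
    print(len(results))
```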

Frequently Asked Questions

How much does Modal cost?
Modal offers pay-per-use pricing. You only pay for compute time used. New users receive $30/month in free credits. GPU instances and larger containers cost more.
What GPU types are available?
Modal provides T4, L4, A10, A100, A100-80GB, L40S, H100, H200, and B200 GPUs. Different models offer various price-performance tradeoffs for inference versus training.
How do I authenticate with Modal?
Run 'modal token new' to open a browser login. This stores credentials in ~/.modal.toml. Alternatively, set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET environment variables.
Can I run long-running jobs?
Yes, but the default timeout is 5 minutes. Increase it with the timeout parameter, up to 24 hours. For longer jobs, consider breaking the work into chunks or using scheduled jobs.
How does autoscaling work?
Modal automatically scales containers from zero up to max_containers based on incoming requests. Set min_containers to keep containers warm for low-latency endpoints. Use buffer_containers for burst handling.
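The scaling knobs named above (plus the timeout parameter from the previous answer) are all arguments to @app.function; a sketch with illustrative values:

```python
import modal

app = modal.App("scaling-demo")  # app name is illustrative

@app.function(
    min_containers=1,    # containers kept warm even when idle
    max_containers=20,   # hard ceiling for scale-out
    buffer_containers=2, # extra idle containers held for traffic bursts
    timeout=3600,        # raise the 5-minute default to 1 hour
)
def serve(request: dict) -> dict:
    return {"ok": True}
```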
What Python versions are supported?
Modal supports Python 3.8 through 3.12. Specify python_version in the image definition. Python 3.11 or 3.12 is recommended for best performance with ML workloads.