scvi-tools
Analyze single-cell genomics data
Single-cell genomics data carry technical noise and batch effects that are well suited to probabilistic modeling. scvi-tools provides deep generative models, trained with variational inference, for dimensionality reduction, batch correction, differential expression, and multi-modal integration of single-cell data.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "scvi-tools". Help me set up scVI for batch correction on my single-cell dataset with two batches.
Expected outcome:
```python
import scvi
import scanpy as sc
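# Load and preprocess data
adata = sc.read_h5ad("my_single_cell_data.h5ad")
sc.pp.filter_genes(adata, min_counts=3)
# Keep raw counts in a layer (assumes adata.X still holds raw, unnormalized counts)
adata.layers["counts"] = adata.X.copy()
# Register data with batch information
scvi.model.SCVI.setup_anndata(
    adata,
    layer="counts",
    batch_key="batch",
)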
# Train model
model = scvi.model.SCVI(adata)
model.train()
# Get batch-corrected latent representation
latent = model.get_latent_representation()
adata.obsm["X_scVI"] = latent
# Downstream analysis
sc.pp.neighbors(adata, use_rep="X_scVI")
sc.tl.umap(adata)
sc.tl.leiden(adata)
```
Using "scvi-tools". How do I identify marker genes between two cell types in my scVI model?
Expected outcome:
```python
# Differential expression between two groups
de_results = model.differential_expression(
    groupby="leiden",
    group1="0",  # Cluster 0
    group2="1",  # Cluster 1
    mode="change",
    delta=0.25,  # Minimum effect size
)
# View top differentially expressed genes
print(de_results.head(20))
# Filter for significant genes
significant_genes = de_results[
    (de_results["is_de_fdr_0.05"])
    & (de_results["bayes_factor"] > 1)
]
print(f"Found {len(significant_genes)} differentially expressed genes")
```
Security Audit
Low Risk
This is a documentation-only skill containing markdown reference files for scvi-tools, a legitimate Python library for single-cell genomics analysis. All 399 static findings are false positives caused by incorrect pattern matching: Python code examples in documentation were flagged as shell commands, bioinformatics statistical terms were misidentified as cryptographic algorithms, and documentation URLs were flagged as hardcoded URLs. No executable code or malicious patterns exist. Safe for publication.
Risk Factors
🌐 Network access (1)
📁 Filesystem access
What You Can Build
Batch correction for integrated single-cell analysis
Remove technical batch effects from single-cell RNA-seq datasets across multiple donors, protocols, or sequencing runs using scVI to create unified, integrated cell atlases.
Differential expression with uncertainty
Identify differentially expressed genes between cell types or conditions with probabilistic uncertainty estimates, providing more reliable statistical conclusions for downstream validation.
Multi-modal data integration
Jointly analyze paired RNA and protein measurements (CITE-seq) or chromatin accessibility data to discover cell populations with enhanced biological resolution.
Try These Prompts
Help me set up scvi-tools to analyze my single-cell RNA-seq data. I have an AnnData object with raw count data and want to perform batch correction. Show me how to register the data, train the model, and extract latent representations.
I trained an scVI model on my single-cell dataset with cell type annotations. Help me identify differentially expressed genes between two cell types (e.g., cluster A vs cluster B) using the differential_expression method. Include how to interpret the results and set effect size thresholds.
I have paired CITE-seq data with RNA counts and protein antibody-derived counts. Help me set up totalVI to jointly model both modalities, train the model, and extract joint latent representations that capture both RNA and protein variation.
I have a single-cell reference dataset with cell type annotations and a spatial transcriptomics dataset with spot-level counts. Help me use DestVI or Stereoscope to deconvolve cell types in the spatial data and create cell type proportion maps.
Best Practices
- Always provide raw, unnormalized count data to scvi-tools models for accurate probabilistic modeling
- Register all known technical covariates (batch, donor, protocol) during setup to improve batch correction
- Save trained models regularly using model.save() to avoid retraining on large datasets
- Use GPU acceleration (accelerator="gpu") when training on datasets with more than 50,000 cells
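The last two practices can be combined in a small helper. This is a minimal sketch, not part of scvi-tools: `pick_accelerator`, `train_and_save`, and `GPU_CELL_THRESHOLD` are hypothetical names, and the fallback to `"cpu"` for smaller datasets is an assumption made for illustration. It relies only on the model exposing `train(accelerator=...)` and `save(path, overwrite=...)`.

```python
from pathlib import Path

# Rule-of-thumb threshold taken from the practice listed above.
GPU_CELL_THRESHOLD = 50_000


def pick_accelerator(n_cells: int, gpu_available: bool) -> str:
    """Return "gpu" for large datasets when a GPU is present, else "cpu"."""
    if gpu_available and n_cells > GPU_CELL_THRESHOLD:
        return "gpu"
    return "cpu"


def train_and_save(model, n_cells: int, save_dir: str,
                   gpu_available: bool = False) -> Path:
    """Train the model, then save it so it does not need retraining."""
    model.train(accelerator=pick_accelerator(n_cells, gpu_available))
    out = Path(save_dir)
    # overwrite=True replaces an earlier save in the same directory.
    model.save(str(out), overwrite=True)
    return out
```

With a trained-model workflow like the one shown earlier, a call might look like `train_and_save(model, adata.n_obs, "scvi_model", gpu_available=True)`; whether a GPU is actually available has to be determined separately.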
Avoid
- Do not use log-normalized data as input - scvi-tools models expect raw count data
- Do not skip data filtering (low-count genes/cells) before training as it affects model quality
- Do not interpret latent representations without validation against known biological markers
- Do not use scvi-tools for bulk RNA-seq analysis - it is designed specifically for single-cell data
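A quick sanity check can catch log-normalized input before training. The helper below is a hypothetical sketch (`looks_like_raw_counts` is not an scvi-tools function); it only tests that every value is a non-negative whole number, which is a necessary sign of raw counts but not proof.

```python
from typing import Iterable


def looks_like_raw_counts(values: Iterable[float]) -> bool:
    """True if every value is a non-negative whole number."""
    return all(v >= 0 and float(v).is_integer() for v in values)


# Raw counts are whole numbers; log-normalized values usually are not.
print(looks_like_raw_counts([0, 3, 12.0]))
print(looks_like_raw_counts([0.0, 1.386, 2.303]))
```

For an AnnData object, one option is to pass the stored values of a sparse matrix (for example `adata.X.data`) or a flattened dense array; on large matrices, checking a random subset of cells keeps this fast.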
Frequently Asked Questions
What is the difference between scVI and scanpy for single-cell analysis?
Do I need a GPU to use scvi-tools?
What data formats does scvi-tools support?
How do I choose between scVI, scANVI, and totalVI?
What does the differential_expression method return?
How do I save and load trained scvi-tools models?
Developer Details
Author
K-Dense-AI
License
BSD-3-Clause license
Repository
https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/scvi-tools
Ref
main