Researchers need access to large-scale single-cell genomics data for disease research and drug discovery. This skill provides programmatic access to 61 million cells from the CELLxGENE Census, enabling population-scale queries without downloading entire datasets.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "cellxgene-census". Find all T cells in lung tissue from COVID-19 patients
Expected outcome:
- Found 45,230 cells matching criteria:
- Cell types: CD4-positive T cell (18,200), CD8-positive T cell (12,450), regulatory T cell (8,230), NK T cell (6,350)
- Datasets: 12 datasets contributed data
- Top tissues: lung (45,230), lymph node (12,100), spleen (8,450)
Using "cellxgene-census". What genes are expressed in neurons?
Expected outcome:
- Query returned 2.1M neuron cells across 245 datasets
- Top expressed genes (mean expression):
  - SNAP25: 8.4
  - SYP: 7.2
  - MAP2: 6.8
  - NEUROD1: 5.9
  - ELAVL3: 5.4
Security Audit
Low Risk
All 228 static findings are false positives. The scanner detected patterns in markdown documentation that are not actual security vulnerabilities. External command detections are backticks in code blocks. C2 keyword detections are the substring 'C2' in 'CELLxGENE'. Cryptographic algorithm detections are documentation patterns. System reconnaissance detections are the word 'reconnaissance' in documentation text. The skill is safe for publication.
Risk Factors
🌐 Network access (1)
What You Can Build
Explore Cell Types in a Tissue
Query the Census to discover all cell types present in a specific tissue, such as brain or lung, along with cell type frequencies.
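A tissue exploration query of this kind might look like the following minimal sketch. It assumes the `cellxgene_census` Python package with its `open_soma` and `get_obs` helpers and the `tissue_general`, `cell_type`, and `is_primary_data` obs columns. The function names `build_obs_filter`, `cell_type_frequencies`, and `fetch_cell_types` are hypothetical and not part of the skill.

```python
from collections import Counter


def build_obs_filter(tissue, primary_only=True):
    """Build a value_filter string selecting cells from one tissue."""
    clauses = [f"tissue_general == '{tissue}'"]
    if primary_only:
        # Skip cells that are duplicated across datasets.
        clauses.append("is_primary_data == True")
    return " and ".join(clauses)


def cell_type_frequencies(cell_types):
    """Map each cell type to (count, fraction of all cells), most common first."""
    counts = Counter(cell_types)
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in counts.most_common()}


def fetch_cell_types(tissue, census_version):
    """Read only the cell_type column for one tissue (needs network access)."""
    # Imported lazily so the helpers above work without the package installed.
    import cellxgene_census

    with cellxgene_census.open_soma(census_version=census_version) as census:
        obs = cellxgene_census.get_obs(
            census,
            "homo_sapiens",
            value_filter=build_obs_filter(tissue),
            column_names=["cell_type"],
        )
    return obs["cell_type"].tolist()
```

Passing the result of `fetch_cell_types("lung", census_version=...)` to `cell_type_frequencies` gives per-type counts and fractions. Reading a single metadata column keeps the transfer small even for large tissues.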
Analyze Gene Expression Markers
Query expression levels of specific genes (CD4, CD8A, FOXP3) across cell types and diseases to identify marker genes.
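As a rough illustration, a marker gene query could be restricted to the genes of interest before any expression data is transferred. The sketch below assumes the `cellxgene_census` package, its `get_anndata` helper, and the `feature_name` var column. The `obs_column_names` keyword is assumed from recent package releases and may differ in older ones. `gene_filter` and `fetch_marker_expression` are hypothetical names.

```python
def gene_filter(genes):
    """Build a var value_filter string that keeps only the named genes."""
    quoted = ", ".join(f"'{g}'" for g in genes)
    return f"feature_name in [{quoted}]"


def fetch_marker_expression(genes, obs_filter, census_version):
    """Return an AnnData with only the requested genes (needs network access)."""
    # Imported lazily so gene_filter works without the package installed.
    import cellxgene_census

    with cellxgene_census.open_soma(census_version=census_version) as census:
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            var_value_filter=gene_filter(genes),
            obs_value_filter=obs_filter,
            # Assumption: keyword name as in recent cellxgene_census releases.
            obs_column_names=["cell_type", "disease"],
        )
```

With the `cell_type` and `disease` columns attached to the result, expression can then be grouped per cell type and disease to compare candidate markers such as CD4, CD8A, and FOXP3.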
Train Cell Type Classifiers
Use Census data with PyTorch to train machine learning models for cell type classification tasks.
Try These Prompts
Find all cells of type [CELL_TYPE] in the [TISSUE] tissue from the CELLxGENE Census. Return the cell count and metadata.
Query gene expression for [GENE1], [GENE2], and [GENE3] across all cell types in the [DISEASE] dataset. Show expression patterns.
Compare [CELL_TYPE] cells across [TISSUE1], [TISSUE2], and [TISSUE3] tissues. What genes are differentially expressed?
Create a training dataset from Census for [CELL_TYPE] classification. Include [COLUMNS] metadata and gene expression data.
Best Practices
- Always filter for is_primary_data == True to avoid duplicate cells in results
- Specify census_version explicitly for reproducible research
- Estimate query size before loading large datasets to prevent memory issues
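One way to apply these practices together is to count matching cells with a metadata-only query, then estimate memory before requesting expression data. The sketch below assumes the `cellxgene_census` package and its `open_soma` and `get_obs` helpers. The helper names, the 4-byte value size, and the 8 GB budget are illustrative assumptions, not values from the skill.

```python
def estimate_dense_bytes(n_cells, n_genes, bytes_per_value=4):
    """Upper bound on memory for a dense cells x genes matrix (float32 assumed)."""
    return n_cells * n_genes * bytes_per_value


def fits_in_memory(n_cells, n_genes, budget_gb=8.0):
    """Check the dense estimate against a memory budget given in GiB."""
    return estimate_dense_bytes(n_cells, n_genes) <= budget_gb * 1024**3


def count_cells(obs_filter, census_version):
    """Count matching primary cells using metadata only (needs network access)."""
    # Imported lazily so the estimators above work without the package installed.
    import cellxgene_census

    # Pin census_version explicitly so the count is reproducible.
    with cellxgene_census.open_soma(census_version=census_version) as census:
        obs = cellxgene_census.get_obs(
            census,
            "homo_sapiens",
            value_filter=f"({obs_filter}) and is_primary_data == True",
            column_names=["soma_joinid"],
        )
    return len(obs)
```

Because expression matrices are typically stored sparsely, the dense figure overstates real usage, which makes it a conservative check. If a query does not fit, narrowing the filters or the gene list is usually the first thing to try.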
Avoid
- Do not query without filters - always specify tissue, cell type, or disease criteria
- Do not load all Census data at once - use filters and column selection to reduce data transfer
- Do not ignore the is_primary_data flag - it prevents counting duplicate cells
Frequently Asked Questions
What is the CELLxGENE Census?
How is this different from scanpy or scvi-tools?
What organisms are available?
How do I filter queries effectively?
What if my query is too large for memory?
How do I ensure reproducible results?
Developer Details
Author
K-Dense-AI
License
Unknown
Repository
https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/cellxgene-census
Ref
main