anndata 🔬

Safe · ⚙️ External commands · 🌐 Network access

Work with AnnData matrices

Also available from: davila7

AnnData provides a standardized data structure for annotated matrices used in single-cell genomics. This skill enables creating, reading, writing, and manipulating .h5ad files with full support for metadata, embeddings, and the scverse ecosystem.

Supports: Claude Code (CC), Codex
🥉 73 Bronze
1. Download the skill ZIP
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle the skill on and start using it

Test it

Using "anndata". How do I read a 10X Genomics H5 file and convert it to AnnData?

Expected outcome:

  • Use scanpy's sc.read_10x_h5() to read the H5 format directly (anndata itself does not ship a 10X reader)
  • The function handles gene and barcode extraction automatically
  • Optional genome parameter for selecting specific reference when multiple are present

Using "anndata". What is backed mode and when should I use it?

Expected outcome:

  • Backed mode keeps data on disk and loads only accessed portions
  • Use it for datasets larger than available RAM to avoid out-of-memory errors
  • Access metadata and create subsets without loading entire file into memory

Security Audit

Safe
v4 • 1/17/2026

All 397 static findings are FALSE POSITIVES. This skill contains only markdown documentation with Python code examples. The static scanner incorrectly flags backticks in fenced code blocks, URLs in documentation links, and generic programming terms. No executable code, network operations, or credential handling exists. This is a legitimate scientific computing documentation skill for the AnnData Python library.

7 files scanned · 4,567 lines analyzed · 2 findings · 4 total audits

Audited by: claude

Quality Score

Architecture: 45
Maintainability: 100
Content: 87
Community: 30
Security: 100
Spec Compliance: 91

What You Can Build

Single-cell RNA-seq analysis

Load and process 10X Genomics data for single-cell transcriptomics research with proper metadata tracking.

Multi-batch data integration

Combine multiple experimental batches with automatic batch label tracking and conflict resolution.

Deep learning integration

Export data to PyTorch DataLoaders for training neural networks on single-cell expression data.

Try These Prompts

Create AnnData object
Create an AnnData object from a numpy array with observation metadata for cell types and sample IDs.
Read H5AD file
Read an H5AD file in backed mode and filter for high-quality cells based on a quality_score column.
Concatenate batches
Concatenate three AnnData objects along the observation axis with batch labels and inner join.
Optimize memory
Show how to convert string columns to categorical and use sparse matrices for memory efficiency.

Best Practices

  • Use backed mode (backed='r') for datasets larger than available RAM to avoid out-of-memory errors.
  • Convert string columns to categorical with strings_to_categoricals() for 10-50x memory reduction.
  • Store raw data with adata.raw = adata.copy() before filtering to preserve access to unfiltered genes.

Avoid

  • Avoid modifying views directly without copying first, as changes may affect the original object.
  • Do not load entire large datasets into memory when backed mode can provide lazy access.
  • Avoid index misalignment when adding external metadata: align on the cell index with set_index() and an index-based join() rather than relying on row order.

Frequently Asked Questions

What is the difference between backed mode and in-memory mode?
Backed mode keeps data on disk and loads only accessed portions, enabling work with datasets larger than RAM.
How do I combine multiple AnnData objects for different modalities like RNA and protein?
Use Muon (MuData) to combine multiple AnnData objects for different modalities like RNA and protein.
When should I use sparse matrices?
Use sparse matrices when data has more than 50% zeros, common in single-cell count data.
How do I track which batch each cell came from?
Use the label and keys parameters in ad.concat() to add a batch column automatically.
What is the raw attribute for?
raw stores a snapshot of data before filtering, allowing access to original unfiltered genes later.
How do I handle out of memory errors?
Use backed mode, convert to sparse matrices, convert strings to categoricals, or process in chunks.

Developer Details

File structure

πŸ“ references/

πŸ“„ best_practices.md

πŸ“„ concatenation.md

πŸ“„ data_structure.md

πŸ“„ io_operations.md

πŸ“„ manipulation.md

πŸ“„ SKILL.md