lamindb

Safe · ⚙️ External commands · 📁 Filesystem access · 🌐 Network access · 🔑 Env variables

Manage biological data with LaminDB

Also available from: davila7

Biological research generates complex datasets that are difficult to track, query, and reproduce. LaminDB provides a unified framework for managing biological data with automatic lineage tracking, ontology-based annotations, and seamless integration with workflow managers.

Supports: Claude, Codex, Code (CC)
📊 71 Adequate

  1. Download the skill ZIP
  2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
  3. Toggle on and start using

Test it

Using "lamindb". How do I track my notebook analysis with LaminDB?

Expected outcome:

  • Use ln.track() at the start of your notebook to begin lineage capture
  • Import your data and perform analysis as normal
  • Call ln.finish() to complete tracking when done
  • View lineage with artifact.view_lineage() to see data provenance
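
The four steps above can be sketched as a single function. This is a minimal sketch, assuming `lamindb` is installed and an instance is already connected; the function name `run_tracked_analysis`, the placeholder analysis, and the arguments are hypothetical. The import sits inside the function so the definition itself loads without LaminDB.

```python
from pathlib import Path


def run_tracked_analysis(output_path: str, key: str):
    """Run one tracked analysis step and register its output file.

    Assumes lamindb is installed and an instance is connected.
    """
    import lamindb as ln  # imported lazily so this module loads without LaminDB

    ln.track()  # begin lineage capture for this script or notebook

    # Import data and perform analysis as normal; here we only write a
    # placeholder result file (hypothetical content).
    Path(output_path).write_text("gene,score\nCD3E,0.9\n")

    # Register the output so it is linked to the current run
    artifact = ln.Artifact(output_path, key=key).save()

    ln.finish()  # mark the run as complete
    return artifact
```

Calling `view_lineage()` on the returned artifact should then show which run produced the file.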

Using "lamindb". Can you help me validate my experimental metadata?

Expected outcome:

  • Define a schema with required columns and data types
  • Create a DataFrameCurator or AnnDataCurator with your schema
  • Use curator.validate() to check data integrity
  • Use .cat.standardize() to fix typos and map synonyms
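
To make the two ideas in this answer concrete (validate against a schema, then standardize labels), here is a plain-Python illustration. It deliberately does not use the LaminDB curator API; `SCHEMA`, `SYNONYMS`, `validate`, and `standardize` are hypothetical names chosen for this sketch, and the rows are made-up sample data.

```python
# Hypothetical schema: required column name -> expected Python type
SCHEMA = {"sample_id": str, "cell_type": str, "n_cells": int}

# Hypothetical synonym table: lowercase variant -> canonical label
SYNONYMS = {
    "t-cell": "T cell",
    "t cell": "T cell",
    "b-cell": "B cell",
    "b cell": "B cell",
}


def validate(rows, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for i, row in enumerate(rows):
        for column, expected in schema.items():
            if column not in row:
                problems.append(f"row {i}: missing column '{column}'")
            elif not isinstance(row[column], expected):
                problems.append(f"row {i}: '{column}' should be {expected.__name__}")
    return problems


def standardize(rows, column, synonyms):
    """Return copies of rows with known variants in `column` mapped to canonical labels."""
    fixed = []
    for row in rows:
        value = row[column]
        canonical = synonyms.get(value.strip().lower(), value)
        fixed.append({**row, column: canonical})
    return fixed


rows = [
    {"sample_id": "S1", "cell_type": "T-cell", "n_cells": 1200},
    {"sample_id": "S2", "cell_type": "b cell", "n_cells": 800},
]
problems = validate(rows, SCHEMA)
clean = standardize(rows, "cell_type", SYNONYMS)
```

In LaminDB itself the synonym table would come from an ontology such as Cell Ontology instead of a hand-written dictionary.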

Using "lamindb". How do I connect LaminDB to my cloud storage?

Expected outcome:

  • Install extras: pip install 'lamindb[aws]' or 'lamindb[gcp]'
  • Configure storage: lamin init --storage s3://your-bucket
  • Set credentials via environment variables or config files
  • LaminDB handles caching and sync automatically

Security Audit

Safe
v4 • 1/17/2026

This is a pure documentation skill containing only markdown files with code examples for LaminDB biological data management. All 607 static findings are false positives. The analyzer incorrectly flagged markdown code formatting (backticks, code blocks), documentation about cloud storage configuration (AWS, GCP credentials), and library usage patterns (ln.Artifact) as security issues. No executable code, scripts, credential harvesting, or malicious patterns exist.

  • Files scanned: 9
  • Lines analyzed: 6,559
  • Findings: 4
  • Total audits: 4

Audited by: claude

Quality Score

  • Architecture: 45
  • Maintainability: 100
  • Content: 87
  • Community: 21
  • Security: 100
  • Spec Compliance: 91

What You Can Build

Annotate scRNA-seq data

Validate and standardize cell type annotations using controlled vocabularies from Cell Ontology

Build data lakehouses

Create unified query interfaces across multiple biological datasets with automatic versioning

Track model lineage

Link training data artifacts to MLflow or W&B experiments for full reproducibility

Try These Prompts

Get started
Help me set up LaminDB locally. I want to install it, authenticate, and initialize a local instance for managing my single-cell datasets.
Annotate data
I have scRNA-seq data with cell type labels. Show me how to validate and standardize these labels using the Cell Ontology via Bionty.
Track lineage
I run Nextflow pipelines for bulk RNA-seq analysis. Show me how to integrate LaminDB to track which code produced which output files.
Query data
I have hundreds of Parquet files organized by experiment and batch. Show me how to query all artifacts from project X with tissue=PBMC and condition=treated without loading all files.

Best Practices

  • Start every analysis notebook with ln.track() and end with ln.finish() for automatic lineage capture
  • Define schemas and validate data early to catch issues before extensive analysis
  • Use hierarchical artifact keys like 'project/experiment/batch/file.h5ad' for organization

Avoid

  • Creating new artifact keys for modified versions instead of using built-in versioning
  • Loading large datasets without filtering first; query metadata first to reduce I/O
  • Skipping ontology standardization, which leads to inconsistent queries across similar terms
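
The advice on hierarchical keys and metadata-first filtering can be illustrated with a small in-memory registry. This is a plain-Python sketch of the idea, not the LaminDB query API; `ArtifactRecord`, `make_key`, `query`, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactRecord:
    """Hypothetical stand-in for a registry row: a storage key plus metadata."""
    key: str
    metadata: dict = field(default_factory=dict)


def make_key(project, experiment, batch, filename):
    """Build a hierarchical key such as 'project/experiment/batch/file.h5ad'."""
    return "/".join([project, experiment, batch, filename])


def query(records, prefix="", **criteria):
    """Select records by key prefix and exact metadata matches, without reading any file."""
    return [
        r for r in records
        if r.key.startswith(prefix)
        and all(r.metadata.get(k) == v for k, v in criteria.items())
    ]


records = [
    ArtifactRecord(make_key("projectX", "exp1", "batch1", "a.parquet"),
                   {"tissue": "PBMC", "condition": "treated"}),
    ArtifactRecord(make_key("projectX", "exp1", "batch2", "b.parquet"),
                   {"tissue": "PBMC", "condition": "control"}),
    ArtifactRecord(make_key("projectY", "exp1", "batch1", "c.parquet"),
                   {"tissue": "PBMC", "condition": "treated"}),
]

# Only the keys that match would be opened afterwards
to_load = [r.key for r in query(records, prefix="projectX/", tissue="PBMC", condition="treated")]
```

Because the filter runs on keys and metadata only, the cost of the query does not depend on how large the underlying files are.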

Frequently Asked Questions

What data formats does LaminDB support?
LaminDB supports DataFrames (Parquet, CSV), AnnData (single-cell), MuData (multi-modal), SpatialData, and TileDB-SOMA arrays.
Do I need a server to use LaminDB?
No. LaminDB works locally with SQLite for development. Scale to cloud storage with PostgreSQL for production teams.
How does LaminDB integrate with Nextflow?
Use ln.track() in process scripts to record inputs and outputs. LaminDB captures provenance automatically for each step.
What biological ontologies are available?
Genes (Ensembl), Proteins (UniProt), Cell types (CL), Tissues (Uberon), Diseases (Mondo), Phenotypes (HPO), and Pathways (GO).
Can I use LaminDB without internet?
Yes, for local operations. Initial ontology downloads and cloud storage access require internet, so cache ontologies locally for offline use.
How is LaminDB different from a database?
LaminDB combines database features (querying, filtering) with versioned file storage and lineage tracking specialized for scientific data workflows.