scikit-bio

Safe · ⚙️ External commands · 🌐 Network access

Analyze biological data with scikit-bio

Also available from: davila7

Process biological sequences, calculate diversity metrics, and perform statistical tests on microbiome and ecological data. This skill provides comprehensive guidance for bioinformatics workflows including sequence alignment, phylogenetic analysis, and ordination.

Supports: Claude, Codex, Code (CC)
📊 69 Adequate
  1. Download the skill ZIP
  2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
  3. Toggle on and start using

Test it

Using "scikit-bio". Calculate diversity metrics from my OTU table

Expected outcome:

  • Read your BIOM table: Table.read('table.biom')
  • Calculate alpha diversity: alpha_diversity('shannon', counts, ids=sample_ids)
  • Calculate beta diversity: beta_diversity('braycurtis', counts, ids=sample_ids)
  • Run PERMANOVA: permanova(distance_matrix, grouping, permutations=999)

Using "scikit-bio". Build a phylogenetic tree from my sequences

Expected outcome:

  • Read sequences from FASTA: skbio.io.read('sequences.fasta', format='fasta', constructor=skbio.DNA) (skbio.DNA.read returns only one record)
  • Calculate pairwise distances: seq1.distance(seq2) or kmer_distance, collected into a DistanceMatrix
  • Build tree with NJ: nj(distance_matrix)
  • Calculate Robinson-Foulds distance: tree.compare_rfd(other_tree)

Security Audit

Safe
v4 • 1/17/2026

Documentation-only skill with no executable code. All 133 static findings are false positives: detected backticks are markdown code delimiters, C2 keywords are scientific abbreviations (PC1, CCA, RDA for ordination methods), weak crypto flags are biological substitution matrices (BLOSUM62 for protein alignments), and URLs are official documentation links. No command injection, network exfiltration, or malicious patterns exist.

Files scanned: 3
Lines analyzed: 1,393
Findings: 2
Total audits: 4
Audited by: claude

Quality Score

Architecture: 41
Maintainability: 100
Content: 87
Community: 20
Security: 100
Spec Compliance: 83

What You Can Build

Analyze microbiome diversity

Calculate alpha and beta diversity from OTU tables and perform PERMANOVA testing on sample groupings.

Build phylogenetic trees

Construct trees from sequence alignments and calculate Robinson-Foulds distances for tree comparison.

Process sequence data

Read, filter, and transform biological sequences across 19+ file formats with validation.

Try These Prompts

Basic Sequence Operations
Show me how to read a FASTA file with skbio, calculate reverse complement, and find motifs using regex patterns.
Diversity Analysis
Guide me through calculating Shannon alpha diversity from a counts matrix and computing Bray-Curtis beta diversity between samples.
Phylogenetic Analysis
Help me build a phylogenetic tree using Neighbor Joining from a distance matrix and calculate patristic distances between taxa.
Statistical Testing
Show me how to run a PERMANOVA test on a distance matrix to determine if sample groups differ significantly, with 999 permutations.

Best Practices

  • Use generators (skbio.io.read) for large sequence files to avoid memory issues
  • Integrate with pandas and numpy for downstream analysis and visualization
  • Validate sequence IDs match across files before diversity calculations

Avoid

  • Do not pass relative frequencies to metrics that expect counts - supply integer counts instead
  • Do not mix rooted and unrooted trees when calculating Robinson-Foulds distances
  • Do not skip PERMDISP when running PERMANOVA - check dispersion assumptions

Frequently Asked Questions

What file formats does scikit-bio support?
FASTA, FASTQ, GenBank, EMBL, Clustal, PHYLIP, Stockholm, Newick, BIOM (HDF5/JSON), and delimited matrices.
How do I calculate phylogenetic diversity?
Use alpha_diversity('faith_pd', counts, tree=tree, otu_ids=feature_ids) with a rooted phylogenetic tree.
What is the difference between alpha and beta diversity?
Alpha measures within-sample diversity (e.g., Shannon, Simpson), beta measures between-sample dissimilarity (e.g., Bray-Curtis, UniFrac).
Can I use scikit-bio with QIIME 2?
Yes, scikit-bio reads and writes QIIME 2 compatible formats including BIOM tables, trees, and distance matrices.
How do I handle large sequence files efficiently?
Use generator-based reading: for seq in skbio.io.read('large.fasta', format='fasta', constructor=skbio.DNA)
What statistical tests are available for ecological data?
PERMANOVA, ANOSIM, PERMDISP, Mantel test, and Bioenv for environmental variable selection.
