scikit-bio

Name: scikit-bio
Author: K-Dense-AI

Safe · ⚙️ External commands · 🌐 Network access

Analyze biological data with scikit-bio

Also available from: davila7

Process biological sequences, calculate diversity metrics, and perform statistical tests on microbiome and ecological data. This skill provides comprehensive guidance for bioinformatics workflows including sequence alignment, phylogenetic analysis, and ordination.

Supports: Claude, Codex, Code (CC)

📊 69 Adequate

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "scikit-bio". Calculate diversity metrics from my OTU table

Expected outcome:

Read your BIOM table: Table.read('table.biom')
Calculate alpha diversity: alpha_diversity('shannon', counts, ids=sample_ids)
Calculate beta diversity: beta_diversity('braycurtis', counts, ids=sample_ids)
Run PERMANOVA: permanova(distance_matrix, grouping, permutations=999)

Using "scikit-bio". Build a phylogenetic tree from my sequences

Expected outcome:

Read sequences from FASTA: skbio.io.read('sequences.fasta', format='fasta', constructor=skbio.DNA) (skbio.DNA.read returns a single record)
Calculate distance matrix: seq1.distance(seq2) or use kmer_distance
Build tree with NJ: nj(distance_matrix)
Calculate Robinson-Foulds distance: tree.compare_rfd(other_tree)

Security Audit

Safe

v4 • 1/17/2026

Documentation-only skill with no executable code. All 133 static findings are false positives: detected backticks are markdown code delimiters, C2 keywords are scientific abbreviations (PC1, CCA, RDA for ordination methods), weak crypto flags are biological substitution matrices (BLOSUM62 for protein alignments), and URLs are official documentation links. No command injection, network exfiltration, or malicious patterns exist.

Files scanned

1,393

Lines analyzed

findings

Total audits

Risk Factors

⚙️ External commands (5)

SKILL.md:44-61 SKILL.md:64-65 SKILL.md:78-81 references/api_reference.md:20-35 references/api_reference.md:39-52

🌐 Network access (1)

SKILL.md:432-434

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

100

Security

Spec Compliance

What You Can Build

Analyze microbiome diversity

Calculate alpha and beta diversity from OTU tables and perform PERMANOVA testing on sample groupings.

Build phylogenetic trees

Construct trees from sequence alignments and calculate Robinson-Foulds distances for tree comparison.

Process sequence data

Read, filter, and transform biological sequences across 19+ file formats with validation.

Try These Prompts

Basic Sequence Operations

Show me how to read a FASTA file with skbio, calculate reverse complement, and find motifs using regex patterns.

Diversity Analysis

Guide me through calculating Shannon alpha diversity from a counts matrix and computing Bray-Curtis beta diversity between samples.

Phylogenetic Analysis

Help me build a phylogenetic tree using Neighbor Joining from a distance matrix and calculate patristic distances between taxa.

Statistical Testing

Show me how to run a PERMANOVA test on a distance matrix to determine if sample groups differ significantly, with 999 permutations.

Best Practices

Use generators (skbio.io.read) for large sequence files to avoid memory issues
Integrate with pandas and numpy for downstream analysis and visualization
Validate sequence IDs match across files before diversity calculations
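The ID check in the last point needs nothing beyond the standard library. The helper below is a hypothetical illustration (check_ids_match is not a scikit-bio function) that reports IDs present in one source but missing from the other.

```python
def check_ids_match(table_ids, metadata_ids):
    """Return (ids missing from metadata, ids missing from table), each sorted."""
    table_set, meta_set = set(table_ids), set(metadata_ids)
    missing_from_metadata = sorted(table_set - meta_set)
    missing_from_table = sorted(meta_set - table_set)
    return missing_from_metadata, missing_from_table


# Hypothetical IDs from a feature table and a sample metadata file.
table_ids = ["S1", "S2", "S3"]
metadata_ids = ["S2", "S3", "S4"]

not_in_metadata, not_in_table = check_ids_match(table_ids, metadata_ids)
if not_in_metadata or not_in_table:
    print("Missing from metadata:", not_in_metadata)
    print("Missing from table:", not_in_table)
```

Running a check like this before computing diversity makes mismatches explicit instead of letting them surface later as errors or silently dropped samples.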

Avoid

Do not pass relative frequencies to functions that expect counts; supply integer counts instead
Do not mix rooted and unrooted trees when calculating Robinson-Foulds distances
Do not skip PERMDISP when running PERMANOVA; a significant PERMANOVA result can reflect unequal group dispersion rather than a shift in group location

Frequently Asked Questions

What file formats does scikit-bio support?

FASTA, FASTQ, GenBank, EMBL, Clustal, PHYLIP, Stockholm, Newick, BIOM (HDF5/JSON), and delimited matrices.

How do I calculate phylogenetic diversity?

Use alpha_diversity('faith_pd', counts, tree=tree, otu_ids=feature_ids) with a rooted phylogenetic tree.

What is the difference between alpha and beta diversity?

Alpha measures within-sample diversity (e.g., Shannon, Simpson), beta measures between-sample dissimilarity (e.g., Bray-Curtis, UniFrac).
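The distinction can be made concrete with hand-rolled versions of one metric of each kind. These functions are a plain-Python sketch for intuition, not scikit-bio's implementation; the logarithm base is passed explicitly because library defaults may differ.

```python
import math


def shannon(counts, base=2):
    """Shannon entropy of one sample's counts (alpha diversity)."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples (beta diversity)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Alpha: one number per sample.
print(shannon([10, 10]))  # two equally abundant taxa
print(shannon([20, 0]))   # a single taxon

# Beta: one number per pair of samples.
print(bray_curtis([10, 0], [0, 10]))  # no shared taxa
print(bray_curtis([10, 5], [10, 5]))  # identical samples
```

Alpha diversity takes one sample and returns one value; beta diversity takes two samples and returns their dissimilarity, which is why its natural output is a distance matrix.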

Can I use scikit-bio with QIIME 2?

Yes, scikit-bio reads and writes QIIME 2 compatible formats including BIOM tables, trees, and distance matrices.

How do I handle large sequence files efficiently?

Use generator-based reading: for seq in skbio.io.read('large.fasta', format='fasta', constructor=skbio.DNA)

What statistical tests are available for ecological data?

PERMANOVA, ANOSIM, PERMDISP, Mantel test, and Bioenv for environmental variable selection.

Developer Details

Author

K-Dense-AI

License

BSD-3-Clause license

Repository

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/scikit-bio

Ref

main

File structure

📁 references/

📄 api_reference.md

📄 SKILL.md