ena-database

Low Risk · 🌐 Network access · 📁 Filesystem access

Query the European Nucleotide Archive

Also available from: davila7

Researchers need efficient access to genomic data for analysis. This skill provides programmatic access to ENA through REST APIs and FTP for retrieving DNA/RNA sequences, FASTQ files, and genome assemblies by accession number.

Supports: Claude Codex Code(CC)
⚠️ 66 Poor

1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "ena-database". Find assemblies for SARS-CoV-2 with assembly level chromosome

Expected outcome:

Found 5 assemblies matching criteria (first 3 shown):
- ERR1234567: Complete Genome (MN908947.3)
- ERR2345678: Complete Genome (MW123456.1)
- ERR3456789: Assembly Level: chromosome

Access the sequences at: https://www.ebi.ac.uk/ena/browser/api/fasta/[ACCESSION]

Using "ena-database". Get taxonomy for Escherichia coli

Expected outcome:

Taxonomy ID: 562
Scientific Name: Escherichia coli
Lineage: Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacterales; Enterobacteriaceae; Escherichia
Rank: Species

Using "ena-database". Search for RNA-seq experiments in study PRJEB12345

Expected outcome:

Found 12 RNA-seq experiments (first 3 shown):
- ERX123456: Paired-end, ILLUMINA
- ERX123457: Single-end, ILLUMINA
- ERX123458: Paired-end, BGI

Run accessions available for FASTQ download.

Security Audit

Low Risk
v5 • 1/21/2026

This is a legitimate bioinformatics data access skill for querying the European Nucleotide Archive. All static findings are false positives. The 'external_commands' detections are backtick characters in documentation examples, not shell execution. 'Network' findings are HTTP requests to public ENA APIs (www.ebi.ac.uk). Critical/high severity flags (SAM database, C2 keywords, weak crypto) match generic terms in documentation (sample=sam, MD5/SHA1 for checksums). No actual security risks present.

Files scanned: 3
Lines analyzed: 3,646
Findings: 2
Total audits: 5

Risk Factors

🌐 Network access (2)
📁 Filesystem access (1)
Audited by: claude

Quality Score

Architecture: 41
Maintainability: 90
Content: 87
Community: 20
Security: 90
Spec Compliance: 83

What You Can Build

Retrieve sequencing data for analysis

Download raw FASTQ files and genome assemblies by accession number for downstream bioinformatics analysis pipelines.

Search datasets by study or organism

Query the ENA Portal API to find all samples, runs, or assemblies associated with a specific study or taxonomic classification.

Build reproducible research workflows

Integrate ENA data retrieval into automated pipelines that fetch and cite specific accessions for reproducible genomics research.

Try These Prompts

Find samples in a study
Use the ENA Portal API to search for all samples in study PRJNA[STUDY_ID]. Return the accession numbers and sample titles.
Download sequence by accession
Retrieve the nucleotide sequence for accession [ACCESSION] in FASTA format using the ENA Browser API.
Find assemblies for organism
Find all assemblies for [ORGANISM] with contig N50 >= [N50_VALUE] using the ENA Portal API.
Bulk download workflow
Search for all read runs in study [STUDY_ID], extract the FTP URLs, and generate a script to download all files via FTP.
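
The bulk download prompt above can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: it assumes the Portal API report was requested as TSV with a `fastq_ftp` column whose cells hold semicolon-separated paths without a scheme, and the helper names, the sample report, and the `ftp.example.org` paths are all made up for the example.

```python
import csv
import io
import shlex


def extract_ftp_urls(tsv_text, column="fastq_ftp"):
    """Collect every FTP URL from one column of a Portal API TSV report."""
    urls = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Paired-end runs list several files in one cell, separated by ';'.
        for path in (row.get(column) or "").split(";"):
            path = path.strip()
            if not path:
                continue
            # Assumption: paths arrive without a scheme, so prepend one.
            if "://" not in path:
                path = "ftp://" + path
            urls.append(path)
    return urls


def build_download_script(urls, out_dir="fastq"):
    """Return a bash script that fetches each URL with wget."""
    lines = [
        "#!/usr/bin/env bash",
        "set -euo pipefail",
        f"mkdir -p {shlex.quote(out_dir)}",
    ]
    for url in urls:
        # -c resumes partial downloads, -P sets the target directory.
        lines.append(f"wget -c -P {shlex.quote(out_dir)} {shlex.quote(url)}")
    return "\n".join(lines) + "\n"


# Hypothetical report text standing in for a real Portal API response.
SAMPLE_REPORT = (
    "run_accession\tfastq_ftp\n"
    "ERR0000001\tftp.example.org/fastq/ERR0000001_1.fastq.gz;"
    "ftp.example.org/fastq/ERR0000001_2.fastq.gz\n"
    "ERR0000002\tftp.example.org/fastq/ERR0000002.fastq.gz\n"
)

if __name__ == "__main__":
    print(build_download_script(extract_ftp_urls(SAMPLE_REPORT)))
```

Generating a script instead of downloading directly keeps the search step and the transfer step separate, so the file list can be reviewed before any large transfer starts.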

Best Practices

  • Throttle requests to stay within the limit of 50 requests per second, and retry with exponential backoff when the API returns HTTP 429
  • Use FTP or Aspera for downloading files larger than 100MB
  • Cite study and sample accessions when publishing results derived from ENA data
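
One way to combine throttling with exponential backoff is sketched below. The `RateLimited` exception, the `Throttle` class, and the default delays are illustrative assumptions: the caller is expected to raise `RateLimited` when a request comes back with HTTP 429, and the sketch does not depend on any particular HTTP library.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error the caller raises when a request gets HTTP 429."""


class Throttle:
    """Space calls out so that at most max_per_second start each second."""

    def __init__(self, max_per_second=50, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Upper bound in seconds on the wait before retry `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(request, max_retries=5, sleep=time.sleep, rng=random.random):
    """Call request(); on RateLimited, wait with jittered exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise
            # Jitter: wait a random fraction of the exponential bound.
            sleep(backoff_delay(attempt) * rng())
```

In use, `throttle.wait()` would be called before every request, and the request itself wrapped in `call_with_backoff`. The random jitter spreads retries out so that several clients hitting the limit at once do not all retry at the same moment.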

Avoid

  • Making individual API calls for thousands of records instead of batching
  • Downloading large files via HTTP when FTP/Aspera is available
  • Failing to handle XML parsing errors from ENA Browser API responses
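
Two of the pitfalls above, unbatched calls and unhandled XML parse errors, can be guarded against with small helpers like these. This is a sketch using only the Python standard library; the function names are illustrative, and how each batch is actually submitted to ENA is left to the caller.

```python
import xml.etree.ElementTree as ET


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parse_record_xml(text):
    """Return the XML root element, or None if the payload is not well-formed.

    A truncated download or an HTML error page would otherwise raise
    ET.ParseError and abort a long-running job.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None
```

Returning `None` lets the caller log the failing accession and continue, instead of losing a whole batch to one malformed response.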

Frequently Asked Questions

What is the difference between the ENA Portal API and Browser API?
The Portal API is for advanced searching and filtering across all data types. The Browser API is for direct retrieval of specific records by accession, for example as XML or FASTA.
How do I download large FASTQ files?
For files over 100MB, use FTP or Aspera protocols instead of HTTP. The ENA Portal API returns FTP URLs in search results.
What formats can I retrieve sequences in?
Sequences are available in FASTA (assembled sequences), FASTQ (raw reads), XML (native format), and TSV/JSON for metadata.
How do I find all data from a specific study?
Use the Portal API search endpoint with query=study_accession="[STUDY_ID]" and the result parameter set to sample, read_run, or assembly as needed.
What are the rate limits for ENA APIs?
All ENA APIs are limited to 50 requests per second. Requests beyond this limit receive HTTP 429 (Too Many Requests) and should be retried with exponential backoff.
How do I cite ENA data in my publication?
Include the Study/Project accession as the primary citation. Also list specific sample, run, or assembly accessions used in your analysis.
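
To make the Portal API and Browser API distinction from the questions above concrete, the sketch below builds one URL for each without sending any request. The helper names are illustrative, and the field names `run_accession` and `fastq_ftp` are assumed to be valid for the `read_run` result type.

```python
from urllib.parse import quote, urlencode

PORTAL_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"
BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"


def portal_search_url(result, query, fields, fmt="tsv"):
    """Build a Portal API search URL (searching and filtering)."""
    params = {
        "result": result,            # e.g. sample, read_run, assembly
        "query": query,              # e.g. study_accession="PRJEB12345"
        "fields": ",".join(fields),  # columns to return
        "format": fmt,               # tsv or json
    }
    return f"{PORTAL_SEARCH}?{urlencode(params)}"


def browser_record_url(accession, view="xml"):
    """Build a Browser API URL (direct retrieval of one record by accession)."""
    return f"{BROWSER_API}/{view}/{quote(accession, safe='')}"


if __name__ == "__main__":
    print(portal_search_url(
        "read_run",
        'study_accession="PRJEB12345"',
        ["run_accession", "fastq_ftp"],
    ))
    print(browser_record_url("MN908947.3", view="fasta"))
```

Building URLs with `urlencode` avoids hand-escaping the quotes and equals sign inside the query expression.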
