exploratory-data-analysis

Safe 📁 Filesystem access

Analyze Scientific Data Files Automatically

Also available from: davila7

Scientific data files come in hundreds of formats. This skill automatically detects file type, extracts metadata, assesses data quality, and generates comprehensive markdown reports with format-specific analysis recommendations.
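
As a rough illustration of the detection step, the sketch below maps a file extension to a format label. The `EXTENSION_FORMATS` table and the `detect_format` helper are hypothetical names introduced here; the skill's actual detection logic is not shown on this page and may also inspect file contents.

```python
from pathlib import Path

# Hypothetical extension table covering a few formats named on this page.
# The real skill's mapping (200+ formats) is not shown in the source.
EXTENSION_FORMATS = {
    ".fastq": "FASTQ (sequence data with quality scores)",
    ".fq": "FASTQ (sequence data with quality scores)",
    ".csv": "CSV (tabular data)",
    ".pdb": "PDB (molecular structure)",
    ".h5": "HDF5 (hierarchical data)",
    ".hdf5": "HDF5 (hierarchical data)",
    ".tif": "TIFF (image data)",
    ".tiff": "TIFF (image data)",
}


def detect_format(path):
    """Return a format label based on the file extension, or "Unknown"."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Look past a trailing compression suffix such as ".gz".
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    extension = suffixes[-1] if suffixes else ""
    return EXTENSION_FORMATS.get(extension, "Unknown")
```

Extension lookup is only a first guess; a file with a missing or misleading extension would need a content check as well.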

Supports: Claude, Codex, Code (CC)
🥈 80 Silver

1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "exploratory-data-analysis". Analyze data/sample.fastq

Expected outcome:

  • File: sample.fastq (24.5 MB)
  • Format: FASTQ (sequence data with quality scores)
  • Sampled 10,000 reads: Mean length 150bp, Mean quality: 35.2
  • GC Content: 52.3%
  • Quality Assessment: High-quality data, suitable for downstream analysis
  • Recommendations: Proceed with alignment; no trimming required
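
The figures in this expected outcome (reads sampled, mean length, mean quality, GC content) could be computed roughly as follows. This is a minimal sketch, not the skill's implementation: it assumes four-line FASTQ records, Phred+33 quality encoding, and sampling by taking the first `max_reads` records. The function names are hypothetical.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (sequence, quality) pairs; assumes exactly 4 lines per record."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4))
        if len(chunk) < 4:
            return  # end of input (or a truncated final record)
        _header, seq, _plus, qual = (s.strip() for s in chunk)
        yield seq, qual


def summarize_fastq(lines, max_reads=10_000, phred_offset=33):
    """Summarize the first max_reads records (assumes Phred+33 encoding)."""
    reads = total_bases = total_quality = gc_bases = 0
    for seq, qual in islice(fastq_records(lines), max_reads):
        reads += 1
        total_bases += len(seq)
        total_quality += sum(ord(c) - phred_offset for c in qual)
        gc_bases += sum(1 for base in seq.upper() if base in "GC")
    if total_bases == 0:
        return {"reads": reads, "mean_length": 0.0,
                "mean_quality": 0.0, "gc_percent": 0.0}
    return {
        "reads": reads,
        "mean_length": total_bases / reads,
        "mean_quality": total_quality / total_bases,  # mean per-base Phred score
        "gc_percent": 100 * gc_bases / total_bases,
    }
```

An open text file can be passed directly, for example `with open("data/sample.fastq") as fh: summarize_fastq(fh)`.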

Using "exploratory-data-analysis". Explore experiment_results.csv

Expected outcome:

  • File: experiment_results.csv (1.2 MB)
  • Format: CSV (tabular data)
  • Dimensions: 5,000 rows x 12 columns
  • Missing Values: 2.3% in column 'temperature'
  • Statistics: Mean=45.2, Std=12.8, Range=[-5.2, 98.4]
  • Recommendations: Impute missing values; check for outliers in temperature column
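
A tabular summary like the one above (dimensions, missing-value percentage, mean, standard deviation, range) can be sketched with the standard library alone. The skill's FAQ lists pandas and numpy as its core libraries, so this is a stand-in illustration rather than the skill's code; `summarize_csv` and the `missing_tokens` default are assumptions.

```python
import csv
import statistics


def summarize_csv(lines, missing_tokens=("", "NA", "NaN")):
    """Summarize CSV text lines: dimensions plus per-column statistics."""
    reader = csv.DictReader(lines)
    rows = list(reader)
    columns = reader.fieldnames or []
    report = {"rows": len(rows), "columns": len(columns), "per_column": {}}
    for col in columns:
        # Values that are present, i.e. not empty and not a missing-value token.
        present = [v.strip() for v in (row[col] for row in rows)
                   if v is not None and v.strip() not in missing_tokens]
        numeric = []
        for value in present:
            try:
                numeric.append(float(value))
            except ValueError:
                pass  # non-numeric entry
        missing = len(rows) - len(present)
        stats = {"missing_percent": 100 * missing / len(rows) if rows else 0.0}
        # Report numeric statistics only when every present value is numeric.
        if numeric and len(numeric) == len(present):
            stats["mean"] = statistics.fmean(numeric)
            stats["std"] = statistics.stdev(numeric) if len(numeric) > 1 else 0.0
            stats["min"] = min(numeric)
            stats["max"] = max(numeric)
        report["per_column"][col] = stats
    return report
```

For a file on disk, open it with `newline=""` as the `csv` module documentation recommends and pass the file object in place of `lines`.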

Security Audit

Safe
v4 • 1/17/2026

All 1077 static findings were evaluated and judged to be false positives. The scanner misinterpreted Markdown code formatting (backticks) as shell commands, bioinformatics format names (SAM) as Windows credentials, and documentation references to file format specifications as weak cryptography. The skill is a legitimate scientific data analysis tool that only reads data files and writes Markdown reports. No network access, command execution, or sensitive data handling was found.

  • Files scanned: 10
  • Lines analyzed: 8,669
  • Findings: 1
  • Total audits: 4

Risk Factors

📁 Filesystem access (1)
Audited by: claude

Quality Score

  • Architecture: 82
  • Maintainability: 100
  • Content: 85
  • Community: 21
  • Security: 100
  • Spec Compliance: 91

What You Can Build

Explore genomic sequencing data

Analyze FASTQ, BAM, and VCF files to understand sequence quality, mapping rates, and variant distributions.

Examine molecular structure files

Parse PDB, SDF, and CIF files to assess molecular structures, atomic coordinates, and bond information.

Inspect microscopy image metadata

Extract dimensions, channels, timestamps, and spatial calibration from TIFF, ND2, and CZI imaging files.

Try These Prompts

  • Basic analysis: "Analyze this scientific data file at path: <filepath>"
  • With report: "Generate a comprehensive EDA report for this file and save it to <filepath>"
  • Quality focus: "Perform a data quality assessment on this file and identify any issues or anomalies."
  • Multi-file: "Analyze these multiple related files and create a summary comparison report."

Best Practices

  • Provide the full file path when requesting analysis so the format is detected precisely
  • Specify an output filename to generate a persistent markdown report
  • Check that the Python libraries required for specialized formats are installed before analysis
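
One way to check for the libraries before running an analysis is to look up each import name without importing it. The sketch below uses the libraries named in the FAQ on this page; the `LIBRARIES` table and `missing_libraries` helper are hypothetical names. Note that the import name can differ from the package name (Biopython is imported as `Bio`, Pillow as `PIL`).

```python
from importlib.util import find_spec

# Package name -> top-level import name, for the libraries named in the FAQ.
LIBRARIES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "Biopython": "Bio",
    "h5py": "h5py",
    "Pillow": "PIL",
}


def missing_libraries(libraries=LIBRARIES):
    """Return the package names whose import module cannot be found."""
    return [package for package, module in libraries.items()
            if find_spec(module) is None]
```

Calling `missing_libraries()` returns an empty list when everything is installed; any names it returns can be installed before requesting analysis of the corresponding formats.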

Avoid

  • Do not ask the skill to modify or write back to source data files
  • Do not expect the skill to perform advanced statistical modeling
  • Do not assume the skill can interpret biological meaning from sequences

Frequently Asked Questions

What file formats are supported?
200+ formats including FASTQ, BAM, VCF, PDB, CIF, TIFF, ND2, CSV, HDF5, and many more.
Does this modify my data files?
No, the skill only reads files and generates new markdown reports without altering original data.
What does the generated report include?
File metadata, format details, statistical summaries, quality metrics, and downstream analysis recommendations.
Can it analyze large files?
Yes, but very large files may be sampled for performance. The report notes when sampling is used.
What Python libraries are required?
Core libraries: pandas, numpy. Format-specific: Biopython for sequences, h5py for HDF5, Pillow for images.
Can it analyze multiple files together?
Each file is analyzed separately. You can request comparisons across related files in the same analysis request.
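
The page says only that very large files may be sampled and that the report notes when this happens; it does not say how. One common approach, shown here purely as an assumption, is reservoir sampling, which draws a fixed-size uniform sample in a single pass without loading the whole file. The `sample_records` helper is a hypothetical name.

```python
import random


def sample_records(records, k, seed=0):
    """Reservoir-sample up to k records in one pass.

    Returns (sample, was_sampled); was_sampled is True when the input
    held more than k records, so a report can note that sampling occurred.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    reservoir = []
    seen = 0
    for record in records:
        seen += 1
        if len(reservoir) < k:
            reservoir.append(record)
        else:
            # Keep the new record with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = record
    return reservoir, seen > k
```

The returned flag gives a report generator what it needs to add a line such as "statistics computed on a sample of k records".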