scikit-learn
Safe 76 | Apply scikit-learn for ML models
by davila7
Build machine learning models quickly with scikit-learn guidance. Covers classification, regression, clustering, preprocessing, pipelines, and model evaluation with ready-to-use examples.
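As a rough illustration of the pipeline and model-evaluation workflow this entry mentions, the following is a minimal sketch and not taken from the skill itself. The dataset (iris), the choice of estimator, and the split parameters are assumptions made only for this example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, chosen only for illustration.
X, y = load_iris(return_X_y=True)

# Hold out a stratified test split so evaluation uses unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain preprocessing and the classifier so scaling is fit on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Mean accuracy on the held-out split.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the scaler and classifier in one pipeline keeps test data out of the preprocessing fit, which is the main reason to prefer it over scaling by hand.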
scanpy
Safe 80 | Analyze single-cell RNA-seq data with scanpy
by davila7
Single-cell RNA-seq analysis requires complex workflows for quality control, clustering, and visualization. This skill provides complete scanpy workflows including UMAP generation, Leiden clustering, marker gene identification, and cell type annotation.
pymc-bayesian-modeling
Safe 79 | Build Bayesian models with PyMC
by davila7
This skill provides tools for Bayesian statistical modeling using PyMC. It enables building hierarchical models, running MCMC sampling with NUTS, performing variational inference, and comparing models with LOO/WAIC metrics for principled uncertainty quantification.
polars
Safe 70 | Master Polars for High-Performance Data Analysis
by davila7
Pandas workflows can be slow and memory-intensive on large datasets. This skill provides guidance on Polars, a DataFrame library built on Apache Arrow that offers lazy evaluation, parallel processing, and an expression API, with claimed speedups of 10-100x depending on the workload.
plotly
Safe 70 | Create interactive data visualizations with Plotly
by davila7
Creating charts and visualizations is time-consuming. Plotly provides a Python library with 40+ chart types including scatter plots, heatmaps, 3D plots, and geographic maps. Generate publication-quality interactive visualizations and export to HTML or static images.
pdf-processing-pro
Low Risk 73 | Extract and process PDF documents
by davila7
Processing PDF documents manually takes too much time. This toolkit provides production-ready scripts for extracting text, handling forms, extracting tables, and performing OCR on scanned documents with batch processing support.
pdf-processing
Safe 69 | Extract and process PDF documents
by davila7
PDF documents contain valuable data but are difficult to process programmatically. This skill provides code patterns to extract text, tables, and form data from PDFs using Python libraries like pdfplumber and pypdf.
matplotlib
Low Risk 74 | Create scientific plots and charts
by davila7
Creating publication-quality visualizations in Python requires understanding the matplotlib API, styling options, and best practices. This skill provides templates, code examples, and troubleshooting guidance for generating professional plots, charts, and 3D visualizations for research and data analysis.
matchms
Safe 70 | Analyze mass spectrometry data
by davila7
Mass spectrometry generates complex spectral data that requires specialized processing. Matchms provides a complete Python toolkit for loading, filtering, comparing, and identifying compounds from spectral data with established similarity metrics.
get-available-resources
Safe 71 | Detect system resources for scientific computing
by davila7
Scientific computing tasks require appropriate hardware resources to run efficiently. This skill automatically detects CPU cores, GPU availability, memory, and disk space to recommend optimal computational strategies and library choices.
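The kind of detection described here can be sketched with the Python standard library alone. This is an illustrative sketch, not the skill's own implementation: `detect_resources` and `suggest_workers` are hypothetical helper names, and memory and GPU checks are left out because they need third-party packages or platform-specific calls.

```python
import os
import shutil


def detect_resources(path="."):
    """Report CPU core count and disk space for the volume holding `path`."""
    usage = shutil.disk_usage(path)
    return {
        "cpu_cores": os.cpu_count(),  # may be None on unusual platforms
        "disk_total_gb": round(usage.total / 1e9, 1),
        "disk_free_gb": round(usage.free / 1e9, 1),
    }


def suggest_workers(cpu_cores, reserve=1):
    """Suggest a worker count, leaving `reserve` cores free for the system."""
    return max(1, (cpu_cores or 1) - reserve)


resources = detect_resources()
workers = suggest_workers(resources["cpu_cores"])
print(resources, "suggested workers:", workers)
```

Reserving one core is an arbitrary default for this sketch; a real recommendation would also weigh available memory and the library being used.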
geopandas
Safe 71 | Work with geospatial vector data for spatial analysis
by davila7
Analyzing geographic data requires specialized tools for handling vector geometries, coordinate systems, and spatial relationships. GeoPandas extends pandas to enable spatial operations on geometric types for efficient geospatial data manipulation.
fda-database
Low Risk 73 | Query FDA databases for regulatory data
by davila7
Access comprehensive FDA regulatory data including drugs, medical devices, food recalls, and substance information. Search adverse events, labeling, approvals, and recalls using the official openFDA API.
exploratory-data-analysis
Safe 82 | Analyze scientific data files
by davila7
Scientists need to understand the structure and quality of diverse scientific data files before analysis. This skill automatically detects file types, extracts metadata, performs statistical analysis, and generates comprehensive markdown reports for 200+ scientific formats.
excel-analysis
Safe 70 | Analyze Excel Spreadsheets with Pandas
by davila7
Manual Excel analysis takes hours of repetitive work. This skill provides ready-to-use pandas patterns for reading, analyzing, and visualizing spreadsheet data in seconds.
dnanexus-integration
Safe 70 | Build and Deploy DNAnexus Genomics Pipelines
by davila7
Managing genomics data and building analysis pipelines on DNAnexus requires learning complex APIs and patterns. This skill provides comprehensive guidance for app development, data management, and workflow execution on the DNAnexus cloud platform.
diffdock
Safe 81 | Predict protein-ligand binding poses with AI
by davila7
Predict 3D binding poses between proteins and small molecule ligands using state-of-the-art diffusion models. Generate confidence-scored predictions for structure-based drug discovery and virtual screening campaigns.
deeptools
Safe 78 | Analyze NGS data with deepTools
by davila7
Process next-generation sequencing data for ChIP-seq, RNA-seq, and ATAC-seq experiments. Convert BAM files to normalized coverage tracks and generate publication-quality visualizations including heatmaps, correlation plots, and profile graphs.
datamol
Safe 70 | Analyze molecules and compute drug properties with Python
by davila7
Working with molecular data in Python requires complex RDKit code. Datamol provides simple functions for SMILES parsing, property calculation, and compound analysis.
datacommons-client
Safe 71 | Query public statistics from Data Commons
by davila7
Accessing demographic, economic, and health data from multiple global sources requires navigating complex APIs. This skill provides complete guidance for using the Data Commons Python client to query population statistics, unemployment rates, GDP figures, and other public datasets through a unified knowledge graph.
dask
Safe 70 | Scale pandas and NumPy beyond memory with Dask
by davila7
Processing large datasets that exceed available RAM causes memory errors and slow performance. Dask provides parallel computing abstractions that scale pandas and NumPy operations to handle terabyte-scale data on laptops or clusters.