umap-learn
Safe 70
Apply UMAP for dimensionality reduction
by davila7
High-dimensional data is difficult to visualize and analyze. UMAP provides fast nonlinear dimensionality reduction that preserves both local and global structure for clear 2D/3D visualizations and effective clustering preprocessing.
transformers
Safe 72
Master Hugging Face Transformers for AI Development
by davila7
Working with transformer models requires understanding pipelines, tokenization, and fine-tuning workflows. This skill provides comprehensive guidance for using the Hugging Face Transformers library across NLP, computer vision, and audio tasks with best practices and code examples.
torch-geometric
Low Risk 73
Build Graph Neural Networks with PyTorch Geometric
by davila7
Graph Neural Networks enable learning from irregular data structures like social networks and molecules. PyTorch Geometric provides a comprehensive toolkit for building, training, and evaluating GNNs with minimal boilerplate code.
string-database
Safe 74
Query STRING protein interaction database
by davila7
Access protein-protein interaction networks from the STRING database, which covers 59M proteins and 20B interactions. Perform functional enrichment analysis, discover interaction partners, and generate network visualizations for systems biology research.
statsmodels
Safe 70
Perform statistical analysis with statsmodels
by davila7
Users need to analyze data with rigorous statistical methods. This skill provides comprehensive guidance for regression models, hypothesis testing, time series analysis, and diagnostic procedures.
shap
Safe 70
Explain model predictions with SHAP
by davila7
Machine learning models often work as black boxes. SHAP provides a unified framework to understand which features drive predictions and how much each feature contributes. This skill helps you compute feature importance, generate visualizations, debug models, and implement explainable AI in your projects.
senior-ml-engineer
Low Risk 77
Deploy production ML models with expert guidance
by davila7
Building and deploying ML systems to production requires deep expertise in MLOps, model monitoring, and scalable infrastructure. This skill provides world-class guidance for productionizing ML models, implementing RAG systems, and integrating LLMs into production workflows.
senior-data-scientist
Safe 77
Build statistical models and experiments
by davila7
Design experiments, build predictive models, and drive data-driven decisions with expert-level data science techniques. This skill provides production-grade frameworks for statistical analysis, feature engineering, and model evaluation.
senior-data-engineer
Low Risk 75
Build scalable data pipelines and ETL systems
by davila7
Design and implement production-grade data pipelines with senior-level expertise. Transform raw data into reliable, scalable analytics infrastructure using Python, SQL, Spark, and modern data stack tools.
senior-computer-vision
Low Risk 77
Build production computer vision systems
by davila7
Computer vision projects require deep expertise in architectures, optimization, and deployment. This skill provides senior-level guidance for building object detection, segmentation, and real-time vision systems with production best practices.
seaborn
Safe 70
Create Statistical Visualizations with Seaborn
by davila7
Seaborn provides complex statistical visualizations with minimal code. This skill helps you create publication-quality charts for exploratory data analysis and presentation.
scikit-survival
Safe 71
Perform survival analysis with scikit-survival
by davila7
Time-to-event analysis requires specialized methods for censored data. This skill provides complete guidance on fitting Cox models, Random Survival Forests, and Survival SVMs while properly evaluating predictions with concordance index and Brier score.
scikit-learn
Safe 76
Apply scikit-learn for ML models
by davila7
Build machine learning models quickly with scikit-learn guidance. Covers classification, regression, clustering, preprocessing, pipelines, and model evaluation with ready-to-use examples.
scanpy
Safe 80
Analyze single-cell RNA-seq data with scanpy
by davila7
Single-cell RNA-seq analysis requires complex workflows for quality control, clustering, and visualization. This skill provides complete scanpy workflows including UMAP generation, Leiden clustering, marker gene identification, and cell type annotation.
pymc-bayesian-modeling
Safe 80
Build Bayesian models with PyMC
by davila7
This skill provides tools for Bayesian statistical modeling using PyMC. It enables building hierarchical models, running MCMC sampling with NUTS, performing variational inference, and comparing models with LOO/WAIC metrics for principled uncertainty quantification.
polars
Safe 70
Master Polars for High-Performance Data Analysis
by davila7
Pandas workflows can be slow and memory-intensive on large datasets. This skill provides expert guidance on Polars, a fast DataFrame library built on Apache Arrow that delivers 10-100x performance improvements with lazy evaluation, parallel processing, and an intuitive expression API.
plotly
Safe 70
Create interactive data visualizations with Plotly
by davila7
Creating charts and visualizations is time-consuming. Plotly provides a Python library with 40+ chart types including scatter plots, heatmaps, 3D plots, and geographic maps. Generate publication-quality interactive visualizations and export to HTML or static images.
pdf-processing-pro
Low Risk 74
Extract and process PDF documents
by davila7
Processing PDF documents manually takes too much time. This toolkit provides production-ready scripts for extracting text, handling forms, extracting tables, and performing OCR on scanned documents with batch processing support.
pdf-processing
Safe 69
Extract and process PDF documents
by davila7
PDF documents contain valuable data but are difficult to process programmatically. This skill provides code patterns to extract text, tables, and form data from PDFs using Python libraries like pdfplumber and pypdf.
matplotlib
Low Risk 75
Create scientific plots and charts
by davila7
Creating publication-quality visualizations in Python requires understanding the matplotlib API, styling options, and best practices. This skill provides templates, code examples, and troubleshooting guidance for generating professional plots, charts, and 3D visualizations for research and data analysis.