🧬

pydeseq2

Name: pydeseq2
Author: K-Dense-AI

Safe 📁 Filesystem access 🌐 Network access

Analyze RNA-seq differential gene expression with PyDESeq2

Also available from: davila7

PyDESeq2 enables differential gene expression analysis from bulk RNA-seq count data. Perform statistical testing, multiple comparison correction, and generate publication-ready volcano and MA plots for your genomics research.

Supports: Claude Codex Code(CC)

🥉 74 Bronze

Download the skill ZIP

Upload it in Claude

Go to Settings → Capabilities → Skills → Upload skill

Turn it on and start using it

Test it

Using "pydeseq2". Analyze my RNA-seq data and show top differentially expressed genes

Expected result:

Analysis complete. Found 847 significant genes (padj < 0.05)
Top upregulated genes:
- GeneX: log2FC = 4.2, padj = 1.3e-15
- GeneY: log2FC = 3.8, padj = 2.7e-12
- GeneZ: log2FC = 3.5, padj = 5.1e-11
Top downregulated genes:
- GeneA: log2FC = -3.9, padj = 8.2e-14
- GeneB: log2FC = -3.1, padj = 3.4e-10
Results saved to deseq2_results.csv

Security audit

Safe

v4 • 1/17/2026

All 429 static findings are false positives. The 'weak cryptographic algorithm' flags incorrectly match 'DES' in 'DESeq2' (a statistical method name, not cryptography). The 'external_commands' flags misinterpret markdown code fences as shell execution. Filesystem access is standard data I/O for bioinformatics workflows. Network access involves only documentation URLs. This is a legitimate scientific computing skill with no malicious code.

Files scanned

1,961

Lines analyzed

Findings

Total audits

Risk factors

📁 Filesystem access (2)

scripts/run_deseq2_analysis.py:180-185
SKILL.md:211-213

🌐 Network access (1)

SKILL.md:553-556

Audited by: claude

Quality score

Architecture

100

Maintainability

Content

Community

100

Security

Spec compliance

What you can build

Compare treated vs control

Identify differentially expressed genes between experimental conditions using proper statistical testing and FDR correction for publication-ready results.

RNA-seq thesis analysis

Process RNA-seq count data, perform differential expression analysis, and generate publication-quality figures for thesis or research papers.

Batch RNA-seq processing

Automate differential expression analysis across multiple conditions or timepoints using the included command-line script.

Try these prompts

Basic DE analysis

Load my RNA-seq data from counts.csv and metadata.csv, then perform differential expression analysis comparing treated vs control samples using PyDESeq2
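A prompt like this typically resolves to a short PyDESeq2 script. The following is a minimal sketch, not the skill's own code: the function name `run_basic_de` is hypothetical, the `condition` column with levels `treated` and `control` is assumed from the prompt, and the `design` keyword assumes a recent PyDESeq2 release (older releases take `design_factors` instead).

```python
def run_basic_de(counts, metadata, design="~condition",
                 contrast=("condition", "treated", "control")):
    """Fit the model and return the results table for one contrast.

    counts:   samples x genes DataFrame of raw integer counts
    metadata: samples x covariates DataFrame sharing the index of `counts`
    """
    # Imported lazily so this module loads even where PyDESeq2 is missing.
    from pydeseq2.dds import DeseqDataSet
    from pydeseq2.ds import DeseqStats

    # Assumption: this PyDESeq2 version accepts a formula string via `design`.
    dds = DeseqDataSet(counts=counts, metadata=metadata, design=design)
    dds.deseq2()  # estimates size factors, dispersions, and log fold changes

    # contrast = [factor, tested level, reference level]
    stats = DeseqStats(dds, contrast=list(contrast))
    stats.summary()  # runs the Wald tests and fills stats.results_df
    return stats.results_df.sort_values("padj")
```

The function expects the two tables to be loaded already, for example with `pandas.read_csv(..., index_col=0)`, and the count table transposed first if its rows are genes.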

Multi-factor design

Analyze my RNA-seq data accounting for batch effects using design formula ~batch + condition, then test for treatment vs control differences

Generate visualizations

Run PyDESeq2 analysis on my data and create volcano and MA plots highlighting significant genes with padj < 0.05
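Both plots are derived from columns of the results table. As a rough sketch, assuming the usual `baseMean`, `log2FoldChange`, and `padj` column names and using invented toy values, the plotting coordinates and the highlight groups could be prepared like this (the helper name `plot_coordinates` is hypothetical):

```python
import numpy as np
import pandas as pd

# Toy results table; the numbers are invented for illustration only.
results = pd.DataFrame(
    {
        "baseMean": [150.0, 80.0, 1200.0, 15.0],
        "log2FoldChange": [2.5, -1.8, 0.2, 3.0],
        "padj": [1e-6, 0.003, 0.60, np.nan],
    },
    index=["gene1", "gene2", "gene3", "gene4"],
)


def plot_coordinates(res, alpha=0.05):
    """Add volcano/MA axes and a highlight label to a results table."""
    out = res.copy()
    out["neg_log10_padj"] = -np.log10(out["padj"])      # volcano y-axis
    out["log10_base_mean"] = np.log10(out["baseMean"])  # MA x-axis
    significant = out["padj"] < alpha  # a missing padj compares as False
    out["status"] = np.where(
        ~significant,
        "not significant",
        np.where(out["log2FoldChange"] > 0, "up", "down"),
    )
    return out


coords = plot_coordinates(results)
print(coords[["log2FoldChange", "neg_log10_padj", "status"]])
```

A volcano plot then scatters `log2FoldChange` against `neg_log10_padj`, and an MA plot scatters `log10_base_mean` against `log2FoldChange`, each colored by `status`.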

Advanced filtering

Load RNA-seq data, filter genes with fewer than 20 total counts, use multi-factor design ~age + sex + condition, and identify genes with |log2FC| > 1 and padj < 0.01
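The two filters in this prompt act at different stages: the count filter runs before the model is fitted, and the fold-change and padj thresholds run on the results table afterwards. A small sketch with invented toy data, assuming a samples × genes count matrix and the usual `log2FoldChange` and `padj` result columns (the helper name `select_hits` is hypothetical):

```python
import pandas as pd

# Toy samples x genes count matrix (invented values).
counts = pd.DataFrame(
    {"geneA": [30, 42, 25, 38], "geneB": [3, 0, 5, 2], "geneC": [10, 12, 9, 11]},
    index=["s1", "s2", "s3", "s4"],
)

# Step 1, before fitting: keep genes with at least 20 counts in total.
gene_totals = counts.sum(axis=0)
filtered_counts = counts.loc[:, gene_totals >= 20]

# Toy results table standing in for the output of the fitted model.
results = pd.DataFrame(
    {
        "log2FoldChange": [2.3, -1.6, 0.4, 3.1],
        "padj": [1e-5, 0.004, 1e-8, 0.03],
    },
    index=["g1", "g2", "g3", "g4"],
)


def select_hits(res, lfc_threshold=1.0, alpha=0.01):
    """Step 2, after testing: require both effect size and significance."""
    keep = (res["log2FoldChange"].abs() > lfc_threshold) & (res["padj"] < alpha)
    return res.loc[keep].sort_values("padj")


hits = select_hits(results)
print(list(filtered_counts.columns), list(hits.index))
```

In the toy data, g3 is dropped for its small fold change despite a tiny padj, and g4 is dropped because its padj misses the 0.01 cutoff.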

Best practices

Transpose the count matrix if genes are rows (use .T to get the samples × genes format)
Filter low-count genes before analysis to improve statistical power
Use adjusted p-values (padj) not raw p-values for determining significance
Check that sample names match exactly between counts and metadata files
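The orientation and sample-name checks above can be sketched in a few lines of pandas. The table contents and sample names here are invented, and the layout (genes as rows in the count file, one `condition` column in the metadata) is an assumption for illustration:

```python
import pandas as pd

# Toy genes x samples table, a common layout for count files (invented values).
raw = pd.DataFrame(
    {"sample2": [7, 0, 88, 4], "sample1": [5, 1, 90, 6], "sample3": [9, 2, 75, 3]},
    index=["geneA", "geneB", "geneC", "geneD"],
)
metadata = pd.DataFrame(
    {"condition": ["control", "treated", "treated"]},
    index=["sample1", "sample2", "sample3"],
)

# Genes are rows here, so transpose to the samples x genes format.
counts = raw.T

# Fail early if the two tables do not describe the same set of samples.
if set(counts.index) != set(metadata.index):
    raise ValueError("Sample names differ between counts and metadata")

# Reorder the metadata rows to follow the count matrix; metadata is not transposed.
metadata = metadata.loc[counts.index]

print(counts.shape)
print(metadata)
```

Reordering the metadata, rather than relying on file order, keeps each row of covariates attached to the right row of counts.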

Avoid

Never call significance from raw p-values when testing many genes; use the FDR-corrected padj values
Do not apply LFC shrinkage before statistical testing; apply it afterwards, for visualization and ranking only
Avoid complex multi-factor designs without sufficient sample size per condition
Never transpose the metadata; transpose only the count matrix, and only if needed

FAQ

Why do I get an index mismatch error?

Sample names in counts and metadata files do not match. Ensure both files use identical sample identifiers in the same format.
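To find out which identifiers disagree, it helps to list the names that appear on only one side. A minimal sketch in plain Python, with invented sample names; the `normalize` rule (strip whitespace, unify `-` and `_`) is a hypothetical cleanup that may not suit every naming scheme:

```python
def compare_sample_ids(count_ids, metadata_ids):
    """Report identifiers that appear in only one of the two tables."""
    count_set, meta_set = set(count_ids), set(metadata_ids)
    return {
        "only_in_counts": sorted(count_set - meta_set),
        "only_in_metadata": sorted(meta_set - count_set),
    }


def normalize(sample_id):
    """Hypothetical cleanup: trim whitespace and unify '-' with '_'."""
    return sample_id.strip().replace("-", "_")


count_ids = ["ctrl_1", "ctrl_2", "trt_1", "trt_2"]
metadata_ids = ["ctrl-1", "ctrl_2 ", "trt_1", "trt_2"]  # a dash and a trailing space

before = compare_sample_ids(count_ids, metadata_ids)
after = compare_sample_ids(
    [normalize(s) for s in count_ids],
    [normalize(s) for s in metadata_ids],
)
print(before)
print(after)
```

With pandas tables, the two input lists would usually be the index of the count matrix and the index of the metadata.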

Should I transpose my count matrix?

If your CSV has genes as rows and samples as columns, transpose with .T to get the required samples × genes format.

What is the difference between pvalue and padj?

pvalue is the raw statistical p-value; padj is the FDR-corrected value for multiple testing. Use padj < 0.05 for significance.
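To make the difference concrete, here is a plain-Python sketch of the Benjamini-Hochberg adjustment, the standard FDR procedure. It illustrates the idea only: the p-values are invented, and padj values reported by DESeq2-style tools can differ from this bare calculation because genes may be filtered out before adjustment.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values (keeps them monotone).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvalues = [0.03, 0.001, 0.6, 0.045, 0.035]  # invented raw p-values
padj = benjamini_hochberg(pvalues)

raw_hits = sum(p < 0.05 for p in pvalues)
adjusted_hits = sum(q < 0.05 for q in padj)
print(raw_hits, adjusted_hits)
```

In this toy set, four raw p-values fall below 0.05 but only one adjusted value does, which is why thresholds should be applied to padj.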

When should I use LFC shrinkage?

Apply LFC shrinkage after statistical testing for visualization, ranking genes, or creating heatmaps. Do not use for significance determination.

How do I handle batch effects in my analysis?

Include batch in your design formula as ~batch + condition. This controls for technical variation while testing biological differences.
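A batch term only helps if batch and condition are not confounded: when every batch contains a single condition, the two effects cannot be separated. A quick pandas check of samples per batch and condition, sketched on invented metadata with assumed column names `batch` and `condition`:

```python
import pandas as pd

# Toy metadata; sample names and group assignments are invented.
metadata = pd.DataFrame(
    {
        "batch": ["b1", "b1", "b1", "b2", "b2", "b2"],
        "condition": ["control", "control", "treated",
                      "control", "treated", "treated"],
    },
    index=[f"sample{i}" for i in range(1, 7)],
)

# Samples per batch x condition cell.
table = pd.crosstab(metadata["batch"], metadata["condition"])

# An empty cell means some batch never saw one of the conditions.
has_empty_cells = bool((table == 0).any().any())

print(table)
print("empty cells:", has_empty_cells)
```

Here every batch holds both conditions, so `~batch + condition` can estimate the two effects separately; small cell counts still limit power.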

Why are no genes significant in my analysis?

Check your sample size, effect sizes, and biological variability. Small studies or subtle effects may yield few significant genes.

Developer details

Author

K-Dense-AI

License

MIT license

Repository

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/pydeseq2

Ref

main

File structure

📁 references/

📄 api_reference.md

📄 workflow_guide.md

📁 scripts/

📄 run_deseq2_analysis.py

📄 SKILL.md