geo-database
Access NCBI GEO gene expression data
Also available from: davila7
Researchers need efficient access to gene expression datasets for analysis. This skill enables querying, downloading, and analyzing data from NCBI's GEO database containing millions of genomics samples.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Turn on the toggle and start using it
Try it out
Using "geo-database". Search for diabetes gene expression datasets in humans
Expected result:
- Found 1,247 datasets matching 'diabetes AND Homo sapiens'
- Top results:
  - GSE12345: Type 2 diabetes gene expression (47 samples)
  - GSE67890: Diabetic nephropathy study (32 samples)
  - GSE11111: Insulin response time course (24 samples)
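A search like the one above can be expressed as an NCBI E-utilities `esearch` request against the `gds` database. The sketch below only assembles the request URL and does not contact NCBI. The helper name `build_geo_search_url`, the `tool` label, and the placeholder email address are assumptions made for illustration, not part of the skill.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(keywords, organism, email, retmax=20, api_key=None):
    """Build an esearch URL for GEO series matching keywords and organism.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    # gse[Entry Type] restricts hits to series records in the gds database.
    term = f'{keywords} AND "{organism}"[Organism] AND gse[Entry Type]'
    params = {
        "db": "gds",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
        "email": email,
        "tool": "geo-database-example",  # assumed label, choose your own
    }
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_geo_search_url("diabetes", "Homo sapiens", "you@example.org")
print(url)
```

Fetching this URL should return a JSON document whose ID list can then be passed to `esummary` for titles and sample counts; that second step is left out here.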
Using "geo-database". Download GSE12345 and extract metadata
Expected result:
- Downloaded GSE12345_series_matrix.txt.gz (145 MB)
- Dataset summary:
  - Title: Transcriptome profiling of diabetic vs normal kidney
  - Samples: 20 (10 diabetic, 10 control)
  - Platform: GPL570 (Affymetrix Human Genome U133 Plus 2.0)
  - Organism: Homo sapiens
  - Submission date: 2023-06-15
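The download and metadata steps can be sketched in two parts: locating the series matrix file and reading its header. The example assumes the usual GEO layout, in which the last three digits of the accession are replaced by `nnn` in the parent directory name, and the series matrix format, in which metadata lines start with `!` and the expression table sits between `!series_matrix_table_begin` and `!series_matrix_table_end`. The function names are hypothetical, and series that span several platforms may use per-platform file names that this sketch does not handle.

```python
import re

GEO_SERIES_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def series_matrix_url(accession):
    """Return the expected series matrix URL for a GSE accession."""
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GSE accession: {accession!r}")
    digits = match.group(1)
    # GSE12345 lives under GSE12nnn; GSE123 lives under GSEnnn.
    stub = "GSE" + digits[:-3] + "nnn"
    return (
        f"{GEO_SERIES_ROOT}/{stub}/{accession}/matrix/"
        f"{accession}_series_matrix.txt"
        ".gz"
    )


def parse_series_matrix(text):
    """Split decompressed series matrix text into metadata and table rows."""
    metadata = {}
    rows = []
    in_table = False
    for line in text.splitlines():
        if line == "!series_matrix_table_begin":
            in_table = True
            continue
        if line == "!series_matrix_table_end":
            in_table = False
            continue
        fields = [field.strip('"') for field in line.split("\t")]
        if in_table:
            rows.append(fields)
        elif line.startswith("!") and len(fields) > 1:
            # Keys can repeat (for example sample characteristics), so collect lists.
            metadata.setdefault(fields[0][1:], []).extend(fields[1:])
    return metadata, rows


print(series_matrix_url("GSE12345"))
```

After downloading, the file can be decompressed with `gzip.open(path, "rt")` and its contents passed to `parse_series_matrix`; the first table row holds the sample identifiers.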
Security audit
Low risk
Documentation-only skill for accessing the NCBI GEO database. Static analysis flagged 256 pattern-based issues, all of them false positives: the 'backtick execution' findings are markdown code block syntax rather than shell commands, the network operations are legitimate NCBI API access, and the FTP downloads target public GEO data repositories. Optional API key usage follows NCBI best practices. No executable code is present, only documentation.
Risk factors
Network access (3)
External commands (3)
File system access (1)
What you can build
Analyze gene expression in disease
Download and compare gene expression data between healthy and diseased tissue samples to identify biomarkers.
Meta-analysis across studies
Combine data from multiple GEO studies to increase statistical power for detecting gene expression changes.
Build predictive models
Use GEO expression data to train machine learning models for drug response prediction or patient stratification.
Try these prompts
Search GEO for human breast cancer gene expression datasets from the last 5 years. Show the top 5 results with sample counts and platforms used.
Download the expression matrix and metadata for GSE12345. Save the files to ./data/ and show a summary of the dataset including number of samples and genes.
Perform differential expression analysis on GSE12345 comparing treatment vs control samples. Use limma or t-test and show the top 10 most significant genes.
Download and process these 3 GEO series: GSE100001, GSE100002, GSE100003. Extract expression data and create a summary table with study metadata.
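For the differential expression prompt, the t-test option can be illustrated with a standard-library sketch that ranks genes by Welch's t statistic. This is an assumption-laden toy: the function names are hypothetical, the expression values are synthetic and invented for illustration, no p-values or multiple-testing correction are computed, and it is not a substitute for limma's moderated statistics.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (statistic only)."""
    diff = mean(group_a) - mean(group_b)
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    if standard_error == 0:
        # No within-group spread: the statistic is 0 or unbounded.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / standard_error


def rank_genes(expression, treatment_ids, control_ids, top_n=10):
    """Return the top_n (gene, t) pairs ordered by absolute t statistic."""
    scored = []
    for gene, values in expression.items():
        treated = [values[sample] for sample in treatment_ids]
        control = [values[sample] for sample in control_ids]
        scored.append((gene, welch_t(treated, control)))
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


# Synthetic log2 expression values, invented purely for illustration.
toy_expression = {
    "gene_a": {"T1": 8.1, "T2": 8.3, "T3": 7.9, "C1": 5.0, "C2": 5.2, "C3": 4.9},
    "gene_b": {"T1": 6.0, "T2": 6.1, "T3": 5.9, "C1": 6.0, "C2": 5.9, "C3": 6.1},
}
ranked = rank_genes(toy_expression, ["T1", "T2", "T3"], ["C1", "C2", "C3"])
for gene, t_value in ranked:
    print(gene, round(t_value, 2))
```

With real GEO data, the treatment and control sample lists would come from the series metadata rather than being typed by hand.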
Best practices
- Always set your email when using NCBI E-utilities (required by NCBI policy)
- Obtain a free API key from NCBI for increased rate limits (10 req/s vs 3 req/s)
- Cache downloaded GEO files locally to avoid repeated downloads
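These three practices can be combined in a small amount of glue code. The sketch below is one possible arrangement, not the skill's implementation: the helper names are hypothetical, and reading the key from an environment variable named `NCBI_API_KEY` is an assumed convention. The request intervals follow the 3 req/s and 10 req/s limits quoted above.

```python
import os
import time
from pathlib import Path


def eutils_params(email, key_env="NCBI_API_KEY"):
    """Identity parameters to attach to every E-utilities request."""
    params = {"email": email}
    api_key = os.environ.get(key_env)  # assumed variable name; never hardcode the key
    if api_key:
        params["api_key"] = api_key
    return params


def min_interval(params):
    """Seconds to leave between requests: 10 req/s with a key, 3 req/s without."""
    return 1 / 10 if "api_key" in params else 1 / 3


class Throttle:
    """Sleep as needed so consecutive calls are at least `interval` apart."""

    def __init__(self, interval):
        self.interval = interval
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def needs_download(cache_dir, filename):
    """True when the file is absent from the local cache or empty."""
    path = Path(cache_dir) / filename
    return not (path.exists() and path.stat().st_size > 0)


params = eutils_params("you@example.org")
throttle = Throttle(min_interval(params))
print(sorted(params), round(throttle.interval, 3))
```

Calling `throttle.wait()` before each request and checking `needs_download` before each file transfer keeps a script within the rate limits and avoids refetching data already on disk.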
Avoid
- Do not download the entire GEO database; be selective with accessions
- Do not hardcode API keys in shared or version-controlled code
- Do not ignore sample metadata when interpreting expression data
Frequently asked questions
Do I need an API key for GEO access?
What is the difference between GSE, GSM, and GPL?
Why is expression data missing for some series?
How do I handle very large GEO datasets?
Can I use GEO data for clinical research?
What file format should I use for expression data?
Developer details
Author
K-Dense-AI
License
MIT
Repository
https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/geo-database
Ref
main
File structure