pytdc

Safe ⚙️ External commands 🌐 Network access

Access Drug Discovery Datasets with PyTDC

Also available from: davila7

Drug discovery researchers need standardized datasets for training ML models. PyTDC provides curated ADME, toxicity, and drug-target interaction datasets with proper train-test splits and evaluation oracles.

Supports: Claude, Codex, Code (CC)
🥉 76 Bronze
1. Download the skill ZIP
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Enable it and start using it

Try it

Using "pytdc". Load the AMES toxicity dataset and show me the data format

Expected result:

  • Dataset loaded with 7,255 compounds for mutagenicity prediction
  • Columns include Drug_ID, Drug (SMILES), and Y (binary toxicity label)
  • Scaffold split applied: 5,078 train, 725 validation, 1,452 test molecules
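
A minimal sketch of how this request might translate into code, assuming the `Tox` loader in `tdc.single_pred` and its `get_split(method="scaffold")` call behave as described in the PyTDC documentation. The helper names `load_ames_scaffold_split` and `split_fractions` are illustrative, not part of PyTDC, and the import is deferred so the size check at the bottom runs without PyTDC installed.

```python
def load_ames_scaffold_split(seed=42):
    """Load AMES and return its scaffold split (needs pytdc and network access)."""
    # Deferred import: only required when the dataset is actually downloaded.
    from tdc.single_pred import Tox

    data = Tox(name="AMES")
    # Assumed to return a dict of DataFrames keyed by 'train', 'valid', 'test',
    # each with Drug_ID, Drug (SMILES) and Y columns.
    return data.get_split(method="scaffold", seed=seed, frac=[0.7, 0.1, 0.2])


def split_fractions(sizes):
    """Turn split sizes into fractions of the whole dataset."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# Split sizes quoted in the expected result above.
expected_sizes = {"train": 5078, "valid": 725, "test": 1452}
fractions = split_fractions(expected_sizes)

print(sum(expected_sizes.values()))
print({name: round(frac, 2) for name, frac in fractions.items()})
```

The quoted sizes add up to the 7,255 compounds in the dataset and correspond to roughly a 70/10/20 split.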

Using "pytdc". Evaluate this molecule with the GSK3B oracle: CC(C)Cc1ccc(cc1)C(C)C(O)=O

Expected result:

  • GSK3B binding score: 0.0234 (low predicted affinity)
  • This SMILES represents ibuprofen, not expected to inhibit GSK3B
  • Scores range from 0 to 1, with higher values indicating stronger predicted binding
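
The oracle call could be sketched roughly as follows, assuming `tdc.Oracle` accepts an oracle name and returns a numeric score when called with a SMILES string. The wrapper `score_smiles` and the `stub_oracle` stand-in are hypothetical; the stub exists only so the example runs offline and is not a real predictor.

```python
def score_smiles(smiles_list, oracle=None, oracle_name="GSK3B"):
    """Score SMILES strings with an oracle and return {smiles: score}."""
    if oracle is None:
        # Deferred import: building a TDC oracle may download a model file.
        from tdc import Oracle

        oracle = Oracle(name=oracle_name)
    return {smiles: float(oracle(smiles)) for smiles in smiles_list}


def stub_oracle(smiles):
    """Placeholder that always returns 0.0; swap in a real oracle for actual use."""
    return 0.0


ibuprofen = "CC(C)Cc1ccc(cc1)C(C)C(O)=O"
scores = score_smiles([ibuprofen], oracle=stub_oracle)
print(scores)
```

Calling `score_smiles([ibuprofen])` without the `oracle` argument would build the GSK3B oracle through PyTDC instead of using the stub.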

Security audit

Safe
v4 • 1/17/2026

This skill provides documentation and templates for PyTDC, a legitimate drug discovery dataset library. All 427 static findings are false positives caused by markdown code blocks containing Python examples (detected as shell backticks), scientific terminology (DRD2, GSK3B detected as C2 keywords), and molecular/cryptographic naming overlaps. No actual security risks present.

  • Files scanned: 9
  • Lines analyzed: 3,184
  • Findings: 2
  • Total audits: 4

Risk factors

⚙️ External commands (339)
EVALUATION_OUTPUT.json:24 references/datasets.md:10 references/datasets.md:11 references/datasets.md:12 references/datasets.md:13 references/datasets.md:14 references/datasets.md:15 references/datasets.md:16 references/datasets.md:19 references/datasets.md:20 references/datasets.md:21 references/datasets.md:24 references/datasets.md:25 references/datasets.md:26 references/datasets.md:27 references/datasets.md:28 references/datasets.md:29 references/datasets.md:30 references/datasets.md:31 references/datasets.md:34 references/datasets.md:35 references/datasets.md:36 references/datasets.md:39 references/datasets.md:40 references/datasets.md:41 references/datasets.md:46 references/datasets.md:47 references/datasets.md:48 references/datasets.md:49 references/datasets.md:50 references/datasets.md:51 references/datasets.md:54 references/datasets.md:55 references/datasets.md:56 references/datasets.md:57 references/datasets.md:58 references/datasets.md:59 references/datasets.md:62 references/datasets.md:63 references/datasets.md:64 references/datasets.md:65 references/datasets.md:66 references/datasets.md:67 references/datasets.md:68 references/datasets.md:69 references/datasets.md:70 references/datasets.md:71 references/datasets.md:72 references/datasets.md:73 references/datasets.md:78 references/datasets.md:79 references/datasets.md:80 references/datasets.md:83 references/datasets.md:84 references/datasets.md:85 references/datasets.md:86 references/datasets.md:87 references/datasets.md:91 references/datasets.md:92 references/datasets.md:93 references/datasets.md:97 references/datasets.md:98 references/datasets.md:102 references/datasets.md:103 references/datasets.md:107 references/datasets.md:108 references/datasets.md:112 references/datasets.md:119 references/datasets.md:120 references/datasets.md:121 references/datasets.md:124 references/datasets.md:125 references/datasets.md:128 references/datasets.md:129 references/datasets.md:133 references/datasets.md:134 
references/datasets.md:138 references/datasets.md:139 references/datasets.md:143 references/datasets.md:144 references/datasets.md:148 references/datasets.md:149 references/datasets.md:153 references/datasets.md:154 references/datasets.md:155 references/datasets.md:159 references/datasets.md:160 references/datasets.md:164 references/datasets.md:168 references/datasets.md:172 references/datasets.md:176 references/datasets.md:182 references/datasets.md:183 references/datasets.md:184 references/datasets.md:185 references/datasets.md:189 references/datasets.md:190 references/datasets.md:194 references/datasets.md:195 references/datasets.md:201-209 references/datasets.md:209-215 references/datasets.md:215-224 references/datasets.md:224-230 references/datasets.md:230-239 references/oracles.md:16-27 references/oracles.md:27-40 references/oracles.md:40-43 references/oracles.md:43-49 references/oracles.md:49-52 references/oracles.md:52-58 references/oracles.md:58-61 references/oracles.md:61-67 references/oracles.md:67-70 references/oracles.md:70-76 references/oracles.md:76-79 references/oracles.md:79-85 references/oracles.md:85-88 references/oracles.md:88-93 references/oracles.md:93-96 references/oracles.md:96-101 references/oracles.md:101-104 references/oracles.md:104-109 references/oracles.md:109-112 references/oracles.md:112-117 references/oracles.md:117-120 references/oracles.md:120-131 references/oracles.md:131-134 references/oracles.md:134-140 references/oracles.md:140-143 references/oracles.md:143-151 references/oracles.md:151-154 references/oracles.md:154-160 references/oracles.md:160-163 references/oracles.md:163-169 references/oracles.md:169-172 references/oracles.md:172-181 references/oracles.md:181-184 references/oracles.md:184-189 references/oracles.md:189-192 references/oracles.md:192-197 references/oracles.md:197-200 references/oracles.md:200-205 references/oracles.md:205-208 references/oracles.md:208-213 references/oracles.md:213-216 
references/oracles.md:216-221 references/oracles.md:221-224 references/oracles.md:224-231 references/oracles.md:231-234 references/oracles.md:234-240 references/oracles.md:240-243 references/oracles.md:243-249 references/oracles.md:249-252 references/oracles.md:252-261 references/oracles.md:261-280 references/oracles.md:280-295 references/oracles.md:295-300 references/oracles.md:300-305 references/oracles.md:305-327 references/oracles.md:327-330 references/oracles.md:330-350 references/oracles.md:350-354 references/oracles.md:354-366 references/oracles.md:366-380 references/oracles.md:380-394 references/utilities.md:19-34 references/utilities.md:34-41 references/utilities.md:41-43 references/utilities.md:43-57 references/utilities.md:57-59 references/utilities.md:59-76 references/utilities.md:76-80 references/utilities.md:80-85 references/utilities.md:85-87 references/utilities.md:87-92 references/utilities.md:92-94 references/utilities.md:94-101 references/utilities.md:101-103 references/utilities.md:103-112 references/utilities.md:112-118 references/utilities.md:118-124 references/utilities.md:124-126 references/utilities.md:126-136 references/utilities.md:136-144 references/utilities.md:144-151 references/utilities.md:151-154 references/utilities.md:154-166 references/utilities.md:166-169 references/utilities.md:169-181 references/utilities.md:181-184 references/utilities.md:184-195 references/utilities.md:195-198 references/utilities.md:198-209 references/utilities.md:209-212 references/utilities.md:212-219 references/utilities.md:219-222 references/utilities.md:222-231 references/utilities.md:231-234 references/utilities.md:234-243 references/utilities.md:243-246 references/utilities.md:246-255 references/utilities.md:255-258 references/utilities.md:258-267 references/utilities.md:267-270 references/utilities.md:270-282 references/utilities.md:282-285 references/utilities.md:285-295 references/utilities.md:295-298 references/utilities.md:298-300 
references/utilities.md:300 references/utilities.md:300 references/utilities.md:300 references/utilities.md:300-306 references/utilities.md:306-321 references/utilities.md:321-331 references/utilities.md:331-345 references/utilities.md:345-354 references/utilities.md:354-357 references/utilities.md:357-363 references/utilities.md:363-379 references/utilities.md:379-382 references/utilities.md:382-383 references/utilities.md:383-384 references/utilities.md:384-385 references/utilities.md:385-386 references/utilities.md:386-387 references/utilities.md:387-391 references/utilities.md:391-397 references/utilities.md:397-405 references/utilities.md:405-412 references/utilities.md:412-418 references/utilities.md:418-426 references/utilities.md:426-437 references/utilities.md:437-442 references/utilities.md:442-448 references/utilities.md:448-456 references/utilities.md:456-462 references/utilities.md:462-470 references/utilities.md:470-476 references/utilities.md:476-486 references/utilities.md:486-498 references/utilities.md:498-503 references/utilities.md:503-506 references/utilities.md:506-511 references/utilities.md:511-514 references/utilities.md:514-520 references/utilities.md:520-526 references/utilities.md:526-535 references/utilities.md:535-541 references/utilities.md:541-548 references/utilities.md:548-552 references/utilities.md:552-564 references/utilities.md:564-568 references/utilities.md:568-580 references/utilities.md:580-586 references/utilities.md:586-615 references/utilities.md:615-619 references/utilities.md:619-637 references/utilities.md:637-641 references/utilities.md:641-661 SKILL.md:30-32 SKILL.md:32-36 SKILL.md:36-38 SKILL.md:38-49 SKILL.md:49-54 SKILL.md:54-57 SKILL.md:57 SKILL.md:57 SKILL.md:57 SKILL.md:57-58 SKILL.md:58-59 SKILL.md:59-63 SKILL.md:63-68 SKILL.md:68-80 SKILL.md:80-84 SKILL.md:84-99 SKILL.md:99-103 SKILL.md:103-116 SKILL.md:116-119 SKILL.md:119-125 SKILL.md:125-128 SKILL.md:128-140 SKILL.md:140 SKILL.md:140-141 SKILL.md:141 
SKILL.md:141-142 SKILL.md:142-154 SKILL.md:154-158 SKILL.md:158-172 SKILL.md:172-176 SKILL.md:176-184 SKILL.md:184-187 SKILL.md:187-208 SKILL.md:208-212 SKILL.md:212-216 SKILL.md:216-220 SKILL.md:220-222 SKILL.md:222-228 SKILL.md:228-232 SKILL.md:232-240 SKILL.md:240-243 SKILL.md:243-245 SKILL.md:245 SKILL.md:245-253 SKILL.md:253-268 SKILL.md:268-280 SKILL.md:280-290 SKILL.md:290-300 SKILL.md:300-303 SKILL.md:303-304 SKILL.md:304-305 SKILL.md:305 SKILL.md:305 SKILL.md:305-306 SKILL.md:306-312 SKILL.md:312-322 SKILL.md:322-330 SKILL.md:330-336 SKILL.md:336-347 SKILL.md:347-353 SKILL.md:353-363 SKILL.md:363-365 SKILL.md:365-371 SKILL.md:371-379 SKILL.md:379-383 SKILL.md:383-390 SKILL.md:390-394 SKILL.md:394-402 SKILL.md:402-408 SKILL.md:408-410 SKILL.md:410-426 SKILL.md:426-430 SKILL.md:430-434 SKILL.md:434-442 SKILL.md:442-443 SKILL.md:443-444 SKILL.md:444-448 SKILL.md:448-449 SKILL.md:449-450
🌐 Network access (15)
Audited by: claude

Quality score

  • Architecture: 68
  • Maintainability: 100
  • Content: 87
  • Community: 20
  • Security: 100
  • Spec compliance: 83

What you can build

Train ADME Prediction Models

Load Caco-2 permeability data with scaffold splits, train molecular property predictors, and evaluate with standard metrics.

Evaluate Toxicity Predictors

Access hERG, AMES, and DILI toxicity datasets with benchmark protocols to validate safety prediction models.

Generate Novel Drug Candidates

Use molecular oracles like GSK3B and DRD2 to guide generative models toward compounds with desired biological activity.

Try these prompts

Load ADME Dataset
Help me load the Caco2_Wang dataset from TDC with scaffold splitting for training an intestinal permeability predictor.
Run Benchmark Evaluation
Show me how to evaluate my ADME model using the TDC benchmark group with the required 5-seed protocol.
Use Molecular Oracles
I want to evaluate generated SMILES strings using TDC oracles for QED, SA, and GSK3B properties. Show me the workflow.
Drug-Target Interaction Modeling
Load the BindingDB_Kd dataset with cold-drug splitting to ensure my model generalizes to unseen drug compounds.

Best practices

  • Use scaffold splits instead of random splits for realistic model evaluation on novel chemical scaffolds
  • Run benchmark evaluations with all 5 required seeds to report mean and standard deviation performance
  • Combine multiple oracles with weighted scoring for multi-objective molecular optimization
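
One way to read the weighted-scoring advice is as a weighted average over per-oracle scores. The sketch below assumes every oracle already returns a value in [0, 1] where higher is better; raw synthetic accessibility (SA) scores do not follow that convention and would need rescaling first. The function name, the stub oracles, and the weights are all hypothetical.

```python
def weighted_score(smiles, oracles, weights):
    """Weighted average of oracle scores for one SMILES string."""
    if set(oracles) != set(weights):
        raise ValueError("oracles and weights must share the same keys")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * oracles[name](smiles) for name in oracles)
    return weighted_sum / total_weight


# Stand-in oracles with made-up constant outputs, for illustration only.
stub_oracles = {
    "QED": lambda smiles: 0.75,
    "GSK3B": lambda smiles: 0.25,
}
stub_weights = {"QED": 0.5, "GSK3B": 0.5}

combined = weighted_score("CC(C)Cc1ccc(cc1)C(C)C(O)=O", stub_oracles, stub_weights)
print(combined)
```

Dividing by the total weight keeps the combined score in the same range as the inputs, so the weights do not need to sum to 1.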

Avoid

  • Avoid random splits for production ADME models as they overestimate performance on similar molecules
  • Do not report single-seed benchmark results as they may not reflect true model variance
  • Avoid using oracles as ground truth labels for training since they are predictive models themselves

Frequently asked questions

What datasets are available in PyTDC?
PyTDC includes 60+ datasets covering ADME, toxicity, drug-target interactions, drug-drug interactions, and molecular generation tasks for therapeutic ML.
What is a scaffold split and why use it?
Scaffold splits group molecules by chemical scaffold so test molecules have different core structures than training molecules, simulating real-world generalization.
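
To illustrate the grouping idea, here is a simplified scaffold split in plain Python. It is a sketch of the general technique, not PyTDC's own implementation: scaffolds are assumed to be precomputed (for example as Bemis-Murcko scaffolds), and the molecule IDs and scaffold strings are made-up sample data.

```python
from collections import defaultdict


def scaffold_split(records, frac=(0.7, 0.1, 0.2)):
    """Split (molecule_id, scaffold) pairs so no scaffold spans two subsets."""
    groups = defaultdict(list)
    for molecule_id, scaffold in records:
        groups[scaffold].append(molecule_id)

    # Largest scaffold families first, so they fill the training set.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n = len(records)
    train_cap = round(frac[0] * n)
    valid_cap = round((frac[0] + frac[1]) * n)

    split = {"train": [], "valid": [], "test": []}
    for group in ordered:
        if len(split["train"]) + len(group) <= train_cap:
            split["train"].extend(group)
        elif len(split["train"]) + len(split["valid"]) + len(group) <= valid_cap:
            split["valid"].extend(group)
        else:
            split["test"].extend(group)
    return split


# Made-up sample: 10 molecules spread over 4 scaffolds.
sample_scaffolds = ["c1ccccc1"] * 5 + ["c1ccncc1"] * 2 + ["C1CCCCC1"] * 2 + ["c1ccoc1"]
records = [(f"m{i}", scaffold) for i, scaffold in enumerate(sample_scaffolds)]
scaffold_of = dict(records)

split = scaffold_split(records)
print({name: len(ids) for name, ids in split.items()})
```

Because whole scaffold groups are assigned at once, the realized subset sizes only approximate the requested fractions.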
How do molecular oracles work?
Oracles are pre-trained models that score SMILES strings for properties like drug-likeness (QED), synthetic accessibility (SA), or target binding (GSK3B, DRD2).
What is the 5-seed protocol for benchmarks?
TDC benchmarks require evaluation with 5 different random seeds to compute mean and standard deviation, ensuring robust performance comparisons.
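
The aggregation step of such a protocol can be sketched in a few lines. The per-seed scores below are made-up numbers, not benchmark results, and `summarize_seeds` is a hypothetical helper; in practice the scores would come from PyTDC's benchmark group evaluation. The sketch uses the sample standard deviation, which is an assumption; check which convention the leaderboard expects.

```python
import statistics


def summarize_seeds(scores_by_seed, required_seeds=5):
    """Return (mean, sample standard deviation) over per-seed scores."""
    if len(scores_by_seed) != required_seeds:
        raise ValueError(
            f"expected {required_seeds} seeds, got {len(scores_by_seed)}"
        )
    values = list(scores_by_seed.values())
    return statistics.mean(values), statistics.stdev(values)


# Made-up scores for five seeds, for illustration only.
example_scores = {1: 0.30, 2: 0.32, 3: 0.31, 4: 0.29, 5: 0.33}
mean_score, std_score = summarize_seeds(example_scores)
print(f"{mean_score:.3f} ± {std_score:.3f}")
```

Rejecting runs with fewer than five seeds makes it harder to report a single-seed result by accident.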
Can I use PyTDC with PyTorch Geometric or DGL?
Yes, TDC provides MolConvert utilities to transform SMILES into PyG graphs, DGL graphs, or other molecular representations like ECFP fingerprints.
What is a cold-drug split for DTI prediction?
Cold-drug splits ensure test set drugs never appear in training, measuring how well models predict binding for completely novel drug compounds.
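
The idea behind a cold-drug split can be shown without any library: hold out whole drugs rather than individual drug-target pairs. The sketch below is a simplified illustration with made-up drug IDs, targets, and labels, and `cold_drug_split` is a hypothetical helper. PyTDC is understood to offer its own version through `get_split(method="cold_split", column_name="Drug")` on a DTI dataset, which this example does not call.

```python
import random


def cold_drug_split(pairs, test_frac=0.2, seed=0):
    """Split (drug, target, label) triples so test drugs never appear in train."""
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)

    # Hold out at least one whole drug for the test set.
    n_test = max(1, round(test_frac * len(drugs)))
    test_drugs = set(drugs[:n_test])

    train = [pair for pair in pairs if pair[0] not in test_drugs]
    test = [pair for pair in pairs if pair[0] in test_drugs]
    return train, test


# Made-up interaction data: 5 drugs, each measured against 2 targets.
pairs = [
    (f"D{d}", f"T{t}", float(d + t))
    for d in range(1, 6)
    for t in range(1, 3)
]

train_pairs, test_pairs = cold_drug_split(pairs)
print(len(train_pairs), len(test_pairs))
```

A random split over pairs would let the same drug appear on both sides, which is exactly the leakage this scheme is meant to prevent.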

Developer details

Author

K-Dense-AI

License

MIT license

Ref

main