Work to feed myself but not for satisfaction. Be satisfied with God only. Yet an attitude matters. Always be responsible.

Wednesday, March 09, 2005

Datasets

WHO’s Communicable Disease Global Atlas
http://globalatlas.who.int/

UCI Knowledge Discovery in Databases Archive
http://kdd.ics.uci.edu/

Raw input data for "small" Sequoia benchmark
http://epoch.cs.berkeley.edu:8000/sequoia/benchmark/

Kent Ridge Biomedical Data Set Repository
http://sdmc.i2r.a-star.edu.sg/rp/
Summary of the Kent Ridge Biomedical Data Set Repository:
Breast Cancer: 78+19 samples, 24481 features (genes), 2 classes;
Central Nervous System: 60 samples, 7129 genes, 2 classes;
Colon Tumor: 62 samples, 2000 genes, 2 classes;

Diffuse Large B-Cell Lymphoma (DLBCL)
DLBCL-Stanford: 47 samples, 4026 genes, 2 classes;
DLBCL-Harvard: 58+19 samples, 6817 genes, 2 classes;
DLBCL-NIH: 240 samples, 7399 microarray features, 2 classes;

Leukemia
Leukemia-ALLAML (Whitehead, MIT): 38+34 samples, 7129 probes from 6817 genes, 2 classes;
Leukemia-MLL (Whitehead, MIT): 57+15 samples, 12582 genes, 3 classes;
Leukemia-subtype (St. Jude): 215+112 samples, 12558 genes, 7 classes;

Lung Cancer
LungCancer-DanaFarberCancerInstitute-HarvardMedicalSchool : 203 samples, 12600 genes, 5 classes;
LungCancer-BrighamAndWomenHospital-HarvardMedicalSchool : 181 samples, 12533 genes, 2 classes;
LungCancer-Michigan : 86+10 samples, 7129 genes, 2 classes;
LungCancer-Ontario : 39 samples, 2880 genes, 2 classes;

Ovarian Cancer
OvarianCancer-NCI-PBSII-061902: 91+162 samples, 15154 M/Z identities, 2 classes;
OvarianCancer-NCI-QStar: 216 samples, 373401 features, 2 classes;

Prostate Cancer
(a) 52+50+25+9 samples, 12600 genes, 2 classes;
(b) 21 samples, 2 classes;

Genomic Sequences
Translation Initiation Site Prediction: 3312 sequences, 927 features, 2 classes;
Polyadenylation Signal Prediction: 2327 (training) + 982 (testing), 168 features, 2 classes;
