GETTING EXPRESSION DATA FOR Cheung/Spielman paper replications
--------------------------------------------------------------

On Oct 10 2006, the following command (named ko)

curl -O ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM$1/GSM$1.CEL.gz

was applied for a series of 5 digit accession numbers corresponding to samples identified
as coming from 58 unrelated individuals in the  CEPH

./ko 25349 ./ko 25356 ./ko 25358 ./ko 25360
./ko 25377 ./ko 25385 ./ko 25399 ./ko 25401 ./ko 25409 ./ko 25426 ./ko 25479 ./ko 25481
./ko 25524 ./ko 25526 ./ko 25528 ./ko 25530 ./ko 25540 ./ko 25542 ./ko 25548 ./ko 25550
./ko 25552 ./ko 25561 ./ko 25563 ./ko 25565 ./ko 25568 ./ko 25570 ./ko 25578 ./ko 25580
./ko 25624 ./ko 25626 ./ko 25628 ./ko 25630 ./ko 25632 ./ko 25634 ./ko 25656 ./ko 25658
./ko 25660 ./ko 25662 ./ko 25680 ./ko 25682 ./ko 25684 ./ko 25686 ./ko 48650 ./ko 48651
./ko 48652 ./ko 48653 ./ko 48654 ./ko 48655 ./ko 48656 ./ko 48657 ./ko 48658 ./ko 48659
./ko 48660 ./ko 48661 ./ko 48662 ./ko 48663 ./ko 48664 ./ko 48665

producing the collection

GSM25349.CEL.gz GSM25426.CEL.gz GSM25548.CEL.gz GSM25580.CEL.gz GSM25660.CEL.gz GSM48653.CEL.gz GSM48662.CEL.gz
GSM25356.CEL.gz GSM25479.CEL.gz GSM25550.CEL.gz GSM25624.CEL.gz GSM25662.CEL.gz GSM48654.CEL.gz GSM48663.CEL.gz
GSM25358.CEL.gz GSM25481.CEL.gz GSM25552.CEL.gz GSM25626.CEL.gz GSM25680.CEL.gz GSM48655.CEL.gz GSM48664.CEL.gz
GSM25360.CEL.gz GSM25524.CEL.gz GSM25561.CEL.gz GSM25628.CEL.gz GSM25682.CEL.gz GSM48656.CEL.gz GSM48665.CEL.gz
GSM25377.CEL.gz GSM25526.CEL.gz GSM25563.CEL.gz GSM25630.CEL.gz GSM25684.CEL.gz GSM48657.CEL.gz
GSM25385.CEL.gz GSM25528.CEL.gz GSM25565.CEL.gz GSM25632.CEL.gz GSM25686.CEL.gz GSM48658.CEL.gz
GSM25399.CEL.gz GSM25530.CEL.gz GSM25568.CEL.gz GSM25634.CEL.gz GSM48650.CEL.gz GSM48659.CEL.gz
GSM25401.CEL.gz GSM25540.CEL.gz GSM25570.CEL.gz GSM25656.CEL.gz GSM48651.CEL.gz GSM48660.CEL.gz
GSM25409.CEL.gz GSM25542.CEL.gz GSM25578.CEL.gz GSM25658.CEL.gz GSM48652.CEL.gz GSM48661.CEL.gz

to which RMA is applied, using bioconductor 1.9

Mapping from GSM number to NAxxxxx indexing of CEPH participants is in sample.IDs

"ceph_id" "sample_name" "geoacc"
"1" "1334.1" "12144" "GSM25548"
"2" "1334.1" "12144" "GSM25549"
"3" "1334.11" "12145" "GSM25550"
"4" "1334.11" "12145" "GSM25551"
"5" "1334.12" "12146" "GSM25552"
"6" "1334.12" "12146" "GSM25553"
"7" "1334.13" "12239" "GSM25570"
"8" "1334.13" "12239" "GSM25571"
"9" "1340.09" "06994" "GSM25358"
"10" "1340.09" "06994" "GSM25359"
"11" "1340.1" "07000" "GSM25360"
"12" "1340.1" "07000" "GSM25361"
"13" "1340.11" "07022" "GSM25377"
"14" "1340.11" "07022" "GSM25378"
"15" "1340.12" "07056" "GSM25401"
"16" "1340.12" "07056" "GSM25402"
"17" "1341.11" "07034" "GSM25385"
"18" "1341.11" "07034" "GSM25386"
"19" "1341.12" "07055" "GSM25399"
"20" "1341.12" "07055" "GSM25400"
"21" "1341.13" "06993" "GSM25356"
"22" "1341.13" "06993" "GSM25357"
"23" "1341.14" "06985" "GSM25349"
"24" "1341.14" "06985" "GSM25350"
"25" "1344.13" "12057" "GSM48660"
"26" "1345.12" "07357" "GSM25426"
"27" "1345.12" "07357" "GSM25427"
"28" "1345.13" "07345" "GSM25409"
"29" "1345.13" "07345" "GSM25410"
"30" "1346.11" "12043" "GSM25540"
"31" "1346.11" "12043" "GSM25541"
"32" "1346.12" "12044" "GSM25542"
"33" "1346.12" "12044" "GSM25543"
"34" "1347.14" "11881" "GSM25479"
"35" "1347.14" "11881" "GSM25480"
"36" "1347.15" "11882" "GSM25481"
"37" "1347.15" "11882" "GSM25482"
"38" "1349.13" "11839" "GSM48654"
"39" "1350.1" "11829" "GSM48650"
"40" "1350.11" "11830" "GSM48651"
"41" "1350.12" "11831" "GSM48652"
"42" "1350.13" "11832" "GSM48653"
"43" "1358.11" "12716" "GSM48662"
"44" "1358.12" "12717" "GSM48663"
"45" "1362.13" "11992" "GSM25524"
"46" "1362.13" "11992" "GSM25525"
"47" "1362.14" "11993" "GSM25526"
"48" "1362.14" "11993" "GSM25527"
"49" "1362.15" "11994" "GSM25528"
"50" "1362.15" "11994" "GSM25529"
"51" "1362.16" "11995" "GSM25530"
"52" "1362.16" "11995" "GSM25531"
"53" "1375.12" "12234" "GSM48661"
"54" "1408.1" "12154" "GSM25561"
"55" "1408.1" "12154" "GSM25562"
"56" "1408.11" "12236" "GSM25568"
"57" "1408.11" "12236" "GSM25569"
"58" "1408.12" "12155" "GSM25563"
"59" "1408.12" "12155" "GSM25564"
"60" "1408.13" "12156" "GSM25565"
"61" "1408.13" "12156" "GSM25566"
"62" "1416.11" "12248" "GSM25578"
"63" "1416.11" "12248" "GSM25579"
"64" "1416.12" "12249" "GSM25580"
"65" "1416.12" "12249" "GSM25581"
"66" "1420.09" "12003" "GSM48655"
"67" "1420.1" "12004" "GSM48656"
"68" "1420.11" "12005" "GSM48657"
"69" "1420.12" "12006" "GSM48658"
"70" "1444.13" "12750" "GSM25624"
"71" "1444.13" "12750" "GSM25625"
"72" "1444.14" "12751" "GSM25626"
"73" "1444.14" "12751" "GSM25627"
"74" "1447.09" "12760" "GSM25628"
"75" "1447.09" "12760" "GSM25629"
"76" "1447.1" "12761" "GSM25630"
"77" "1447.1" "12761" "GSM25631"
"78" "1447.11" "12762" "GSM25632"
"79" "1447.11" "12762" "GSM25633"
"80" "1447.12" "12763" "GSM25634"
"81" "1447.12" "12763" "GSM25635"
"82" "1454.12" "12812" "GSM25656"
"83" "1454.12" "12812" "GSM25657"
"84" "1454.13" "12813" "GSM25658"
"85" "1454.13" "12813" "GSM25659"
"86" "1454.14" "12814" "GSM25660"
"87" "1454.14" "12814" "GSM25661"
"88" "1454.15" "12815" "GSM25662"
"89" "1454.15" "12815" "GSM25663"
"90" "1459.09" "12872" "GSM25680"
"91" "1459.09" "12872" "GSM25681"
"92" "1459.1" "12873" "GSM25682"
"93" "1459.1" "12873" "GSM25683"
"94" "1459.11" "12874" "GSM25684"
"95" "1459.11" "12874" "GSM25685"
"96" "1459.12" "12875" "GSM25686"
"97" "1459.12" "12875" "GSM25687"
"98" "1463.15" "12891" "GSM48664"
"99" "1463.16" "12892" "GSM48665"
"100" NA "12056" "GSM48659"

The resulting expression matrix needs to be connected to the genotypes, using HMworkflow
in GGtools

GETTING GENOTYPE DATA
---------------------

see getall.sh, run on Sep 4 2006


wget http://www.hapmap.org/genotypes/latest/fwd_strand/non-redundant/genotypes_chr1_CEU_r21_nr_fwd.txt.gz
wget http://www.hapmap.org/genotypes/latest/fwd_strand/non-redundant/genotypes_chr2_CEU_r21_nr_fwd.txt.gz

...
