Brain, Aging: AD, Normal Gene Expression with Genotypes (Myers)

Alzheimer's disease Cases and Controls Myers (April 2009):

Expression quantitative trait loci (eQTL) study using 363 human cortical brain samples. Genotyping was performed on the Affymetrix 500K chip and expression was measured on the Illumina HumanRef-8 chip. Genotypes are available at dbGaP.

Please cite: Webster JA, Gibbs JR, Clarke J, Ray M, Zhang W, Holmans P, Rohrer K, Zhao A, Marlowe L, Kaleem M, McCorquodale DS 3rd, Cuello C, Leung D, Bryden L, Nath P, Zismann VL, Joshipura K, Huentelman MJ, Hu-Lince D, Coon KD, Craig DW, Pearson JV; NACC-Neuropathology Group, Heward CB, Reiman EM, Stephan D, Hardy J, Myers AJ (2009) Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet 84:445-58.

Summary from GEO: Myers and colleagues generated massive neocortical transcriptome data sets for a set of unrelated elderly neurologically and neuropathologically normal humans and from confirmed late onset Alzheimer's disease patients (LOAD, n = 187 normal and 176 LOAD cases, see DOI:10.1016/j.ajhg.2009.03.011 for detail). They used an Illumina Sentrix Bead array (HumanRef-8) that measures expression of approximately 19,730 curated RefSeq sequences (Human Build 34).

Case identifiers: All case identifiers (IDs) in GeneNetwork begin with a capital C followed by a six-digit GEO identifier, then the sex and the age in years. Non-Alzheimer cases end with the suffix letter N (for example, C225652M85N). Alzheimer cases end with the suffix letter A (for example, C388217F97A).
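
The ID layout described above can be split into its parts with a short Python sketch. This is an illustration only, not GeneNetwork code: the names CaseId and parse_case_id are hypothetical, and the sketch assumes sex is always coded as M or F and that age is a whole number of years, as in the two example IDs.

```python
import re
from typing import NamedTuple


class CaseId(NamedTuple):
    """Parts of a GeneNetwork case ID (hypothetical helper type)."""
    geo_id: str         # six-digit GEO identifier
    sex: str            # "M" or "F" (assumed coding)
    age_years: int      # age in years
    is_alzheimer: bool  # True for suffix A, False for suffix N


# C + six digits + sex + age + status suffix, e.g. C225652M85N
_CASE_ID = re.compile(r"C(\d{6})([MF])(\d+)([NA])")


def parse_case_id(case_id: str) -> CaseId:
    """Split a case ID into GEO identifier, sex, age, and disease status."""
    match = _CASE_ID.fullmatch(case_id)
    if match is None:
        raise ValueError(f"unrecognized case ID: {case_id!r}")
    geo_id, sex, age, status = match.groups()
    return CaseId(geo_id, sex, int(age), status == "A")


if __name__ == "__main__":
    print(parse_case_id("C225652M85N"))
    print(parse_case_id("C388217F97A"))
```

Keeping the GEO identifier as a string avoids losing any leading zeros it might contain.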

Data were initially downloaded from the NCBI GEO archive under the experiment ID GSE15222. All data were generated using the Illumina HumanRef-8 expression BeadChip (GPL2700) v2 Rev0. This data set in GeneNetwork includes data for 24,354 probes. We have realigned the 50-mer sequences by BLAT to the latest version of the human genome (Feb 2009, hg19) and reannotated the array (August 2009). The annotation in GN will differ from that provided in GEO for this platform. We were unable to obtain 50-mer sequences for several thousand probes (e.g., HTT), and these probes have therefore not been realigned to the human genome.

The GEO data set was processed by Myers and colleagues using Illumina's Rank Invariant transform. We performed a series of QC and renormalization steps on the data to allow easier comparison with other data sets in GeneNetwork. In brief, the data were log2 transformed, and each array was then recentered to a mean expression of 8 units and a standard deviation of 2 units (2z + 8 transform). The values are therefore modified z scores, and each unit represents roughly a two-fold difference in expression. Average expression across all 363 cases ranges from a low of 6 units (e.g., SYT15) to a high of 19 units for ARSK. APOE has an average expression of 15 units and APP has an average expression of 11.5 units. The distribution is far from normal, with a great excess of measurements of genes with low to moderate expression clustered between 6.5 and 8.5 units.
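
The 2z + 8 transform for a single array can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual GeneNetwork pipeline: the function name two_z_plus_eight is hypothetical, the input is assumed to be strictly positive intensities for one array, and the sample standard deviation is used (the text does not say whether the sample or population form was applied).

```python
import math
import statistics


def two_z_plus_eight(raw_intensities):
    """Apply a 2z + 8 transform to one array of positive intensities.

    Steps: log2 transform, convert to z scores within the array,
    then rescale so the array has mean 8 and standard deviation 2.
    """
    log_values = [math.log2(v) for v in raw_intensities]
    mean = statistics.fmean(log_values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(log_values)
    return [2.0 * (v - mean) / sd + 8.0 for v in log_values]


if __name__ == "__main__":
    # Made-up intensities for illustration only, not values from GSE15222.
    example = [120.0, 250.0, 900.0, 4000.0, 15000.0]
    print([round(v, 2) for v in two_z_plus_eight(example)])
```

After the transform every array has the same mean and spread, which is what makes values comparable across arrays and across data sets.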