SJUT Cerebellum mRNA M430 (Mar05) RMA

Download datasets and supplementary data files

Summary

This March 2005 data freeze provides estimates of mRNA expression in adult cerebellum of 48 lines of mice including 45 BXD recombinant inbred strains, C57BL/6J, DBA/2J, and F1 hybrids. Data were generated by a consortium of investigators at St. Jude Children's Research Hospital (SJ) and the University of Tennessee Health Science Center (UT). Cerebellar samples were hybridized in small pools (n = 3) to Affymetrix M430A and B arrays. This particular data set was processed using the Microarray Suite 5 (MAS 5) protocol (see http://www.affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf). To simplify comparisons among transforms, MAS 5 values of each array were adjusted to an average of 8 units and a standard deviation of 2 units.

About cases

We have exploited a set of BXD recombinant inbred strains. All BXD lines are derived from crosses between C57BL/6J (B6 or B) and DBA/2J (D2 or D). Both B and D parental strains have been almost fully sequenced (8x coverage for B6 by a public consortium and approximately 1.5x coverage for D by Celera Discovery Systems), and data for 1.75 million B vs D SNPs are incorporated into WebQTL's genetic maps for the BXDs. BXD2 through BXD32 were produced by Benjamin A. Taylor starting in the late 1970s. BXD33 through BXD42 were also produced by Taylor, but they were generated in the 1990s. These strains are all available from The Jackson Laboratory, Bar Harbor, Maine. BXD43 through BXD99 were produced by Lu Lu, Jeremy Peirce, Lee M. Silver, and Robert W. Williams in the late 1990s and early 2000s using advanced intercross progeny (Peirce et al. 2004).

 

Most BXD animals were generated in-house at the University of Tennessee Health Science Center by Lu Lu and Robert Williams using stock obtained from The Jackson Laboratory between 1999 and 2004. All BXD strains with numbers above 42 are new advanced intercross type BXDs (Peirce et al. 2004) that are currently available from UTHSC. Additional cases were provided by Glenn Rosen, John Mountz, and Hui-Chen Hsu. These cases were bred either at The Jackson Laboratory (GR) or at the University of Alabama (JM and HCH).

About tissue

The March 2005 data set consists of a total of 102 array pairs (Affymetrix 430A and 430B) from 49 different genotypes. Each sample consists of whole cerebellum taken from three adult animals of the same age and sex. Two sets of technical replicates (BXD14 n = 2; BXD29 n = 3) were combined before generating group means, giving a total of 101 biologically independent data sets. The two reciprocal F1s (D2B6F1 and B6D2F1) were combined to give a single F1 mean estimate of gene expression. The 430A and 430B arrays were processed in three large batches. The first batch (May03 data) consists of 17 samples from 17 strains balanced by sex (8M and 9F). The second batch consists of 29 samples, and includes biological replicates, 2 technical replicates, and data for 9 new strains. The third batch consists of 56 samples, and also includes biological replicates, 2 technical replicates, and data for 15 additional strains.
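The order of averaging described above (technical replicates first, then biological samples, with the reciprocal F1s pooled) can be sketched as follows. This is a minimal illustration, not the consortium's code: the expression values are invented, the "F1" label is a hypothetical name for the pooled reciprocal crosses, and arrays are assumed to be technical replicates when they share a sample name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for one probe set: (strain, sample name, expression value).
# The expression values are invented for illustration only.
records = [
    ("BXD14", "R0794C", 8.2),   # technical replicate 1
    ("BXD14", "R0794C", 8.4),   # technical replicate 2 (same sample name)
    ("BXD14", "R0758C", 7.9),   # independent biological sample
    ("B6D2F1", "R1115C", 9.0),
    ("D2B6F1", "R1067C", 9.4),
]

# Assumption: reciprocal F1 crosses are pooled under one hypothetical label.
F1_ALIASES = {"B6D2F1": "F1", "D2B6F1": "F1"}

def strain_means(rows):
    """Average technical replicates first, then average biological samples."""
    by_sample = defaultdict(list)
    for strain, sample, value in rows:
        by_sample[(F1_ALIASES.get(strain, strain), sample)].append(value)

    by_strain = defaultdict(list)
    for (strain, _sample), values in by_sample.items():
        by_strain[strain].append(mean(values))  # one value per biological sample

    return {strain: mean(values) for strain, values in by_strain.items()}

means = strain_means(records)
```

Averaging replicates first keeps a sample that was hybridized several times from counting more heavily than a sample hybridized once.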

Replication and Sample Balance: Our goal is to obtain data for independent biological sample pools from both sexes for each strain. Eight of 48 genotypes are still represented by single samples: BXD5, BXD13, BXD20, BXD23, and BXD27 are represented by females only, whereas BXD25, BXD77, and BXD90 are represented by males only. Ten strains are represented by three independent samples with the following breakdown by sex: C57BL/6J (1F 2M), DBA/2J (2F 1M), B6D2F1 (1F 2M), BXD2 (2F 1M), BXD11 (2F 1M), BXD28 (2F 1M), BXD40 (2F 1M), BXD51 (1F 2M), BXD60 (1F 2M), BXD92 (2F 1M).

The age range of samples is relatively narrow. Only 18 samples were taken from animals older than 99 days, and only two samples are from animals older than 7 months of age. BXD11 includes an extra (third) 441-day-old female sample and BXD28 includes an extra 427-day-old sample.

RNA was extracted at UTHSC by Lu Lu, Zhiping Jia, and Hongtao Zhai.

All samples were subsequently processed at the Hartwell Center Affymetrix laboratory at SJCRH by Jay Morris.

The table below summarizes information on strain, sex, age, sample name, and batch number.
Id   Strain    Sex  Age (days)  SampleName  BatchID  Source
1    C57BL/6J  F    116         R0773C      2        UAB
2    C57BL/6J  M    109         R0054C      1        JAX
3    C57BL/6J  M    71          R1450C      3        UTM DG
4    DBA/2J    F    71          R0175C      1        UAB
5    DBA/2J    F    91          R0782C      2        UAB
6    DBA/2J    M    62          R1121C      3        UTM RW
7    B6D2F1    F    60          R1115C      3        UTM RW
8    B6D2F1    M    94          R0347C      1        JAX
9    B6D2F1    M    127         R0766C      2        UTM JB
10   D2B6F1    F    57          R1067C      3        UTM RW
11   D2B6F1    M    60          R1387C      3        UTM RW
12   BXD1      F    57          R0813C      2        UAB
13   BXD1      M    181         R1151C      3        UTM JB
14   BXD2      F    142         R0751C      1        UAB
15   BXD2      F    78          R0774C      2        UAB
16   BXD2      M    61          R1503C      3        HarvardU GR
17   BXD5      F    56          R0802C      2        UMemphis
18   BXD6      F    92          R0719C      1        UMemphis
19   BXD6      M    92          R0720C      3        UMemphis
20   BXD8      F    72          R0173C      1        UAB
21   BXD8      M    59          R1484C      3        HarvardU GR
22   BXD9      F    86          R0736C      3        UMemphis
23   BXD9      M    86          R0737C      1        UMemphis
24   BXD11     F    441         R0200C      1        UAB
25   BXD11     F    97          R0791C      3        UAB
26   BXD11     M    92          R0790C      2        UMemphis
27   BXD12     F    130         R0776C      2        UAB
28   BXD12     M    64          R0756C      2        UMemphis
29   BXD13     F    86          R1144C      3        UMemphis
30   BXD14     F    190         R0794C      2        UAB
31   BXD14     F    190         R0794C      3        UAB
32   BXD14     M    91          R0758C      2        UMemphis
33   BXD14     M    65          R1130C      3        UTM RW
34   BXD15     F    60          R1491C      3        HarvardU GR
35   BXD15     M    61          R1499C      3        HarvardU GR
36   BXD16     F    163         R0750C      1        UAB
37   BXD16     M    61          R1572C      3        HarvardU GR
38   BXD19     F    61          R0772C      2        UAB
39   BXD19     M    157         R1230C      3        UTM JB
40   BXD20     F    59          R1488C      3        HarvardU GR
41   BXD21     F    116         R0711C      1        UAB
42   BXD21     M    64          R0803C      2        UMemphis
43   BXD22     F    65          R0174C      1        UAB
44   BXD22     M    59          R1489C      3        HarvardU GR
45   BXD23     F    88          R0814C      2        UAB
46   BXD24     F    71          R0805C      2        UMemphis
47   BXD24     M    71          R0759C      2        UMemphis
48   BXD25     M    90          R0429C      1        UTM RW
49   BXD27     F    60          R1496C      3        HarvardU GR
50   BXD28     F    113         R0785C      2        UTM RW
51   BXD28     M    79          R0739C      3        UMemphis
52   BXD29     F    82          R0777C      2        UAB
53   BXD29     M    76          R0714C      1        UMemphis
54   BXD29     M    76          R0714C      2        UMemphis
55   BXD29     M    76          R0714C      3        UMemphis
56   BXD31     F    142         R0816C      2        UAB
57   BXD31     M    61          R1142C      3        UTM RW
58   BXD32     F    62          R0778C      2        UAB
59   BXD32     M    218         R0786C      2        UAB
60   BXD33     F    184         R0793C      2        UAB
61   BXD33     M    124         R0715C      1        UAB
62   BXD34     F    56          R0725C      1        UMemphis
63   BXD34     M    91          R0789C      2        UMemphis
64   BXD36     F    64          R1667C      3        UTM RW
65   BXD36     M    61          R1212C      3        UMemphis
66   BXD38     F    55          R0781C      2        UAB
67   BXD38     M    65          R0761C      2        UMemphis
68   BXD39     F    59          R1490C      3        HarvardU GR
69   BXD39     M    165         R0723C      1        UAB
70   BXD40     F    56          R0718C      2        UMemphis
71   BXD40     M    73          R0812C      2        UMemphis
72   BXD42     F    100         R0799C      2        UAB
73   BXD42     M    97          R0709C      1        UMemphis
74   BXD43     F    61          R1200C      3        UTM RW
75   BXD43     M    63          R1182C      3        UTM RW
76   BXD44     F    61          R1188C      3        UTM RW
77   BXD44     M    58          R1073C      3        UTM RW
78   BXD45     F    63          R1404C      3        UTM RW
79   BXD45     M    93          R1506C      3        UTM RW
80   BXD48     F    64          R1158C      3        UTM RW
81   BXD48     M    65          R1165C      3        UTM RW
82   BXD51     F    66          R1666C      3        UTM RW
83   BXD51     M    62          R1180C      3        UTM RW
84   BXD51     M    79          R1671C      3        UTM RW
85   BXD60     F    64          R1160C      3        UTM RW
86   BXD60     M    61          R1103C      3        UTM RW
87   BXD60     M    99          R1669C      3        UTM RW
88   BXD62     M    61          R1149C      3        UTM RW
89   BXD62     M    60          R1668C      3        UTM RW
90   BXD69     F    60          R1440C      3        UTM RW
91   BXD69     M    64          R1197C      3        UTM RW
92   BXD73     F    60          R1276C      3        UTM RW
93   BXD73     M    77          R1665C      3        UTM RW
94   BXD77     M    62          R1424C      3        UTM RW
95   BXD85     F    79          R1486C      3        UTM RW
96   BXD85     M    79          R1487C      3        UTM RW
97   BXD86     F    58          R1408C      3        UTM RW
98   BXD86     M    58          R1412C      3        UTM RW
99   BXD90     M    74          R1664C      3        UTM RW
100  BXD92     F    62          R1391C      3        UTM RW
101  BXD92     F    63          R1670C      3        UTM RW
102  BXD92     M    59          R1308C      3        UTM RW

About platform

Affymetrix Mouse Genome 430A and B array pairs: The 430A and B array pairs consist of 992,936 25-nucleotide probes that collectively estimate the expression of approximately 39,000 transcripts. The array sequences were selected late in 2002 using Unigene Build 107. The arrays nominally contain the same probe sequences as the 430 2.0 series. However, we have found that roughly 75,000 probes differ between the A and B arrays and the 430 2.0 array.

About data processing

Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.
  • Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
  • Step 2: We performed a quantile normalization for the log base 2 values for the total set of 104 arrays (all three batches) using the same initial steps used by the RMA transform.
  • Step 3: We computed the Z scores for each cell value.
  • Step 4: We multiplied all Z scores by 2.
  • Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
  • Step 6: We corrected for technical variance introduced by three large batches at the probe level. To do this we determined the ratio of the batch mean to the mean of all three batches and used this as a single multiplicative probe-specific batch correction factor. The consequence of this simple correction is that the mean probe signal value for each of the three batches is the same.
  • Step 7a: The 430A and 430B arrays include a set of 100 shared probe sets (a total of 2200 probes) that have identical sequences. These probes and probe sets provide a way to calibrate expression of the 430A and 430B arrays to a common scale. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction to rescale the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a small offset. The result of this step is that the mean of the 430A expression is fixed at a value of 8, whereas that of the 430B chip is typically reduced to 7. The average of the merged 430A and 430B array data set is approximately 7.5.
  • Step 7b: We recentered the merged 430A and 430B data sets to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5.
  • Step 8: Finally, we computed the arithmetic mean of the values for the set of microarrays for each strain. Technical replicates were averaged before computing the mean for independent biological samples. Note that we have not (yet) corrected for variance introduced by differences in sex, age, source of animals, or any interaction terms. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file. We eventually hope to add statistical controls and adjustments for these variables.
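Steps 1 through 7a can be sketched in a few lines of NumPy. This is an illustrative reading of the steps above, not the code used to build the data set. The function names, the synthetic input, and the layout (rows are probes, columns are arrays) are all assumptions. Two further assumptions are marked in the comments: ties in the quantile normalization are broken by position, and the "mean of all three batches" in Step 6 is taken as the mean of the three batch means.

```python
import numpy as np

def log2_offset(cells):
    """Step 1: add 1.0 so every value can be logged, then take log base 2."""
    return np.log2(np.asarray(cells, dtype=float) + 1.0)

def quantile_normalize(x):
    """Step 2: give every array (column) the same distribution of values.

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Assumption: ties are broken by position.
    """
    order = np.argsort(x, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    reference = np.sort(x, axis=0).mean(axis=1)
    return reference[ranks]

def rescale_z(x, mean=8.0, sd=2.0):
    """Steps 3-5: Z score each array, multiply by 2, then add 8."""
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    return sd * z + mean

def batch_correct(x, batch):
    """Step 6: one multiplicative, probe-specific factor per batch.

    Assumption: the overall mean is the mean of the per-batch means.
    """
    batch = np.asarray(batch)
    labels = np.unique(batch)
    batch_means = np.stack([x[:, batch == b].mean(axis=1) for b in labels])
    grand = batch_means.mean(axis=0)
    out = x.copy()
    for b, m in zip(labels, batch_means):
        out[:, batch == b] *= (grand / m)[:, None]
    return out

def align_b_to_a(b_values, shared_a, shared_b):
    """Step 7a: regress shared-probe values (A on B), then rescale the B array."""
    slope, intercept = np.polyfit(shared_b, shared_a, deg=1)
    return slope * np.asarray(b_values, dtype=float) + intercept

# Synthetic demonstration: 1000 probes on 6 arrays in 3 hypothetical batches.
rng = np.random.default_rng(0)
cells = rng.gamma(shape=2.0, scale=200.0, size=(1000, 6))
batch = [1, 1, 2, 2, 3, 3]

scaled = rescale_z(quantile_normalize(log2_offset(cells)))
corrected = batch_correct(scaled, batch)
```

After `rescale_z`, every array has a mean of 8 and a standard deviation of 2. After `batch_correct`, each probe has the same mean in all three batches, which is the property Step 6 describes. Step 7b would simply call `rescale_z` again on the merged 430A and 430B values.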
Probe set data: The expression data were processed by Yanhua Qu (UTHSC). Probe set data were generated from the fully normalized CEL files (quantile and batch corrected) using the standard MAS 5 Tukey biweight procedure. A 1-unit difference represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
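For readers unfamiliar with the Tukey biweight, the following is a minimal sketch of a one-step biweight average of the kind MAS 5 uses to summarize the probe values in a probe set. The function name is hypothetical, and the tuning constant c = 5 and the small epsilon of 0.0001 are recalled from the Affymetrix statistical algorithms description rather than taken from this data set, so treat them as assumptions.

```python
import numpy as np

def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight average (c and epsilon are assumed defaults)."""
    x = np.asarray(values, dtype=float)
    center = np.median(x)
    mad = np.median(np.abs(x - center))          # median absolute deviation
    u = (x - center) / (c * mad + epsilon)       # scaled distance from the median
    # Values far from the median get weight 0; values near it get weight near 1.
    w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
    return float(np.sum(w * x) / np.sum(w))

# Invented probe values with one outlier.
probe_values = [7.9, 8.0, 8.1, 8.0, 14.0]
summary = tukey_biweight(probe_values)
```

The outlying value of 14.0 receives zero weight here, so the summary stays near the median of the remaining values instead of being pulled upward as an arithmetic mean would be.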

Acknowledgment

Data were generated with funds contributed by members of the UTHSC-SJCRH Cerebellum Transcriptome Profiling Consortium. Our members include:
  • Tom Curran
  • Dan Goldowitz
  • Kristin Hamre
  • Lu Lu
  • Peter McKinnon
  • Jim Morgan
  • Clayton Naeve
  • Richard Smeyne
  • Robert Williams
  • The Center of Genomics and Bioinformatics at UTHSC
  • The Hartwell Center at SJCRH

Notes

About the chromosome and megabase position values:

The chromosomal locations of probe sets included on the microarrays were determined by BLAT analysis using the Mouse Genome Sequencing Consortium March 2005 Assembly (see http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&org=mouse). We thank Dr. Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis.

This text file was originally generated by RWW and YHQ, March 21, 2005. Updated by RWW, March 23, 2005; RWW April 8.