UTHSC Brain mRNA U74Av2 (May03) MAS5



This May 2003 freeze provides estimates of mRNA expression in brains of BXD recombinant inbred mice measured using Affymetrix U74Av2 microarrays. New users are encouraged to use one of the more recent data sets (December 2003 or March 2004). Data were generated at the University of Tennessee Health Science Center (UTHSC). Over 300 brain samples from 33 strains were hybridized in small pools (n=3) to 97 arrays. Data were processed using the Microarray Suite 5 (MAS 5) protocol of Affymetrix. To simplify comparison between transforms, MAS 5 values of each array were adjusted to a mean of 8 units and a standard deviation of 2 units (a variance of 4). In general, these MAS 5 transforms do not perform as well as RMA, PDNN, or the new heritability-weighted transforms (HW1PM).

About cases

The set of animals used for mapping (a mapping panel) consists of 30 groups of genetically uniform mice of the BXD type. The parental strains are C57BL/6J (B6 or B) and DBA/2J (D2 or D). The first generation hybrid is labeled F1. The F1 hybrids were made by crossing B6 females to D2 males. All other lines are recombinant inbred strains derived from C57BL/6J and DBA/2J crosses. BXD1 through BXD32 were produced by Dr. Benjamin Taylor starting in the late 1970s. BXD33 through BXD42 were also produced by Dr. Taylor, but they were generated in the 1990s. Lines BXD67 and BXD68 are two partially inbred advanced recombinant strains (F8 and F9) that are part of a large set of BXD-Advanced strains being produced by Drs. Robert Williams, Lu Lu, Lee Silver, and Jeremy Peirce. There will eventually be ~45 of these strains. For additional background on recombinant inbred strains, please see http://www.nervenet.org/papers/bxn.html.
The table below lists the arrays by strain, sex, and age. Each array was hybridized to a pool of mRNA from 3 mice.




Strain    8 Wks    20 Wks    52 Wks    Strain    8 Wks    20 Wks    52 Wks

C57BL/6J (B6) ♂♂♂ DBA/2J (D2) ♂♂♀  
B6D2F1 (F1) ♀ ♀   BXD1 ♀♀  
BXD2 BXD5 ♂♀    
BXD6 BXD8 ♂♀  
BXD9 BXD11 ♀♀  
BXD12   ♂♀ BXD13    
BXD14   ♀♀ BXD15  
BXD16 ♀♀   BXD18
BXD19 BXD21 ♂♂  
BXD22 ♀♀   BXD24 ♀♀  
BXD25 ♀♀ ♀♀   BXD27     ♀♀
BXD28 BXD29  
BXD31 ♀♀ ♀♀   BXD32 ♂♀
BXD33 ♂♀   BXD34 ♂♀  
BXD39 ♂♀   BXD40 ♂♂♀♀    
BXD42 ♂♂ ♀     BXD67    
BXD68 (F9) ♀ ♀            

About tissue

Most expression data are averages based on three microarrays (U74Av2). Each individual array experiment involved a pool of brain tissue (forebrain plus the midbrain, but without the olfactory bulb) that was taken from three adult animals usually of the same age. A total of 97 arrays were used: 74 were female pools and 23 were male pools. Animals ranged in age from 56 to 441 days, usually with a balanced design (one pool at 8 weeks, one pool at ~20 weeks, one pool at approximately 1 year).

About platform

About the array probe set names:

Most probe sets on the U74Av2 array consist of a total of 32 probes, divided into 16 perfect match probes and 16 mismatch controls. Each set of these 25-nucleotide-long probes has an identifier code that includes a unique number, an underscore character, and several suffix characters that highlight design features. The most common probe set suffix is at. This code indicates that the probes should hybridize relatively selectively with the complementary anti-sense target (i.e., the complementary RNA) produced from a single gene. Other codes include:

  • f_at (sequence family): Some probes in this probe set will hybridize to identical and/or slightly different sequences of related gene transcripts.
  • s_at (similarity constraint): All probes in this probe set target common sequences found in transcripts from several genes.
  • g_at (common groups): Some probes in this set target identical sequences in multiple genes and some target unique sequences in the intended target gene.
  • r_at (rules dropped): Probe sets for which it was not possible to pick a full set of unique probes using the Affymetrix probe selection rules. Probes were picked after dropping some of the selection rules.
  • i_at (incomplete): Designates probe sets for which there are fewer than the standard number of unique probes specified in the design (16 perfect match for the U74Av2).
  • st (sense target): Designates a sense target; almost always generated in error.

Descriptions for the probe set extensions were taken from the Affymetrix GeneChip Expression Analysis Fundamentals.
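
The suffix conventions above can be applied mechanically to a probe set identifier. The following is a minimal Python sketch, not part of the original data pipeline. The function name classify_probe_set and the identifiers in the usage loop are hypothetical and chosen only for illustration; identifiers that do not end in one of the listed suffixes (for example, some control probe sets) are returned unclassified.

```python
# Suffix codes and meanings as listed above. Longer suffixes come first so
# that "f_at" is tested before the plain "at" that it ends with.
SUFFIXES = [
    ("f_at", "sequence family"),
    ("s_at", "similarity constraint"),
    ("g_at", "common groups"),
    ("r_at", "rules dropped"),
    ("i_at", "incomplete"),
    ("at", "anti-sense target"),
    ("st", "sense target"),
]


def classify_probe_set(probe_set_id):
    """Split an identifier into (prefix, suffix, meaning).

    Returns (probe_set_id, None, None) when no listed suffix matches.
    """
    for suffix, meaning in SUFFIXES:
        tail = "_" + suffix  # the underscore separates the number from the suffix
        if probe_set_id.endswith(tail):
            return probe_set_id[: -len(tail)], suffix, meaning
    return probe_set_id, None, None


# Hypothetical identifiers, for illustration only (not real U74Av2 probe sets).
for example_id in ("12345_at", "12345_f_at", "12345_st"):
    print(classify_probe_set(example_id))
```

Matching from the end of the identifier, rather than splitting at the first underscore, keeps the sketch usable for identifiers whose prefix itself contains underscores.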

About data processing

Probe (cell) level data from the CEL file: These CEL values produced by MAS 5 are the 75% quantiles from a set of 36 pixel values per cell (the pixel with the 12th highest value represents the whole cell).
  • Step 1: We added an offset of 1.0 to the CEL expression values for each cell to ensure that all values could be logged without generating negative values.
  • Step 2: We took the log2 of each cell.
  • Step 3: We computed the Z score for each cell.
  • Step 4: We multiplied all Z scores by 2.
  • Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
  • Step 6: We computed the arithmetic mean of the values for the set of microarrays for each of the individual strains. We have not corrected for variance introduced by sex, age, or a sex-by-age interaction. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file.
Probe set data from the .TXT file: These .TXT files were generated using MAS 5. The same simple steps described above were also applied to these values. Every microarray data set therefore has a mean expression of 8 with a standard deviation of 2. A 1-unit difference therefore represents roughly a 2-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
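
The transformation steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used to generate this data set. The function names, the made-up intensity values, and the use of the population standard deviation are assumptions; the text does not state which standard deviation estimator was used.

```python
import math
import statistics


def modified_z_scores(raw_values, offset=1.0, scale=2.0, center=8.0):
    """Apply Steps 1-5 to the raw values of a single array."""
    # Steps 1 and 2: add the offset, then take log2.
    logged = [math.log2(v + offset) for v in raw_values]
    mean = statistics.fmean(logged)
    # Assumption: population SD. Fails if all values on the array are identical.
    sd = statistics.pstdev(logged)
    # Steps 3-5: Z score, multiply by 2, add 8.
    return [center + scale * (x - mean) / sd for x in logged]


def strain_means(arrays_by_strain):
    """Step 6: arithmetic mean across the arrays of each strain, value by value."""
    return {
        strain: [statistics.fmean(values) for values in zip(*arrays)]
        for strain, arrays in arrays_by_strain.items()
    }


# Made-up raw intensities (not real data): two arrays for one strain, one for another.
raw = {
    "BXD1": [[120.0, 850.0, 40.0, 3000.0], [100.0, 900.0, 55.0, 2800.0]],
    "BXD2": [[130.0, 400.0, 35.0, 3100.0]],
}
transformed = {
    strain: [modified_z_scores(array) for array in arrays]
    for strain, arrays in raw.items()
}
print(strain_means(transformed))
```

Each transformed array has a mean of 8 and a standard deviation of 2 by construction. The strain means do not in general keep those exact values, because averaging across arrays reduces the spread.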

About the chromosome and megabase position values:

The chromosomal locations of probe sets and gene markers were initially determined by BLAT analysis using the Mouse Genome Sequencing Consortium OCT 2003 Assembly (see http://genome.ucsc.edu/). We thank Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis.


Data were generated with funds to RWW from the Dunavant Chair of Excellence, University of Tennessee Health Science Center, Department of Pediatrics. The majority of arrays were processed at Genome Explorations by Divyen Patel. We thank Guomin Zhou for generating advanced intercross stock used to produce most of the new BXD RI strains.


Information about this text file:

This text file was originally generated by RWW, EJC, and YHQ, May 2003. Updated by RWW, October 30, 2004.