UCSD CFW Striatum (Jan17) RNA-Seq Log2 Z-score



Although mice are the most widely used mammalian model organism, genetic studies have suffered from limited mapping resolution due to extensive linkage disequilibrium (LD) that is characteristic of crosses among inbred strains. Carworth Farms White (CFW) mice are a commercially available outbred mouse population that exhibits rapid LD decay in comparison to other available mouse populations. We performed a genome-wide association study (GWAS) of behavioral, physiological and gene expression phenotypes using 1,200 male CFW mice. We used genotyping by sequencing (GBS) to obtain genotypes at 92,734 SNPs. We also measured gene expression using RNA sequencing in three brain regions. Our study identified numerous behavioral, physiological and expression quantitative trait loci (QTLs). We integrated the behavioral QTL and eQTL results to implicate specific genes, including Azi2 in sensitivity to methamphetamine and Zmynd11 in anxiety-like behavior. The combination of CFW mice, GBS and RNA sequencing constitutes a powerful approach to GWAS in mice. Full article available here.

About cases

Phenotype, genotype and RNA-seq gene expression data are available at


About data processing

FPKM data were quantile normalized for each ENSEMBL gene (mRNA) model (identifiers such as ENSMUSG00000093778). Every gene/mRNA therefore has a mean expression value of 0 and a perfectly normal distribution, even if the original distribution was bimodal. The QQ plots in GeneNetwork (GN) will be straight lines. The original data set downloaded from http://dx.doi.org/10.5061/dryad.2rs41 contains no information at all on expression level.
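To illustrate why per-gene quantile normalization erases both the expression level and the shape of the original distribution, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the function name `quantile_normalize_gene`, the `(rank + 0.5) / n` quantile offset, the tie handling and the example FPKM values are all assumptions made for illustration.

```python
from statistics import NormalDist, mean

def quantile_normalize_gene(values):
    """Map one gene's expression values onto standard-normal quantiles by rank."""
    n = len(values)
    # Sort sample indices by expression value.
    # Assumption: ties are broken by input order.
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    normalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Assumption: the (rank + 0.5) / n offset keeps each
        # probability strictly inside (0, 1).
        normalized[idx] = std_normal.inv_cdf((rank + 0.5) / n)
    return normalized

# Hypothetical, clearly bimodal FPKM values for one gene in eight samples.
fpkm = [0.1, 0.2, 0.15, 9.8, 10.1, 0.05, 10.4, 9.9]
z = quantile_normalize_gene(fpkm)

print([round(v, 3) for v in z])
print("mean after normalization:", round(mean(z), 6))
```

Only the rank order of the samples survives the transformation. The gap between the two modes and the absolute FPKM level are both gone, which is why every gene yields a straight QQ plot.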


A.A.P. conceived the study. C.C.P. and A.A.P. supervised the project. S.G. and P.C. designed and implemented the statistical and bioinformatics analyses with contributions from C.C.P., J.K.P. and A.A.P. N.M.G. designed and executed the RNA-seq and GBS protocols with assistance from E.A. and J.D. C.C.P. performed the behavioral phenotyping with assistance from E.L. and Y.J.P. A.L. performed the muscle and bone phenotyping with input from D.A.B. C.L.A.-B. performed the BMD phenotyping. C.C.P., S.G., P.C. and A.A.P. wrote the manuscript, with input from all co-authors.


The authors wish to acknowledge technical assistance from: D. Godfrey, S. Lionikaite, V. Lionikaite, A.S. Lionikiene and J. Zekos as well as technical and intellectual input from M. Abney, J. Borevitz, K. Broman, N. Cai, R. Cheng, N. Cox, R. Davies, J. Flint, L. Goodstadt, P. Grabowski, B. Harr, E. Leffler, R. Mott, J. Nicod, J. Novembre, A. Price, M. Stephens, D. Weeks and X. Zhou. This project was funded by NIH R01GM097737 and P50DA037844 (A.A.P.), NIH T32DA07255 (C.C.P.), NIH T32GM07197 (N.M.G.), NIH R01AR056280 (D.A.B.), NIH R01AR060234 (C.L.A.-B.), the Fellowship from the Human Frontiers Science Program (P.C.) and the Howard Hughes Medical Institute (J.K.P.).

Specifics of this data set

RNA-Seq Log2 Z-score

In general, the array data that we put in GeneNetwork have been log-transformed and then z-normalized, but instead of leaving the mean at 0 and the standard deviation at 1 unit, we shift the mean up to 8 units and increase the spread to a standard deviation of 2 units (what we call 2Z + 8 normalized data). This removes negative values from the tables.
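The 2Z + 8 rescaling described above can be sketched in a few lines of Python. This is an illustration only, not GeneNetwork's actual code: the function name `two_z_plus_eight`, the use of the population standard deviation, and the example input values are assumptions. The text also does not specify over which set of values (one array, one sample, or one gene) the mean and standard deviation are computed, so the sketch takes a single list.

```python
from math import log2
from statistics import mean, pstdev

def two_z_plus_eight(raw_values):
    """Log2-transform, z-score, then rescale to mean 8 and SD 2."""
    logged = [log2(v) for v in raw_values]
    mu = mean(logged)
    # Assumption: population SD; the source does not say which estimator is used.
    sigma = pstdev(logged)
    return [2.0 * (x - mu) / sigma + 8.0 for x in logged]

# Hypothetical raw intensity values (must be positive for log2).
raw = [120.0, 340.0, 85.0, 910.0, 260.0]
rescaled = two_z_plus_eight(raw)

print([round(v, 3) for v in rescaled])
print("mean:", round(mean(rescaled), 6), "SD:", round(pstdev(rescaled), 6))
```

With a mean of 8 and a standard deviation of 2, a value would have to lie more than four standard deviations below the mean to come out negative, so in practice the tables contain positive values.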