xbat              package:GeneticsBase              R Documentation

_S_i_m_u_l_a_t_e_d _p_e_d_i_g_r_e_e _w_i_t_h _g_e_n_o_t_y_p_e_s _a_n_d _c_o_v_a_r_i_a_t_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Simulated dataset comprising a pedigree of 1000 trios genotyped at
     50 SNPs, together with 8 quantitative traits, 2 binary traits, and
     8 covariates.

_U_s_a_g_e:

     data(xbat)

_F_o_r_m_a_t:

     'geneSet' object

_D_e_t_a_i_l_s:

     This data set is the 'xbat' example from Lange and Kraft's "Short
     Course: Genetics Association Analysis."  It is described there
     as:

     "[This] simulated dataset comprises a pedigree file with genotype
     information for 1000 trios with 50 SNPs and a phenotype file that
     contains 8 quantitative traits, 2 binary traits, and 8 covariates.

     "Genotypes

     "The simulation generated complete genotype data for 1000 families
     with two parents and one offspring.  The single nucleotide
     polymorphism (SNP) frequencies and haplotype blocks were estimated
     using real data.  These estimates were fixed and used as
     parameters for the simulation of the parental genotypes. 
     Offspring genotypes were generated by simulating random Mendelian
     transmission from their respective parents.  In total, 50 SNPs
     were simulated, 28 of which lie in 1 of 5 variable length
     haplotype blocks (range: 4 to 10 SNPs per block).  The blocks were
     simulated as a function of haplotype block frequency, assuming no
     recombination, resulting in varying degrees of linkage
     disequilibrium within each block.  The remaining 22 SNPs that are
     not in a haplotype block were simulated randomly as a function of
     SNP frequency.  The SNPs are indicated in the header line of the
     pedigree file, and named SNP1, SNP2, .., and SNP50.  Note that the
     affectation status variable in the pedigree file is coded as
     missing (0) for all individuals.  All phenotype data comes from
     the phenotype file (see below).

     "Phenotypes

     "Overall, 10 phenotypes (Y) were simulated additively as function
     of the genetic effect size a, marker score X, covariate effect
     size b, and covariate value Z as follows:

     Y[i] = a[i] X[i] + b[i] Z[i]   (i = 1, 2,.., 10)

     "Quantitative Traits

     "Eight quantitative phenotypes were simulated from a random sample
     from a normal distribution: Y ~ N(aX + bZ, s^2), where a is the
     additive effect for the phenotype and s^2 is the variance.  We
     measure the strength of the additive effect relative to the
     phenotypic variance by the heritability h^2 [Falconer and Mackay,
     1997], which is the proportion of phenotypic variation explained
     by genetic variation.  We assume that the environment variance is
     1.  SNP23 was simulated as the "disease SNP" which is the 5th SNP
     in a 10 SNP haplotype block.  The heritabilities were simulated
     from random uniform distribution ranging from -0.1 to 0.1.  In
     addition, the simulation produced two correlated quantitative
     traits (QTL9 and QTL10; r^2 = 0.40).  The quantitative traits are
     indicated in the header line of the phenotype file and named QTL1,
     QTL2, .., and QTL10.

     "Binary Traits

     "Two binary traits were simulated simply by dichotomizing the
     first quantitative trait (QTL1).  For the AFF1 trait, individuals
     were coded as affected (1) if their QTL1 value is above the sample
     mean and unaffected (0) if their QTL1 value was below the sample
     mean.  For the AFF2 trait, individuals were coded as affected (1)
     if their QTL1 value is at least one standard deviation above the
     sample mean, and missing ("-") if their trait value did not reach
     that criteria.

     "Covariates

     "In addition to the additive genetic effect, each phenotype was
     simulated with one covariate effect.  The quantitative covariates
     were sampled from normal distribution (mean = random, s^2 = 10).  The
     effect size for each covariate was sampled randomly from a uniform
     distribution (0, 1).  The covariates are indicated in the header
     line of the phenotype file and named COV1, COV2, .., and COV10. 
     Note that COV1 corresponds to QTL1, AFF1 and AFF2."

     (quoted from Lange and Kraft 2005)
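
     The scheme quoted above can be illustrated with a minimal sketch
     in base R for a single SNP and a single trait.  This is not the
     code used by Lange and Kraft.  The allele frequency 'maf', the
     effect sizes 'a' and 'b', the covariate mean of 0, and the coding
     of the marker score X as a minor-allele count are all assumptions
     made for illustration, and haplotype blocks are not modelled.

```r
# Illustrative sketch only; parameter values below are assumptions,
# not values taken from the xbat simulation.
set.seed(1)

n_trios <- 1000   # number of trios, as stated in the description
maf     <- 0.3    # assumed minor allele frequency for one SNP
a       <- 0.3    # assumed additive genetic effect size
b       <- 0.5    # assumed covariate effect size, within (0, 1)

# Parental genotypes: two alleles per parent, coded 0/1 (1 = minor allele).
father <- matrix(rbinom(2 * n_trios, 1, maf), ncol = 2)
mother <- matrix(rbinom(2 * n_trios, 1, maf), ncol = 2)

# Random Mendelian transmission: each parent passes on one of its two
# alleles, chosen with probability 1/2.
transmit <- function(parent) {
  idx <- sample(1:2, nrow(parent), replace = TRUE)
  parent[cbind(seq_len(nrow(parent)), idx)]
}
x <- transmit(father) + transmit(mother)   # offspring marker score: 0, 1 or 2

# Covariate with variance 10; the mean of 0 is an assumption.
z <- rnorm(n_trios, mean = 0, sd = sqrt(10))

# Quantitative trait: Y ~ N(aX + bZ, 1), with variance fixed at 1 here.
qtl <- rnorm(n_trios, mean = a * x + b * z, sd = 1)

# AFF1: affected (1) above the sample mean, unaffected (0) otherwise.
aff1 <- as.integer(qtl > mean(qtl))

# AFF2: affected (1) if at least one standard deviation above the
# sample mean, missing (NA) otherwise.
aff2 <- ifelse(qtl >= mean(qtl) + sd(qtl), 1L, NA_integer_)

table(x)
table(aff1)
table(aff2, useNA = "ifany")
```

     Coding AFF2 as 1 or NA mirrors the description, in which
     individuals who do not reach the threshold are treated as missing
     rather than unaffected.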

_S_o_u_r_c_e:

     Lange, C. and Kraft, P. (2005). "Short Course: Genetics
     Association Analysis."

_R_e_f_e_r_e_n_c_e_s:

     Lange, C. and Kraft, P. (2005). "Short Course: Genetics
     Association Analysis." 

     DeMeo, D. L., C. Lange, et al. (2002). "Univariate and
     multivariate family-based association analysis of the IL-13
     ARG130GLN polymorphism in the Childhood Asthma Management
     Program." Genet Epidemiol 23(4): 335-48.

_E_x_a_m_p_l_e_s:

     library(GeneticsBase)
     data(xbat)
     head(xbat)

