gpi               package:GeneticsPed               R Documentation

_G_e_n_o_t_y_p_e _p_r_o_b_a_b_i_l_i_t_y _i_n_d_e_x

_D_e_s_c_r_i_p_t_i_o_n:

     'gpi' calculates the Genotype Probability Index (GPI), which
     indicates the information content of genotype probabilities
     derived from segregation analysis.

_U_s_a_g_e:

       gpi(gp, hwp)

_A_r_g_u_m_e_n_t_s:

      gp: numeric vector or matrix, individual genotype probabilities

     hwp: numeric vector or matrix, Hardy-Weinberg genotype
          probabilities

_D_e_t_a_i_l_s:

     The Genotype Probability Index (GPI; Kinghorn, 1997; Percy and
     Kinghorn, 2005) indicates the information contained in
     multi-allele genotype probabilities for diploids derived from
     segregation analysis, e.g. Thallman et al. (2001a, 2001b). GPI can
     be used as one criterion to help identify which ungenotyped
     individuals or loci should be genotyped in order to maximise the
     benefit of genotyping in the population (e.g. Kinghorn, 1999).

     The 'gp' and 'hwp' arguments accept genotype probabilities for
     multi-allele loci. With two alleles (1 and 2), pass a vector of
     probabilities for genotypes 11 and 12, i.e. a single value for
     the heterozygotes (12 and 21 combined), and always omit the last
     homozygote. With three alleles the vector should hold
     probabilities for genotypes (11, 12, 13, 22, 23), as shown below
     and in the examples. The 'hwp' and 'gpLong2Wide' functions can
     help set up the 'gp' and 'hwp' arguments.


          2 alleles: 1 and 2
          11 12
          --> no. dimensions = 2

          3 alleles: 1, 2, and 3
          11 12 13
             22 23
          --> no. dimensions = 5

          ...

          5 alleles: 1, 2, 3, 4, and 5
          11 12 13 14 15
             22 23 24 25
                33 34 35
                   44 45
          --> no. dimensions = 14

     In general, the number of dimensions (k) for n alleles is:


                      k = (n * (n + 1) / 2) - 1.
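
     The ordering and the formula above can be sketched in a few lines
     of base R. The helpers 'genotype_order' and 'hw_from_freq' are
     hypothetical names introduced here for illustration only; they
     are not part of GeneticsPed and are not the package's 'hwp'
     function. The second helper assumes Hardy-Weinberg proportions,
     i.e. Pr(ii) = Pr(i)^2 and Pr(ij) = 2*Pr(i)*Pr(j).

```r
## Hypothetical helpers for illustration; not part of GeneticsPed.

## All unordered allele pairs (i, j) with i <= j, row by row:
## (1,1), (1,2), ..., (1,n), (2,2), ..., (n,n)
allele_pairs <- function(n) {
  do.call(rbind, lapply(seq_len(n), function(i) cbind(i, i:n)))
}

## Genotype labels in the order described above, last homozygote dropped.
## Labels are only unambiguous for fewer than 10 alleles.
genotype_order <- function(n) {
  pairs <- allele_pairs(n)
  labels <- paste0(pairs[, 1], pairs[, 2])
  labels[-length(labels)]
}

## Hardy-Weinberg genotype probabilities from allele frequencies p,
## in the same order and again without the last homozygote.
hw_from_freq <- function(p) {
  pairs <- allele_pairs(length(p))
  het <- pairs[, 1] != pairs[, 2]
  prob <- p[pairs[, 1]] * p[pairs[, 2]] * ifelse(het, 2, 1)
  prob[-length(prob)]
}

genotype_order(3)          # "11" "12" "13" "22" "23"
length(genotype_order(5))  # 14, i.e. k = (5 * 6 / 2) - 1
hw_from_freq(c(.75, .25))  # 0.5625 0.3750
```

     The last call reproduces the Hardy-Weinberg vector used in
     Example 1 below.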


     If you have genotype probabilities for more than one individual,
     you can pass them to 'gp' as a matrix in which each row holds the
     genotype probabilities of one individual. When 'gp' is a matrix,
     'hwp' can still be a vector of Hardy-Weinberg genotype
     probabilities, which is then recycled and used for all
     individuals. If 'hwp' is also a matrix, it must have the same
     dimensions as the one passed to 'gp'.
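
     As a minimal sketch of the two input shapes, the base R code
     below builds a 'gp' matrix for two individuals and spells out
     what using one 'hwp' vector for all individuals amounts to. The
     name 'hwp_mat' is introduced here for illustration, and the
     explicit expansion is an assumption about the effect of the
     recycling, not a description of the package internals.

```r
## Two individuals, two alleles: columns hold Pr(11) and Pr(12).
gp <- matrix(c(.1, .5,
               .2, .5), nrow = 2, ncol = 2, byrow = TRUE)

## One Hardy-Weinberg vector shared by all individuals.
hwp <- c(.5625, .3750)

## Explicit matrix form: the same vector repeated once per row of 'gp'.
hwp_mat <- matrix(hwp, nrow = nrow(gp), ncol = length(hwp), byrow = TRUE)

## A matrix 'hwp' must match the dimensions of 'gp'.
stopifnot(identical(dim(hwp_mat), dim(gp)))
hwp_mat
```

     Either 'hwp' or 'hwp_mat' could then be supplied as the 'hwp'
     argument together with 'gp'.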

_V_a_l_u_e:

     Vector of N genotype probability indices, where N is the number
     of individuals.

_A_u_t_h_o_r(_s):

     Gregor Gorjanc (R code, documentation, wrapping into a package);
     Andrew Percy and Brian P. Kinghorn (Fortran code)

_R_e_f_e_r_e_n_c_e_s:

     Kinghorn, B. P. (1997) An index of information content for
     genotype probabilities derived from segregation analysis.
     _Genetics_ *145*(2):479-483 <URL:
     http://www.genetics.org/cgi/content/abstract/145/2/479>

     Kinghorn, B. P. (1999) Use of segregation analysis to reduce
     genotyping costs. _Journal of Animal Breeding and Genetics_
     *116*(3):175-180 <URL:
     http://dx.doi.org/10.1046/j.1439-0388.1999.00192.x>

     Percy, A. and Kinghorn, B. P. (2005) A genotype probability index
     for multiple alleles and haplotypes. _Journal of Animal Breeding
     and Genetics_ *122*(6):387-392 <URL:
     http://dx.doi.org/10.1111/j.1439-0388.2005.00553.x>

     Thallman, R. M. and Bennett, G. L. and Keele, J. W. and Kappes, S.
     M. (2001a) Efficient computation of genotype probabilities for
     loci with many alleles: I. Allelic peeling. _Journal of Animal
     Science_ *79*(1):26-33 <URL:
     http://jas.fass.org/cgi/reprint/79/1/34>

     Thallman, R. M. and Bennett, G. L. and Keele, J. W. and Kappes, S.
     M. (2001b) Efficient computation of genotype probabilities for
     loci with many alleles: II. Iterative method for large, complex
     pedigrees. _Journal of Animal Science_ *79*(1):34-44 <URL:
     http://jas.fass.org/cgi/reprint/79/1/34>

_S_e_e _A_l_s_o:

     'hwp' and 'gpLong2Wide'

_E_x_a_m_p_l_e_s:

       ## --- Example 1 from Percy and Kinghorn (2005) ---
       ## No. alleles: 2
       ## No. individuals: 1
       ## Individual genotype probabilities:
       ##   Pr(11, 12, 22) = (.1, .5, .4)
       ##
       ## Hardy-Weinberg probabilities:
       ##   Pr(1, 2)   = (.75, .25)
       ##   Pr(11, 12,   (.75^2, 2*.75*.25,
       ##           22) =             .25^2)
       ##               = (.5625, .3750,
       ##                         .0625)

       gp <- c(.1, .5)
       hwp <- c(.5625, .3750)
       gpi(gp=gp, hwp=hwp)

       ## --- Example 1 from Percy and Kinghorn (2005) extended ---
       ## No. alleles: 2
       ## No. individuals: 2
       ## Individual genotype probabilities:
       ##   Pr_1(11, 12, 22) = (.1, .5, .4)
       ##   Pr_2(11, 12, 22) = (.2, .5, .3)

       (gp <- matrix(c(.1, .5, .2, .5), nrow=2, ncol=2, byrow=TRUE))
       gpi(gp=gp, hwp=hwp)

       ## --- Example 2 from Percy and Kinghorn (2005) ---
       ## No. alleles: 3
       ## No. individuals: 1
       ## Individual genotype probabilities:
       ##   Pr(11, 12, 13,   (.1, .5, .0,
       ##          22, 23, =      .4, .0,
       ##              33)            .0)
       ##
       ## Hardy-Weinberg probabilities:
       ##   Pr(1, 2, 3)    = (.75, .25, .0)
       ##   Pr(11, 12, 13,   (.75^2, 2*.75*.25, .0,
       ##          22, 23, =            0.25^2, .0,
       ##              33)                      .0)
       ##                  = (.5625, .3750, .0,
       ##                            .0625, .0,
       ##                                   .0)

       gp <- c(.1, .5, .0, .4, .0)
       hwp <- c(.5625, .3750, .0, .0625, .0)
       gpi(gp=gp, hwp=hwp)

       ## --- Example 3 from Percy and Kinghorn (2005) ---
       ## No. alleles: 5
       ## No. individuals: 1
       ## Hardy-Weinberg probabilities:
       ##   Pr(1, 2, 3, 4, 5)   = (.2, .2, .2, .2, .2)
       ##   Pr(11, 12, 13, ...) = (Pr(1)^2, 2*Pr(1)*Pr(2), 2*Pr(1)*Pr(3), ...)
       ##
       ## Individual genotype probabilities:
       ##   Pr(11, 12, 13, ...) = hwp / 2
       ##   Pr(12) = Pr(12) + .5

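       (hwp <- rep(.2, times=5) %*% t(rep(.2, times=5)))
       ## Heterozygote ij has probability 2*Pr(i)*Pr(j): double the off-diagonal
       hwp[row(hwp) != col(hwp)] <- 2 * hwp[row(hwp) != col(hwp)]
       ## Column-wise lower triangle (symmetric matrix) gives 11, 12, ..., 15, 22, ...
       hwp <- hwp[lower.tri(hwp, diag=TRUE)]
       (hwp <- hwp[1:(length(hwp) - 1)])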
       gp <- hwp / 2
       gp[2] <- gp[2] + .5
       gp

       gpi(gp=gp, hwp=hwp)

       ## --- Simulate gp for n alleles and i individuals ---
       n <- 3
       i <- 10

       kAll <- (n*(n+1)/2) # without -1 here!
       k <- kAll - 1
       if(require("gtools")) {
         gp <- rdirichlet(n=i, alpha=rep(x=1, times=kAll))[, 1:k]
         hwp <- as.vector(rdirichlet(n=1, alpha=rep(x=1, times=kAll)))[1:k]
         gpi(gp=gp, hwp=hwp)
       }

