qpPAC                package:qpgraph                R Documentation

_E_s_t_i_m_a_t_i_o_n _o_f _p_a_r_t_i_a_l _c_o_r_r_e_l_a_t_i_o_n _c_o_e_f_f_i_c_i_e_n_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Estimates partial correlation coefficients (PACs) and their
     corresponding P-values in a given undirected graph, from an input
     data set.

_U_s_a_g_e:

     ## S4 method for signature 'ExpressionSet':
     qpPAC(data, g, return.K=FALSE,
                                     long.dim.are.variables=TRUE, verbose=TRUE,
                                     R.code.only=FALSE)
     ## S4 method for signature 'data.frame':
     qpPAC(data, g, return.K=FALSE,
                                  long.dim.are.variables=TRUE, verbose=TRUE,
                                  R.code.only=FALSE)
     ## S4 method for signature 'matrix':
     qpPAC(data, g, return.K=FALSE,
                              long.dim.are.variables=TRUE, verbose=TRUE,
                              R.code.only=FALSE)

_A_r_g_u_m_e_n_t_s:

    data: data set from where to estimate the partial correlation
          coefficients. It can be an ExpressionSet object, a data frame
          or a matrix.

       g: either a 'graphNEL' object or an adjacency matrix of the
          given undirected graph.

return.K: logical; if TRUE this function also returns the concentration
          matrix 'K'; if FALSE it does not return it (default).

long.dim.are.variables: logical; if TRUE (default) and the data are in
          a data frame or a matrix, the longer dimension is assumed to
          be the one defining the random variables; if FALSE, the
          random variables are assumed to be in the columns of the
          data frame or matrix.

 verbose: logical; if TRUE (default), progress on the calculations is
          shown.

R.code.only: logical; if FALSE then the faster C implementation is used
          (default); if TRUE then only R code is executed.

_D_e_t_a_i_l_s:

     The estimation of PACs requires that the sample size 'n' is
     strictly larger than the number of variables 'p'. In the context
     of microarray data and regulatory networks, genes play the role of
     variables and thus normally 'p >> n'. For this reason, PACs can be
     estimated for the edges of a regulatory network represented by an
     undirected graph 'G' if and only if the maximum clique size of the
     graph, denoted 'w(G)', is strictly smaller than the sample size
     'n' (the number of experiments in the microarray data context).

     In the context of this package, the undirected graph should
     correspond to a qp-graph (see function 'qpGraph') we have selected
     by thresholding on the (average) non-rejection rate calculated
     from this same data set using the functions 'qpNrr' or 'qpAvgNrr'.
     If the resulting graph is sparse enough, the requirement
     'w(G) < n' may be met, and the function 'qpClique' can be useful
     to investigate this. In the context of transcriptional regulatory
     networks, one may consider removing edges between
     non-transcription factor genes, which will substantially increase
     the sparseness of the network.

     The PAC estimation is done by first obtaining a maximum likelihood
     estimate of the covariance matrix from the input data set using
     the 'qpIPF' function. The P-values are then calculated from
     estimates of the standard errors of the edges, following the
     procedure by Roverato and Whittaker (1996).

_V_a_l_u_e:

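     A list with two matrices, one with the estimates of the PACs
     (element 'R', as used in the examples) and the other with their
     P-values. If 'return.K=TRUE', the concentration matrix 'K' is
     also returned.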

_A_u_t_h_o_r(_s):

     R. Castelo and A. Roverato

_R_e_f_e_r_e_n_c_e_s:

     Castelo, R. and Roverato, A. A robust procedure for Gaussian
     graphical model search from microarray data with p larger than n.
     _J. Mach. Learn. Res._, 7:2621-2650, 2006.

     Castelo, R. and Roverato, A. Reverse engineering molecular
     regulatory networks from microarray data with qp-graphs. _J. Comp.
     Biol., accepted_, 2008.

     Roverato, A. and Whittaker, J. Standard errors for the parameters
     of graphical Gaussian models. _Stat. Comput._, 6:297-302, 1996.

_S_e_e _A_l_s_o:

     'qpGraph', 'qpCliqueNumber', 'qpClique', 'qpGetCliques', 'qpIPF'

_E_x_a_m_p_l_e_s:

     nVar <- 50 # number of variables
     maxCon <- 5  # maximum connectivity per variable
     nObs <- 30 # number of observations to simulate

     I <- qpRndGraph(n.vtx=nVar, n.bd=maxCon)
     K <- qpI2K(I)

     X <- qpSampleMvnorm(K, nObs)

     nrr.estimates <- qpNrr(X, verbose=FALSE)

     g <- qpGraph(nrr.estimates, 0.5)
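
     # before estimating PACs it can be worth checking the requirement
     # 'w(G) < n' described in Details; this is only a sketch and it
     # assumes that 'qpCliqueNumber' accepts the graph 'g' as returned
     # by 'qpGraph' and takes a 'verbose' argument
     wG <- qpCliqueNumber(g, verbose=FALSE)
     wG < nObs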

     pac.estimates <- qpPAC(X, g=g, verbose=FALSE)

     # absolute estimated PACs for edges present in the true graph 'I'
     summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & I]))

     # absolute estimated PACs for edges missing from the true graph 'I'
     summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & !I]))
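
     # a base-R sketch (not part of qpgraph) of how PACs relate to a
     # concentration matrix such as the 'K' returned with return.K=TRUE:
     # rho_ij = -k_ij / sqrt(k_ii * k_jj). It assumes n > p and a
     # complete graph, so it simply inverts the sample covariance matrix
     # instead of using the graph-constrained estimate obtained by qpPAC;
     # the names 'Y', 'K.hat' and 'R.hat' are illustrative only
     set.seed(123)
     Y <- matrix(rnorm(100 * 4), nrow=100, ncol=4) # n=100 rows, p=4 columns
     K.hat <- solve(cov(Y))    # unconstrained concentration matrix
     R.hat <- -cov2cor(K.hat)  # scale by sqrt(k_ii * k_jj), flip the sign
     diag(R.hat) <- 1          # conventional unit diagonal
     round(R.hat, digits=2)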

