mergeComplexes           package:apComplex           R Documentation

_I_t_e_r_a_t_i_v_e_l_y _c_o_m_b_i_n_e _c_o_l_u_m_n_s _i_n _i_n_i_t_i_a_l _P_C_M_G _e_s_t_i_m_a_t_e

_D_e_s_c_r_i_p_t_i_o_n:

     Repeatedly applies the function 'LCdelta' to make combinations of
     columns in the affiliation matrix representing the protein complex
     membership graph (PCMG) for AP-MS data.

_U_s_a_g_e:

     mergeComplexes(bhmax, adjMat, VBs=NULL, VPs=NULL, simMat=NULL,
                    sensitivity=.75, specificity=.995, Beta=0,
                    commonFrac=2/3, wsVal=2e7)

_A_r_g_u_m_e_n_t_s:

   bhmax: Initial complex estimates, as returned by 'bhmaxSubgraph'.

  adjMat: Adjacency matrix of bait-hit data from an AP-MS experiment. 
          Rows correspond to baits and columns to hits.

     VBs: An optional vector of viable baits.

     VPs: An optional vector of viable prey.

  simMat: An optional square matrix with entries between 0 and 1.  Rows
          and columns correspond to the proteins in the experiment, and
          should be reported in the same order as the columns of
          'adjMat'.  Higher values in this matrix are interpreted to
          mean higher similarity for protein pairs.

sensitivity: Believed sensitivity of AP-MS technology.

specificity: Believed specificity of AP-MS technology.

    Beta: Optional additional parameter for the weight to give data in
          'simMat' in the logistic regression model.

commonFrac: The fraction of baits that must overlap for a complex
          combination to be considered.

   wsVal: A numeric value passed as the 'workspace' argument in the
          call to 'fisher.test'.
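
     As an illustration of what 'wsVal' controls, the sketch below
     calls base R's 'fisher.test' directly with an enlarged
     'workspace'.  The table and its counts are made up for this
     example and are not taken from the package; 'workspace' is only
     used for tables larger than 2 by 2.

```r
# Toy 3 x 3 contingency table (made-up counts, for illustration only).
tab <- matrix(c(3, 1, 0,
                1, 4, 2,
                0, 2, 5), nrow = 3, byrow = TRUE)

# 'workspace' sets the size (in units of 4 bytes) of the workspace used
# by the network algorithm behind the exact test.
res <- fisher.test(tab, workspace = 2e7)
res$p.value
```

     A larger workspace allows the exact test to complete on tables
     where the default would fail with a workspace error, at the cost
     of more memory.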

_D_e_t_a_i_l_s:

     The local modeling algorithm for AP-MS data described by Scholtens
     and Gentleman (2004) and Scholtens, Vidal, and Gentleman (2005)
     uses a two-component measure of protein complex estimate quality,
     namely P=LxC.  Columns in the PCMG affiliation matrix (referred
     to here as 'cMat') represent individual complex estimates.  The
     algorithm starts with a maximal BH-complete subgraph estimate of
     'cMat' and then improves the estimate by combining complexes
     whenever doing so increases P=LxC.
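
     A minimal sketch of what combining two columns of an affiliation
     matrix looks like.  The matrix, the protein and complex names,
     and the helper 'combineColumns' are all hypothetical and not part
     of apComplex; the sketch omits the P=LxC criterion that the
     package uses (via 'LCdelta') to decide whether a combination is
     accepted.

```r
# Toy affiliation matrix: rows are proteins, columns are complex estimates.
# All names and entries are hypothetical. matrix() fills column by column.
cMat <- matrix(c(1, 1, 0, 0,
                 1, 1, 1, 0,
                 0, 0, 1, 1),
               nrow = 4,
               dimnames = list(c("P1", "P2", "P3", "P4"),
                               c("C1", "C2", "C3")))

# Hypothetical helper: replace columns i and j (integer indices) by their
# union. A protein is in the combined complex if it is in either original.
combineColumns <- function(m, i, j) {
  merged <- pmax(m[, i], m[, j])
  out <- cbind(m[, -c(i, j), drop = FALSE], merged)
  colnames(out)[ncol(out)] <- paste(colnames(m)[c(i, j)], collapse = "+")
  out
}

cMat2 <- combineColumns(cMat, 1, 2)
cMat2
```

     In the actual algorithm such a combination would be kept only if
     it increases P=LxC; here it is applied unconditionally to show
     the matrix operation.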

     By default 'commonFrac' is set relatively high at 2/3, which means
     that some potentially reasonable complex combinations could be
     missed.  For smaller data sets, users may consider decreasing the
     fraction.  For larger data sets, decreasing it may cause a large
     increase in computation time.
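
     The effect of the threshold can be sketched with a toy overlap
     calculation.  The bait names and the helper 'overlapFrac' are
     hypothetical, and measuring overlap relative to the smaller
     complex is an assumption made for illustration; the package's
     own definition may differ.

```r
# Hypothetical bait membership of two complex estimates.
baitsA <- c("B1", "B2", "B3", "B4")
baitsB <- c("B3", "B4", "B5", "B6", "B7")

# ASSUMPTION: overlap is the number of shared baits divided by the number
# of baits in the smaller complex. This is not taken from apComplex.
overlapFrac <- function(a, b) {
  length(intersect(a, b)) / min(length(a), length(b))
}

frac <- overlapFrac(baitsA, baitsB)

# Would this pair be considered for combination at each threshold?
atDefault <- frac >= 2/3
atLowered <- frac >= 1/2
c(frac = frac, atDefault = atDefault, atLowered = atLowered)
```

     Under this assumption the pair is skipped at the default
     threshold but considered once the fraction is lowered to 1/2,
     which is also why lowering it increases the number of candidate
     combinations that must be evaluated.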

_V_a_l_u_e:

     A list of character vectors containing the names of the proteins
     in the estimated complexes.

_A_u_t_h_o_r(_s):

     Denise Scholtens

_R_e_f_e_r_e_n_c_e_s:

     Scholtens D and Gentleman R.  Making sense of high-throughput
     protein-protein interaction data.  Statistical Applications in
     Genetics and Molecular Biology 3, Article 39 (2004).

     Scholtens D, Vidal M, and Gentleman R.  Local modeling of global
     interactome networks.  Bioinformatics 21, 3548-3557 (2005).

_S_e_e _A_l_s_o:

     'bhmaxSubgraph', 'findComplexes'

_E_x_a_m_p_l_e_s:

     data(apEX)
     PCMG0 <- bhmaxSubgraph(apEX)
     PCMG1 <- mergeComplexes(PCMG0,apEX,sensitivity=.7,specificity=.75)
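
     A further sketch, assuming the apComplex package is installed,
     that repeats the call with a lower 'commonFrac' so that more
     candidate combinations are considered.  The value 1/2 is chosen
     only for illustration; the guard simply skips the code when the
     package is unavailable.

```r
if (requireNamespace("apComplex", quietly = TRUE)) {
  library(apComplex)
  data(apEX)
  PCMG0 <- bhmaxSubgraph(apEX)

  # Same call as in the example, but with a lower overlap threshold.
  PCMG2 <- mergeComplexes(PCMG0, apEX, sensitivity = .7, specificity = .75,
                          commonFrac = 1/2)

  # Inspect the structure of the returned complex estimates.
  str(PCMG2)
}
```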

