RtreemixModel-class         package:Rtreemix         R Documentation

_C_l_a_s_s "_R_t_r_e_e_m_i_x_M_o_d_e_l"

_D_e_s_c_r_i_p_t_i_o_n:

     This class contains all the data needed for characterizing the
     mutagenetic trees mixture model (mixture parameters, mixture
     components, ...). The tree components of the model are given as a
     list of directed 'graphNEL' objects.

_O_b_j_e_c_t_s _f_r_o_m _t_h_e _C_l_a_s_s:

     Objects can be created by calls of the form 'new("RtreemixModel",
     ParentData, Weights, WeightsCI, Resp, CompleteMat, Star, Trees)'.
     The 'RtreemixModel' class extends the 'RtreemixData' class and
     specifies the mutagenetic trees mixture model. If the model is not
     randomly generated, the parent class holds the 'RtreemixData' used
     for learning the mixture model. The directed trees that build up
     the model are represented as a list of directed 'graphNEL'
     objects, and their weights (the mixture parameters) are given as a
     numeric vector. This class can also contain other useful
     information connected with the mixture model like confidence
     intervals for the mixture parameters and the edge weights
     (resulting from a bootstrap analysis), an indicator for the
     presence of the star component, etc. They are all listed in the
     text below with brief descriptions.

     The 'ParentData' is an 'RtreemixData' object that specifies the
     data used for estimating the mutagenetic trees mixture model. It
     is not specified for random mixture models, since they are not
     estimated from a given dataset but generated randomly.

     The 'Weights' is a numeric 'vector' that contains the mixture
     parameters of the model. Its length equals the length of the
     'list' of tree components 'Trees'.

     The 'WeightsCI' is a named 'list' with length equal to the length
     of the 'Weights'. Each list element is a numeric 'vector' of
     length two specifying the lower and upper bound of the confidence
     interval for the corresponding mixture parameter. The confidence
     intervals are derived using the bootstrap method.
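
     As an illustration of this structure, the sketch below builds a
     list shaped like 'WeightsCI' from simulated bootstrap replicates,
     using percentile intervals in base R. The replicate values, the
     names 'Alpha1' and 'Alpha2', and the percentile method itself are
     assumptions made for the example; this page does not state how
     the package computes its intervals.

```r
set.seed(1)
B <- 200  # number of hypothetical bootstrap replicates

# Simulated replicates of a two-component weight vector (each pair sums to one).
w1 <- rbeta(B, 20, 80)
boot_weights <- data.frame(Alpha1 = w1, Alpha2 = 1 - w1)

# One length-two numeric vector (lower, upper) per mixture parameter.
weights_ci <- lapply(boot_weights, function(w) {
  unname(quantile(w, probs = c(0.025, 0.975)))
})

str(weights_ci)
```

     The result is a named list with one element per weight, which is
     the shape described above for 'WeightsCI'.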

     The 'Resp' is a numeric 'matrix' that specifies the responsibility
     of each tree component for generating each of the patterns in the
     'ParentData'. The number of rows in 'Resp' is equal to the number
     of trees in the mixture (the length of the list 'Trees') and the
     number of columns equals the number of patients in 'ParentData'.
     For random mixture models it is an empty matrix, since they are
     not estimated from a given dataset.
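
     To make the layout of 'Resp' concrete, here is a small base R
     sketch. It assumes the usual mixture-model definition, in which
     the responsibility of component k for pattern i is proportional
     to the k-th mixture weight times the likelihood of pattern i
     under tree k. The likelihood values are made up for illustration
     and are not package output.

```r
weights <- c(0.2, 0.8)  # hypothetical mixture parameters for K = 2 trees

# lik[k, i]: hypothetical likelihood of pattern i under tree component k
lik <- matrix(c(0.10, 0.05, 0.01,
                0.02, 0.30, 0.01),
              nrow = 2, byrow = TRUE)

# Row k is scaled by weights[k] (recycling runs down the columns).
joint <- weights * lik

# Normalize every column so the responsibilities for one pattern sum to one.
resp <- sweep(joint, 2, colSums(joint), "/")

dim(resp)      # K rows (trees) by N columns (patients)
colSums(resp)
```

     Rows index tree components and columns index patients, matching
     the dimensions required of the 'Resp' slot.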

     The 'CompleteMat' is a binary 'matrix' that specifies the complete
     data in case some measurements for some patients are missing in
     the data used for learning the model (the 'ParentData'). It has
     the same size as the matrix specifying the data in 'ParentData'.
     The missing data are estimated in the EM-algorithm used for
     fitting the mixture model. When there are no missing data in
     'ParentData', or the model is randomly generated, the 'CompleteMat'
     is an empty matrix.

     The 'Star' is an indicator of the presence of a noise (star)
     component and is mostly relevant for models with a single tree
     component, since it is assumed that mixture models with at least
     two components always have the noise as a first component. It is
     of type 'logical'.

     The 'Trees' is a 'list' of directed 'graphNEL' objects, one for
     each tree component in the mixture model. The genetic events are
     represented as nodes in the graphs. The 'edgeData' of each tree
     can have two attributes: '"weight"' and '"ci"'. Please see the
     help page on 'graph-class' and 'graphNEL-class' in the package
     'graph'. The '"weight"' attribute is for edge weight, i.e. the
     conditional probability that the child event of the edge occurred
     given that the parent event already occurred. The '"ci"' attribute
     is for the bootstrap confidence intervals for the edge weight (a
     numeric vector with length two).
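
     The following base R sketch shows how such edge weights combine
     into the probability of a pattern under a single tree. It assumes
     the standard mutagenetic tree semantics: an event can occur only
     if its parent occurred, and the root (the null event) always
     occurs. The tree is stored as plain named vectors rather than as
     a 'graphNEL' object, and the helper 'pattern_prob' and the edge
     weights are hypothetical.

```r
# Tiny tree: root -> A (weight 0.8), A -> B (weight 0.5)
parent <- c(A = "root", B = "A")
weight <- c(A = 0.8, B = 0.5)

pattern_prob <- function(pattern, parent, weight) {
  occurred <- c(root = TRUE, pattern)  # the root always occurs
  p <- 1
  for (ev in names(parent)) {
    parent_occurred <- occurred[[parent[[ev]]]]
    child_occurred <- occurred[[ev]]
    if (parent_occurred) {
      # Edge weight if the child occurred, its complement otherwise.
      edge_factor <- if (child_occurred) weight[[ev]] else 1 - weight[[ev]]
      p <- p * edge_factor
    } else if (child_occurred) {
      # A child without its parent is incompatible with the tree.
      return(0)
    }
  }
  p
}

patterns <- list(
  none   = c(A = FALSE, B = FALSE),
  A_only = c(A = TRUE,  B = FALSE),
  B_only = c(A = FALSE, B = TRUE),
  both   = c(A = TRUE,  B = TRUE)
)
probs <- sapply(patterns, pattern_prob, parent = parent, weight = weight)
probs
```

     The four probabilities sum to one, and the pattern containing B
     without A gets probability zero because it violates the tree.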

_S_l_o_t_s:


     '_W_e_i_g_h_t_s': Object of class '"numeric"'. The length of the
          'Weights' must be equal to the length of 'Trees'. 

     '_W_e_i_g_h_t_s_C_I': Object of class '"list"'. The length of the
          'WeightsCI' must be equal to the length of 'Weights'.  

     '_R_e_s_p': Object of class '"matrix"'. The number of rows of 'Resp'
          must be identical to the length of 'Trees', and its number of
          columns to the number of patients in the dataset used for
          estimating the mixture model ('ParentData').

     '_C_o_m_p_l_e_t_e_M_a_t': Object of class '"matrix"'. When specified (when
          there are missing data) the size of the 'CompleteMat' must be
          equal to the size of the matrix used to estimate the model.

     '_S_t_a_r': Object of class '"logical"'. 

     '_T_r_e_e_s': Object of class '"list"'. The length of 'Trees' equals
          the length of 'Weights'.

_E_x_t_e_n_d_s:

     Class '"RtreemixData"', directly.

_M_e_t_h_o_d_s:


     _C_o_m_p_l_e_t_e_M_a_t 'signature(object = "RtreemixModel")': A method used
          for obtaining the complete dataset, in case there were any
          missing measurements for some patients in the dataset used to
          learn the mixture model.

     _R_e_s_p 'signature(object = "RtreemixModel")': A method for obtaining
          the matrix of responsibilities for the trees to generate each
          of the samples in the dataset used for learning the model
          ('ParentData').    

     _S_t_a_r 'signature(object = "RtreemixModel")': A method for checking
          the presence of a noise component in the mixture model
          (informative only for models with one tree component).   

     _T_r_e_e_s 'signature(object = "RtreemixModel")': A method for
          obtaining the tree components of the mixture model as a list
          of directed 'graphNEL' objects. 

     _W_e_i_g_h_t_s 'signature(object = "RtreemixModel")': A method for
          obtaining the mixture parameters (the weights of the trees in
          the model).

     _W_e_i_g_h_t_s<- 'signature(object = "RtreemixModel")': A method for
          replacing the value of the slot 'Weights' with a specified
          'numeric' vector. The components of this vector have to sum
          up to one.

     _W_e_i_g_h_t_s_C_I 'signature(object = "RtreemixModel")': A method for
          obtaining the weights of the mixture parameters. 

     _g_e_t_D_a_t_a 'signature(object = "RtreemixModel")': A method for
          obtaining the 'ParentData' of the mixture model, i.e. the
          data used for learning the model. 

     _g_e_t_T_r_e_e 'signature(object = "RtreemixModel", k = "numeric")': A
          method for obtaining the k-th tree component of the mixture
          model as a directed 'graphNEL' object.

     _n_u_m_T_r_e_e_s 'signature(object = "RtreemixModel")': A method for
          obtaining the number of tree components building up the
          mixture model.
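
     The 'Weights<-' method listed above expects a vector whose
     components sum to one. A minimal sketch of preparing such a
     vector follows. The raw scores are hypothetical. The guarded
     part uses only calls that appear on this page ('generate',
     'Weights', 'Weights<-') and runs only when Rtreemix is
     installed.

```r
raw <- c(1, 3)       # hypothetical unnormalized component scores
w <- raw / sum(raw)  # rescale so the components sum to one
stopifnot(isTRUE(all.equal(sum(w), 1)))

if (requireNamespace("Rtreemix", quietly = TRUE)) {
  library(Rtreemix)
  # Random two-component model, as in the Examples section of this page.
  mod <- generate(K = 2, no.events = 9, noise.tree = TRUE, prob = c(0.2, 0.8))
  Weights(mod) <- w
  print(Weights(mod))
}
```

     Normalizing before the assignment keeps the new weights
     consistent with the sum-to-one requirement stated for
     'Weights<-'.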

_A_u_t_h_o_r(_s):

     Jasmina Bogojeska

_R_e_f_e_r_e_n_c_e_s:

     Learning multiple evolutionary pathways from cross-sectional data,
     N. Beerenwinkel et al.

_S_e_e _A_l_s_o:

     'RtreemixGPS-class', 'RtreemixStats-class', 'RtreemixData-class',
     'RtreemixSim-class', 'fit-methods', 'bootstrap-methods',
     'generate-methods', 'comp.models', 'comp.trees'

_E_x_a_m_p_l_e_s:

     ## Load the package that provides the RtreemixModel class.
     library(Rtreemix)

     ## Generate a random RtreemixModel object with 2 components.
     rand.mod <- generate(K = 2, no.events = 9, noise.tree = TRUE, prob = c(0.2, 0.8))
     show(rand.mod)
     plot(rand.mod) ## plot the tree components of the model
     plot(rand.mod, k = 2) ## plot the second component of the model

     ## Draw data from a specified mixture model.
     draws <- sim(model = rand.mod, no.draws = 200)
     show(draws)

     ## Create an RtreemixModel object by fitting a model to the drawn data.
     mod <- fit(data = draws, K = 2, equal.edgeweights = TRUE, noise = TRUE)
     show(mod)

     ## See the values of the slots of the RtreemixModel object.
     Weights(mod)
     Resp(mod)
     CompleteMat(mod)
     Star(mod)
     Trees(mod)
     ## See data used for learning the model.
     getData(mod)
     ## See the number of tree components in the mixture model.
     numTrees(mod)
     ## See a specific tree component k.
     getTree(object = mod, k = 2)
     ## See the conditional probabilities assigned to edges of the second tree component.
     edgeData(getTree(object = mod, k = 2), attr = "weight")
     ## See the probability distribution encoded by the model on the set of all possible patterns.
     distr <- distribution(model = mod)
     distr
     ## Get the probabilities.
     distr$probability
     ## See the probability distribution encoded by the model on the set of all possible patterns,
     ## calculated for a given sampling mode and given input and output parameters.
     distr1 <- distribution(model = mod, sampling.mode = "exponential", sampling.param = 1, output.param = 1)
     distr1

     ## Create an RtreemixModel and analyze its variance with the bootstrap method.
     mod.boot <- bootstrap(data = draws, K = 2, equal.edgeweights = TRUE, B = 100)

     ## See the confidence intervals for the mixture parameters (the weights).
     WeightsCI(mod.boot)
     ## See the confidence intervals of the conditional probabilities assigned to the edges.
     edgeData(getTree(mod.boot, 2), attr = "ci")

