xval-methods          package:MLInterfaces          R Documentation

_s_u_p_p_o_r_t _f_o_r _c_r_o_s_s-_v_a_l_i_d_a_t_o_r_y _m_a_c_h_i_n_e _l_e_a_r_n_i_n_g _w_i_t_h _e_x_p_r_S_e_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Support for cross-validatory machine learning on instances of
     class 'exprSet'.

_U_s_a_g_e:

     xval( data, classLab, proc, xvalMethod, group, indFun, niter,
     fsFun=NULL, fsNum=NULL, decreasing=TRUE, cluster=NULL, ... )
     balKfold(K)

_A_r_g_u_m_e_n_t_s:

    data: instance of class 'exprSet'

classLab: character string identifying the phenoData variable to be
          regarded as the class label

    proc: an MLInterfaces method that returns an instance of
          'classifOutput'

xvalMethod: character string identifying cross-validation procedure to
          use: default is "LOO" (leave one out), alternatives are "LOG"
          (leave group out) and "FUN" (user-supplied partition
          extraction function, see Details below)

   group: a vector (length equal to number of samples) enumerating
          groups for LOG xval method

  indFun: a function that returns the set of indices to be used as the
          training set in a given iteration (the remaining samples
          form the test set); this function must have parameters
          'data', 'clab', 'iternum'; see Details

   niter: number of iterations for which the user-specified partition
          function is to be run

   fsFun: function computing ranks of features for feature selection

   fsNum: number of features to be kept for learning in each iteration

decreasing: logical, should be TRUE if 'fsFun' provides high scores for
          high-performing features (e.g., the absolute value of a test
          statistic) and FALSE if it provides low scores for
          high-performing features (e.g., the p-value of a test).

 cluster: NULL or an S4-class object with a defined 'xvalLoop' method.
          Use this to execute 'xval' on several nodes in a computer
          cluster. See documentation for 'xvalLoop' for more
          information

     ...: arguments passed to the MLInterfaces generic 'proc'

       K: number of partitions to be used if 'balKfold' is used as
          'indFun'
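
     The role of 'group' under the "LOG" method can be sketched in
     base R. This is an interpretation of "leave group out" that
     assumes each distinct value of 'group' is held out once in turn;
     'logPartitions' is a hypothetical helper, not part of
     MLInterfaces, and it does not reproduce the package's code.

```r
# Hypothetical helper: list the train/test partitions implied by a
# 'group' vector when each distinct group is left out once.
logPartitions <- function(group) {
  lapply(sort(unique(group)), function(g) {
    list(test  = which(group == g),   # samples in the held-out group
         train = which(group != g))   # all remaining samples
  })
}

# Same grouping as used for 'lk2' in the Examples: 8 groups of 9 samples.
group <- as.integer(rep(1:8, each = 9))
parts <- logPartitions(group)
length(parts)          # number of cross-validation iterations
parts[[1]]$test        # indices held out in the first iteration
```

     Under this reading, the number of iterations equals the number
     of distinct values in 'group'.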

_D_e_t_a_i_l_s:

     If 'xvalMethod' is '"FUN"', then 'indFun' must be a function with
     parameters 'data', 'clab', and 'iternum'. This function returns
     indices that identify the training set for a given
     cross-validation iteration passed as the value of 'iternum'.  An
     example function is printed out when the example of this page is
     executed.
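
     As a minimal sketch of such a partition function, the following
     base R code builds a plain (not class-balanced) K-fold scheme.
     'simpleKfold' and its 'nSamples' argument are hypothetical names
     introduced for illustration; the number of samples is passed
     explicitly so that the sketch runs without an 'exprSet'.

```r
# Hypothetical generator, not part of MLInterfaces.
# Returns a function with the (data, clab, iternum) parameters that
# 'indFun' is required to have.
simpleKfold <- function(nSamples, K) {
  # assign fold labels 1..K cyclically to the samples
  fold <- rep(seq_len(K), length.out = nSamples)
  function(data, clab, iternum) {
    # training indices: every sample outside fold 'iternum'
    which(fold != iternum)
  }
}

f <- simpleKfold(nSamples = 10, K = 4)
train1 <- f(NULL, NULL, 1)        # 'data' and 'clab' are unused in this sketch
test1  <- setdiff(1:10, train1)   # samples held out in iteration 1
```

     With a real 'exprSet', a function built this way would be
     supplied as 'indFun' with 'niter' set to K, in the same manner
     as 'balKfold(5)' in the Examples.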

     If 'fsFun' is not 'NULL', then it must be a function with two
     arguments: the first can be transformed to a feature matrix (rows
     are objects, columns are features) and the second is a vector of
     class labels. The function returns a vector of scores, one for
     each feature.  The scores are interpreted according to the value
     of 'decreasing' to select the 'fsNum' best-scoring features.
     Thanks to Stephen Henderson of University College London for
     this functionality.
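
     The interplay of feature scores, 'decreasing', and 'fsNum' can
     be illustrated in base R. The helpers below ('absTScore',
     'pValueScore', 'topFeatures') are hypothetical and are not part
     of MLInterfaces; the selection rule shown (ordering the scores
     and keeping the first 'fsNum') is an assumption about how the
     scores might be used, not the package's code. An equal-variance
     two-sample t test is used so that the p-value is a monotone
     function of the absolute statistic.

```r
# x: numeric matrix, rows are samples, columns are features
# labels: class labels with exactly two levels, one per sample
scoreByTTest <- function(x, labels, what) {
  labels <- factor(labels)
  lev <- levels(labels)
  apply(x, 2, function(col) {
    tt <- t.test(col[labels == lev[1]], col[labels == lev[2]],
                 var.equal = TRUE)
    if (what == "stat") abs(unname(tt$statistic)) else tt$p.value
  })
}
absTScore   <- function(x, labels) scoreByTTest(x, labels, "stat")  # high is good
pValueScore <- function(x, labels) scoreByTTest(x, labels, "p")     # low is good

# keep the 'fsNum' best features given the direction of the score
topFeatures <- function(scores, fsNum, decreasing = TRUE) {
  order(scores, decreasing = decreasing)[seq_len(fsNum)]
}

set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20)            # 20 samples, 6 features
labels <- rep(c("A", "B"), each = 10)
x[labels == "B", 3] <- x[labels == "B", 3] + 3   # make feature 3 informative

byT <- topFeatures(absTScore(x, labels),   fsNum = 2, decreasing = TRUE)
byP <- topFeatures(pValueScore(x, labels), fsNum = 2, decreasing = FALSE)
```

     Both calls should pick the same features: a score for which
     large values are good is paired with 'decreasing=TRUE', and a
     score for which small values are good with 'decreasing=FALSE'.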

_E_x_a_m_p_l_e_s:

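     library(MLInterfaces)
     library(golubEsets)
     data(golubMerge)
     # keep a small subset of genes so the examples run quickly
     smallG <- golubMerge[200:250,]
     # leave-one-out; 'group' is only a placeholder here
     lk1 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", group=as.integer(0))
     table(lk1,smallG$ALL.AML)
     # leave-group-out: 8 groups of 9 samples each
     lk2 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOG", group=as.integer(
      rep(1:8,each=9)))
     table(lk2,smallG$ALL.AML)
     # print the definition of balKfold, an example 'indFun' generator
     balKfold
     # user-supplied partition function: 5 iterations of balKfold(5)
     lk3 <- xval(smallG, "ALL.AML", knnB, xvalMethod="FUN", 0:0, indFun=balKfold(5), niter=5)
     table(lk3, smallG$ALL.AML)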
     #
     # illustrate the xval FUN method in comparison to LOO
     #
     LOO2 <- xval(smallG, "ALL.AML", knnB, "FUN", 0:0, function(x,y,i) {
       (1:ncol(exprs(x)))[-i] }, niter=72 )
     table(lk1, LOO2)
     #
     # use Stephen Henderson's feature selection extensions
     #
     t.fun <- function(data, fac)
     {
             require(genefilter)
             # deal with the integer storage of golubTrain@exprs!
             xd <- matrix(as.double(exprs(data)), nrow=nrow(exprs(data)))
             # absolute t statistic per gene: high scores mark good features
             return(abs(rowttests(xd, data[[fac]], tstatOnly=FALSE)$statistic))
     }
     lk3f <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", 0:0, fsFun=t.fun)
     table(lk3f$out, smallG$ALL.AML)

