RSadvance              package:RankProd              R Documentation

_A_d_v_a_n_c_e_d _R_a_n_k _S_u_m _A_n_a_l_y_s_i_s _o_f _M_i_c_r_o_a_r_r_a_y

_D_e_s_c_r_i_p_t_i_o_n:

     Advance rank sum method to identify differentially  expressed
     genes. It is possible to combine data from  different studies,
     e.g. data sets generated at different  laboratories.

_A_r_g_u_m_e_n_t_s:

    data: the data set that should be analyzed. Every  row of this data
          set must correspond to a gene.

      cl: a vector containing the class labels of the  samples. In the
          two class unpaired case, the label  of a sample is either 0
          (e.g., control group) or 1  (e.g., case group). For one group
          data, the label for  each sample should be 1.

  origin: a vector containing the origin labels of the  sample. e.g.
          for  the data sets generated at multiple laboratories, the
          label is the same for samples within one lab and different
          for samples  from different labs. 

num.perm: number of permutations used in the calculation  of the null
          density. Default is 'B=100'.

  logged: if "TRUE", data has bee logged, otherwise set  it to "FALSE"

   na.rm: if 'FALSE' (default), the NA value will not be used in
          computing rank. If 'TRUE', the missing  values will be
          replaced by the genewise mean of the non-missing values. Gene
          will all value missing  will be assigned "NA"

gene.names: if "NULL", no gene name will be attached  to the estimated
          percentage of false prediction (pfp). 

    plot: If "TRUE", plot the estimated pfp verse the rank  of each
          gene

    rand: if specified, the random number generator  will be put in a 
          reproducible state.

_V_a_l_u_e:

     A result of identifying differentially expressed  genes between
     two classes. The identification consists of two parts, the
     identification of  up-regulated  and down-regulated genes in class
     2 compared to class 1, respectively. 

     pfp: estimated percentage of false positive predictions (pfp) up
          to  the position of each gene under two  identificaiton each

     RSs: Origina rank-sum (average rank) of each genes

  RSrank: rank of the rank sum of each gene in ascending order

 Orirank: original ranks in each comparison, which is  used to compute
          rank sum

   AveFC: fold change of average expression under class 1 over  that
          under class 2, if multiple origin, than avraged  across all
          origin. log-fold change if data is in log scaled,  original
          fold change if data is unlogged. 

_N_o_t_e:

     Percentage of false prediction (pfp), in theory, is  equivalent of
     false  discovery rate (FDR), and it is  possible to be large than
     1.

     The function looks for up- and down- regulated genes in two
     seperate steps, thus two pfps are computed and used to identify 
     gene that belong to each group. 

     The function is able to deal with single or multiple-orgin 
     studies. It is similar to  funcion RP.advance expect a rank sum is
     computed instead of rank product. This method  is more sensitive
     to individual rank values, while rank  product is more robust to 
     outliers (refer RankProd vignette for details)

_A_u_t_h_o_r(_s):

     Fangxin Hong fhong@salk.edu

_S_e_e _A_l_s_o:

     'topGene'   'RP'   'plotRP'  'RPadvance'

_E_x_a_m_p_l_e_s:

           
           #Suppose we want to check the consistence of the data 
           #sets generated in two different 
           #labs. For example, we would look for genes that were \
           # measured to be up-regulated in 
           #class 2 at lab 1, but down-regulated in class 2 at lab 2.\
            data(arab)
           arab.cl2 <- arab.cl

           arab.cl2[arab.cl==0 &arab.origin==2] <- 1

           arab.cl2[arab.cl==1 &arab.origin==2] <- 0

           arab.cl2
       ##[1] 0 0 0 1 1 1 1 1 0 0

           #look for genes differentially expressed
           #between hypothetical class 1 and 2
           arab.sub=arab[1:500,] ##using subset for fast computation
           arab.gnames.sub=arab.gnames[1:500]
           Rsum.adv.out <- RSadvance(arab.sub,arab.cl2,arab.origin,
                               num.perm=100,
     logged=TRUE,
                               gene.names=arab.gnames.sub,rand=123)

           attributes(Rsum.adv.out)
           

