vsn2                   package:vsn                   R Documentation

_F_i_t _t_h_e _v_s_n _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     'vsn2' fits the vsn model to the data matrix in an 'ExpressionSet'
     and returns a 'vsn' object with the fit parameters and the
     transformed data matrix. The data matrix typically contains
     feature intensity readings from a microarray. There are also
     'vsn2' methods for numeric matrices and vectors. 'predict' applies
     a fitted model to data and returns an 'ExpressionSet' object.
     'justvsn' is a simple wrapper that takes and returns an
     'ExpressionSet'. These are the main functions of this package. An
     overview is given in the vignette _Introduction to vsn_.

_U_s_a_g_e:

     ## S4 method for signature 'ExpressionSet':
     vsn2(x, reference, strata, ...)

     vsnMatrix(x,
       reference,
       strata,
       lts.quantile = 0.9,
       subsample    = 0L,
       verbose      = interactive(),
       returnData   = TRUE,
       pstart,
       cvg.niter    = 5L,
       cvg.eps      = 5e-3)

     ## S4 method for signature 'matrix':
     vsn2(x, reference, strata, ...)
     ## S4 method for signature 'numeric':
     vsn2(x, reference, strata, ...)

_A_r_g_u_m_e_n_t_s:

       x: An object containing the data to which the model is to be
          fitted. Methods exist for 'ExpressionSet', 'matrix' and
          'numeric'.

reference: Optional, a 'vsn' object from a previous fit. If this
          argument is specified, the data in 'x' are normalized
          "towards" an existing set of reference arrays whose
          parameters are stored in the object 'reference'. If this
          argument is not specified, then the data in 'x' are
          normalized "among themselves". See Details for a more precise
          explanation.

  strata: Optional, a factor whose length is 'nrow(x)'. Can be used for
          stratified normalization (i.e. separate offsets 'a' and
          factors 'b' for each level of 'strata'). If missing, all rows
          of 'x' are assumed to come from one stratum.

lts.quantile: Numeric of length 1. The quantile that is used for the
          resistant least trimmed sum of squares regression. Allowed
          values are between 0.5 and 1. A value of 1 corresponds to
          ordinary least sum of squares regression.

subsample: Integer of length 1. If greater than 0, the model parameters
          are estimated from a subsample of the data of size
          'subsample' only, yet the fitted transformation is then
          applied to all data. For large datasets, this can
          substantially reduce the CPU time and memory consumption at
          a negligible loss of precision.

 verbose: Logical. If TRUE, some messages are printed.

returnData: Logical. If TRUE, the transformed data are returned in a
          slot of the resulting 'vsn' object. Setting this to 'FALSE'
          saves memory if the transformed data are not needed.

  pstart: Optional, array. Can be used to specify start values for the
          iterative parameter estimation algorithm. See 'vsn2trsf' for
          a description of the layout of the array.

cvg.niter: Integer. The number of iterations to be used in the least
          trimmed sum of squares regression.

 cvg.eps: Numeric. A convergence threshold.

     ...: Arguments that get passed on to 'vsnMatrix'.
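
     As an illustration of how the arguments above are passed, the
     following sketch calls 'vsn2' on the 'kidney' data used in the
     Examples section. The subsample size of 2000 is an arbitrary value
     chosen for this sketch, not a recommendation, and the other
     settings simply restate the documented defaults.

```r
library("vsn")
data("kidney")

# Spell out some of the documented defaults and switch off messages.
fit <- vsn2(kidney,
            lts.quantile = 0.9,
            returnData   = TRUE,
            verbose      = FALSE)

# Estimate the parameters from a subsample of rows only; the fitted
# transformation is still applied to all rows. 2000L is an arbitrary
# value picked for illustration.
fitSub <- vsn2(kidney, subsample = 2000L, verbose = FALSE)
```

     Both calls return an object of class 'vsn'. Arguments other than
     'x', 'reference' and 'strata' reach 'vsnMatrix' through '...'.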

_D_e_t_a_i_l_s:

     If the 'reference' argument is _not_ specified, then the model
     parameters $\mu_k$ and $\sigma$ are fit from the data in 'x'. This
     is the mode of operation described in the 2002 Bioinformatics
     paper, and it was the only option in versions 1.X of this
     package. If 'reference' is specified, the model parameters $\mu_k$
     and $\sigma$ are taken from it. This allows for 'incremental'
     normalization. See the vignette _Likelihood Calculations for vsn_.
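
     The two modes of operation can be sketched as follows. This
     example assumes that the 'lymphoma' example dataset shipped with
     the package is available and has 16 columns; the split into 14
     reference columns and 2 new columns is an arbitrary choice made
     for illustration.

```r
library("vsn")
data("lymphoma")   # assumed: example ExpressionSet with 16 columns

# Without 'reference': parameters are estimated from the data themselves.
ref <- vsn2(lymphoma[, 1:14], verbose = FALSE)

# With 'reference': the remaining columns are normalized towards the
# existing fit instead of among themselves.
fitNew <- vsn2(lymphoma[, 15:16], reference = ref, verbose = FALSE)
```

     The object 'fitNew' is again of class 'vsn' and holds the
     transformed data for the two added columns.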

_V_a_l_u_e:

     An object of class 'vsn'. The transformed data are on a glog scale
     to base 2. More precisely, the data are transformed by
     asinh(a+bx)/log(2)+c, where
     $\mbox{asinh}(x)/\log(2)=\log_2(x+\sqrt{x^2+1})$ is also called the
     'glog', and c is an overall constant offset that is computed such
     that for large x the transformation approximately corresponds to
     the $\log_2$ function. The offset c is inconsequential for all
     differential expression calculations, but many users like to see
     the data in a range that they are familiar with.
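
     The glog identity and the large-x behaviour can be checked
     numerically in base R. In the sketch below, the values of 'a' and
     'b' are hypothetical and 'c0' is one offset that satisfies the
     large-x condition for a single pair (a, b); it is an assumption of
     this example and not necessarily the value that 'vsn2' computes.

```r
# glog to base 2, as written in the Value section.
glog2 <- function(x) asinh(x) / log(2)

x <- c(0, 1, 10, 1e3, 1e6)

# The identity asinh(x)/log(2) = log2(x + sqrt(x^2 + 1)) holds numerically.
stopifnot(isTRUE(all.equal(glog2(x), log2(x + sqrt(x^2 + 1)))))

# For large x, asinh(x) is close to log(2x), so the difference
# glog2(x) - log2(x) is close to 1.
print(glog2(c(1e3, 1e6)) - log2(c(1e3, 1e6)))

# Hypothetical parameters for a single array (assumed, not fitted).
a <- 0.5
b <- 0.01

# An offset that cancels the large-x limit log2(2 * b * x); an assumption
# made for this sketch only.
c0 <- -log2(2 * b)

h <- function(x) glog2(a + b * x) + c0

# Compare the transformed values with log2 over a range of intensities.
xs <- c(1e2, 1e4, 1e6)
print(cbind(x = xs, h = h(xs), log2 = log2(xs)))
```

     For small intensities 'h' departs from log2 and stays finite at
     x = 0, while for large intensities the two columns agree closely.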

_A_u_t_h_o_r(_s):

     Wolfgang Huber <URL: http://www.ebi.ac.uk/huber>

_S_e_e _A_l_s_o:

     'justvsn', 'predict'

_E_x_a_m_p_l_e_s:

     data("kidney")

     fit = vsn2(kidney)                   ## fit
     nkid = predict(fit, newdata=kidney)  ## apply fit

     plot(exprs(nkid), pch=".")
     abline(a=0, b=1, col="red")

