06.LinearModels            package:limma            R Documentation

_L_i_n_e_a_r _M_o_d_e_l_s _f_o_r _M_i_c_r_o_a_r_r_a_y_s

_D_e_s_c_r_i_p_t_i_o_n:

     This page gives an overview of the LIMMA functions available to
     fit linear models and to interpret the results. It covers models
     for two-color arrays in terms of log-ratios and for single-channel
     arrays in terms of log-intensities. If you wish to fit models to
     the individual channel log-intensities from two-color arrays, see
     07.SingleChannel.

     The core of this package is the fitting of gene-wise linear models
     to microarray data. The basic idea is to estimate log-ratios
     between two or more target RNA samples simultaneously. See the
     LIMMA User's Guide for several case studies.

_F_i_t_t_i_n_g _M_o_d_e_l_s:

     The main function for model fitting is 'lmFit'. This is the
     recommended interface for most users. 'lmFit' produces a fitted
     model object of class 'MArrayLM' containing coefficients, standard
     errors and residual standard errors for each gene. 'lmFit' calls
     one of the following three functions to do the actual
     computations:

      '_l_m._s_e_r_i_e_s'  Straightforward least squares fitting of a linear
          model for each gene.

      '_m_r_l_m'  An alternative to 'lm.series' using robust regression as
          implemented by the 'rlm' function in the MASS package.

      '_g_l_s._s_e_r_i_e_s'  Generalized least squares taking into account
          correlations between duplicate spots (i.e., replicate spots
          on the same array) or related arrays. The function
          'duplicateCorrelation' is used to estimate the
          inter-duplicate or inter-block correlation before using
          'gls.series'.

     All the functions which fit linear models use 'unwrapdups', which
     provides a unified method for handling duplicate spots.
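
     As a minimal sketch of the fitting step, the following example
     calls 'lmFit' on a small simulated log-expression matrix. The
     data, the group labels and the object names ('y', 'group',
     'design') are assumptions made for illustration only. The
     default call uses least squares, and 'method = "robust"' requests
     robust regression instead.

```r
library(limma)

# Simulated log-expression values: 100 genes on 6 single-channel arrays
# (hypothetical data, for illustration only)
set.seed(1)
y <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)
rownames(y) <- paste0("gene", 1:100)

# Two groups of three arrays each
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Ordinary least squares fit, one linear model per gene
fit <- lmFit(y, design)

# Robust regression fit for the same data and design
fit.robust <- lmFit(y, design, method = "robust")

# The result is an 'MArrayLM' object with one row of coefficients per gene
class(fit)
dim(fit$coefficients)
```

     Supplying a 'block' or 'ndups' argument together with a
     'correlation' value causes 'lmFit' to use generalized least
     squares instead.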

_F_o_r_m_i_n_g _t_h_e _D_e_s_i_g_n _M_a_t_r_i_x:

     'lmFit' has two main arguments, the expression data and the design
     matrix. The design matrix is essentially an indicator matrix which
     specifies which target RNA samples were applied to each channel on
     each array. There is considerable freedom in choosing the design
     matrix: more than one choice is always correct, provided it is
     interpreted correctly.

     Design matrices for Affymetrix or single-color arrays can be
     created using the function 'model.matrix', which is part of the
     'stats' package distributed with R. The function 'modelMatrix' is
     provided to assist with creation of an appropriate design matrix
     for two-color microarray experiments. For direct two-color
     designs, without a common reference, the design matrix often
     needs to be created by hand.
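
     The sketch below illustrates two of these cases using base R
     only. The sample names ('WT', 'Mu', 'A', 'B', 'C') and the array
     layout are hypothetical. The first matrix is built with
     'model.matrix' for a single-channel experiment; the second is
     written by hand for a direct two-color loop design, with the
     coefficients chosen (as one of several valid options) to be the
     log-ratios B-A and C-A.

```r
# Single-channel experiment: four arrays, two wild-type and two mutant
# (hypothetical sample labels)
group <- factor(c("WT", "WT", "Mu", "Mu"), levels = c("WT", "Mu"))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
design

# Direct two-color loop design without a common reference.
# Each array measures log2(Cy5/Cy3):
#   array 1: Cy3 = A, Cy5 = B  ->  B - A
#   array 2: Cy3 = B, Cy5 = C  ->  C - B = (C - A) - (B - A)
#   array 3: Cy3 = C, Cy5 = A  ->  A - C = -(C - A)
design.loop <- cbind(
  BvsA = c(1, -1,  0),
  CvsA = c(0,  1, -1)
)
rownames(design.loop) <- paste0("array", 1:3)
design.loop
```

     Each row of 'design.loop' expresses the log-ratio expected on
     that array as a combination of the two chosen coefficients.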

_M_a_k_i_n_g _C_o_m_p_a_r_i_s_o_n_s _o_f _I_n_t_e_r_e_s_t:

     Once a linear model has been fit using an appropriate design
     matrix, the command 'makeContrasts' may be used to form a contrast
     matrix to make comparisons of interest. The fit and the contrast
     matrix are used by 'contrasts.fit' to compute fold changes and
     t-statistics for the contrasts of interest. This is a convenient
     way to compute, for example, all possible pairwise comparisons
     between treatments in an experiment which compares many
     treatments to a common reference.
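
     The following sketch shows one way to form all pairwise
     comparisons between three treatments. It assumes simulated
     single-channel data and hypothetical treatment names ('T1', 'T2',
     'T3'); the same two calls apply to a fit from a
     common-reference two-color design.

```r
library(limma)

# Simulated data: 50 genes, three treatments with three arrays each
# (hypothetical data and treatment names)
set.seed(2)
y <- matrix(rnorm(50 * 9), nrow = 50)
trt <- factor(rep(c("T1", "T2", "T3"), each = 3))

# One coefficient per treatment
design <- model.matrix(~ 0 + trt)
colnames(design) <- levels(trt)
fit <- lmFit(y, design)

# Contrast matrix holding all three pairwise comparisons
cont <- makeContrasts(T2 - T1, T3 - T1, T3 - T2, levels = design)

# Re-express the fit in terms of the contrasts
fit2 <- contrasts.fit(fit, cont)
dim(fit2$coefficients)
```

     The coefficients of 'fit2' are the estimated log fold changes for
     each contrast; 'eBayes' can then be applied to 'fit2'.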

_A_s_s_e_s_s_i_n_g _D_i_f_f_e_r_e_n_t_i_a_l _E_x_p_r_e_s_s_i_o_n:

     After fitting a linear model, the standard errors are moderated
     using a simple empirical Bayes model, via 'ebayes' or 'eBayes'.
     A moderated t-statistic and a log-odds of differential expression
     are computed for each contrast for each gene.
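
     A minimal sketch of this step, assuming simulated data in which
     the first ten genes are shifted upwards in the second group (the
     data and object names are illustrative assumptions):

```r
library(limma)

# Simulated data: 200 genes, two groups of three arrays
# (hypothetical data, for illustration only)
set.seed(3)
y <- matrix(rnorm(200 * 6), nrow = 200)
y[1:10, 4:6] <- y[1:10, 4:6] + 2   # first ten genes differ between groups

group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Fit the gene-wise models, then moderate the standard errors
fit <- eBayes(lmFit(y, design))

# Moderated t-statistics and log-odds of differential expression
head(fit$t[, "groupB"])
head(fit$lods[, "groupB"])

# Estimated prior degrees of freedom and prior variance
fit$df.prior
fit$s2.prior
```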

     'ebayes' and 'eBayes' use internal functions 'fitFDist',
     'tmixture.matrix' and 'tmixture.vector'.

     The function 'zscoreT' is sometimes used for computing z-score
     equivalents for t-statistics so as to place t-statistics with
     different degrees of freedom on the same scale. 'zscoreGamma' is
     used the same way with standard deviations instead of
     t-statistics. These functions are for research purposes rather
     than for routine use.
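
     The idea of a z-score equivalent can be sketched in base R by
     matching tail probabilities between the t and standard normal
     distributions. The helper 't_to_z' below is hypothetical and is
     not the implementation used by 'zscoreT'; it only illustrates how
     t-statistics with different degrees of freedom can be placed on a
     common scale.

```r
# Hypothetical helper, not limma's implementation: convert a t-statistic
# to the standard normal deviate with the same tail probability.
t_to_z <- function(t, df) {
  # Upper-tail probability of |t| on the log scale, for accuracy in
  # the far tails
  logp <- pt(abs(t), df = df, lower.tail = FALSE, log.p = TRUE)
  # Normal quantile with the same upper-tail probability
  z <- qnorm(logp, lower.tail = FALSE, log.p = TRUE)
  # Restore the sign of the original statistic
  sign(t) * z
}

# The same t-statistic is less extreme when the degrees of freedom are small
t_to_z(2.5, df = c(4, 40, Inf))

# The transformation is symmetric about zero
t_to_z(c(-2.5, 0, 2.5), df = 4)
```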

_S_u_m_m_a_r_i_z_i_n_g _M_o_d_e_l _F_i_t_s:

     After the above steps the results may be displayed or further
     processed using:

      '_t_o_p_t_a_b_l_e' _o_r '_t_o_p_T_a_b_l_e'  Presents a list of the genes most
          likely to be differentially expressed for a given contrast.

      '_t_o_p_T_a_b_l_e_F'  Presents a list of the genes most likely to be
          differentially expressed for a given set of contrasts.

      '_v_o_l_c_a_n_o_p_l_o_t' Volcano plot of fold change versus the B-statistic
          for any fitted coefficient.

      '_p_l_o_t_l_i_n_e_s' Plots fitted coefficients or log-intensity values for
          time-course data.

      '_w_r_i_t_e._f_i_t'  Writes an 'MarrayLM' object to a file. Note that if
          'fit' is an 'MArrayLM' object, either 'write.fit' or
          'write.table' can be used to write the results to a delimited
          text file.

     For multiple testing functions which operate on linear model fits,
     see 08.Tests.
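
     As a short sketch of the summary step, the example below ranks
     genes with 'topTable' and writes the fit to a tab-delimited file
     with 'write.fit'. The simulated data, the coefficient name
     'groupB' and the temporary output file are assumptions made for
     illustration.

```r
library(limma)

# Simulated data: 200 genes, two groups of three arrays
# (hypothetical data, for illustration only)
set.seed(4)
y <- matrix(rnorm(200 * 6), nrow = 200)
rownames(y) <- paste0("gene", 1:200)
y[1:10, 4:6] <- y[1:10, 4:6] + 3   # first ten genes differ between groups

group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)
fit <- eBayes(lmFit(y, design))

# Ten genes most likely to be differentially expressed for one coefficient
tab <- topTable(fit, coef = "groupB", number = 10)
print(tab)

# Write the full set of results to a delimited text file
out <- tempfile(fileext = ".txt")
write.fit(fit, file = out)
```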

_A_u_t_h_o_r(_s):

     Gordon Smyth

_R_e_f_e_r_e_n_c_e_s:

     Smyth, G. K. (2004). Linear models and empirical Bayes methods for
     assessing differential expression in microarray experiments.
     _Statistical Applications in Genetics and Molecular Biology_, *3*,
     No. 1, Article 3. <URL:
     http://www.bepress.com/sagmb/vol3/iss1/art3>

     Smyth, G. K., Michaud, J., and Scott, H. (2005). The use of
     within-array replicate spots for assessing differential expression
     in microarray experiments. _Bioinformatics_, *21*(9), 2067-2075.

