robustPca         package:pcaMethods         R Documentation(latin1)

_P_C_A _i_m_p_l_e_m_e_n_t_a_t_i_o_n _b_a_s_e_d _o_n _r_o_b_u_s_t_S_v_d

_D_e_s_c_r_i_p_t_i_o_n:

     This is a PCA implementation that is robust to outliers in a data
     set. It can also handle missing values, but it is NOT intended to
     be used for missing value estimation. Because it is based on
     'robustSvd()', the loadings are estimated accurately even for
     incomplete data or for data with outliers. The returned scores
     are, however, affected by the outliers, because they are
     calculated as the matrix product of the input data and the
     loadings. For the same reason, the returned R2/R2cum values should
     be interpreted with caution. If the data contain missing values,
     the scores are calculated after simply setting all NA values to
     zero, which is not expected to produce accurate results. See also
     the manual page for 'robustSvd'.

     This method should therefore mainly be seen as an attempt to
     integrate 'robustSvd()' into the framework of this package. Use
     one of the other methods that come with this package (such as PPCA
     or BPCA) if you want to do missing value estimation.

     It is not recommended to use this function directly; use the
     'pca()' wrapper function instead.

_U_s_a_g_e:

       robustPca(Matrix, nPcs = 2, center = TRUE, completeObs = FALSE, verbose = interactive(), ... )

_A_r_g_u_m_e_n_t_s:

  Matrix: 'matrix' - Data containing the variables in columns and
          observations in rows. The data may contain missing values,
          denoted as 'NA'.

    nPcs: 'numeric' - Number of components to estimate. The accuracy
          of the missing value estimation depends on the number of
          components, which should reflect the internal structure of
          the data.

  center: 'boolean' - Mean center the data if TRUE.

completeObs: 'boolean' - Return the complete observations if TRUE.
          This is the original data with NA values filled in with the
          estimated values. Please note that robustPca was NOT designed
          for missing value estimation. Use one of the other PCA
          methods, e.g. BPCA, for missing value estimation!

 verbose: 'boolean' - Print some output to the command line if TRUE.

     ...: Reserved for future use. Currently no further parameters are
          used.

_D_e_t_a_i_l_s:

     The method is very similar to the standard 'prcomp()' function.
     The main difference is that 'robustSvd()' is used instead of the
     conventional 'svd()' method.
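
     The following base R sketch illustrates the prcomp-like steps and
     the way scores are obtained from zero-filled data. It is only an
     illustration under stated assumptions, not the package's code:
     the toy matrix 'X' is simulated, and the conventional 'svd()' is
     used as a stand-in where robustPca would call 'robustSvd()'.

```r
## Sketch only: simulated toy data, svd() as a stand-in for robustSvd().
set.seed(1)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)  # observations in rows
X[sample(length(X), 5)] <- NA                    # introduce a few NAs
nPcs <- 2

## Mean center each column; scale() ignores NAs when computing the means.
Xc <- scale(X, center = TRUE, scale = FALSE)

## Zero-fill the missing values, as described for the score calculation.
X0 <- Xc
X0[is.na(X0)] <- 0

## Loadings are the leading right singular vectors. svd() cannot handle
## NA, so this sketch decomposes the zero-filled matrix instead.
loadings <- svd(X0)$v[, seq_len(nPcs), drop = FALSE]

## Scores are the (zero-filled) input data multiplied by the loadings.
scores <- X0 %*% loadings
```

     Because the scores are a plain matrix product, any outlier or
     zero-filled cell in a row feeds directly into that row's scores,
     even when the loadings themselves were estimated robustly.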

_V_a_l_u_e:

  pcaRes: Standard PCA result object used by all PCA-based methods of
          this package. Contains scores, loadings, data mean and more.
          See 'pcaRes' for details.

_A_u_t_h_o_r(_s):

     Wolfram Stacklies 
      CAS-MPG Partner Institute for Computational Biology, Shanghai,
     China. 
      wolfram.stacklies@gmail.com 

_S_e_e _A_l_s_o:

     'robustSvd', 'svd', 'prcomp', 'pcaRes'.

_E_x_a_m_p_l_e_s:

     ## Load a complete sample metabolite data set and mean center the data
     data(metaboliteDataComplete)
     mdc <- scale(metaboliteDataComplete, center=TRUE, scale=FALSE)
     ## Now replace about 5% of the values with outliers.
     cond   <- runif(length(mdc)) < 0.05
     mdcOut <- mdc
     mdcOut[cond] <- 10

     ## Now we run a conventional PCA and robustPca on the original data and
     ## on the data with outliers.
     ## We use center=FALSE here because the large artificial outliers would
     ## shift the means and prevent an objective comparison of the results.
     resSvd    <- pca(mdc, method = "svd", nPcs = 10, center = FALSE)
     resSvdOut <- pca(mdcOut, method = "svd", nPcs = 10, center = FALSE)
     resRobPca <- pca(mdcOut, method = "robustPca", nPcs = 10, center = FALSE)

     ## Now we plot the results for the original data against those for the
     ## data with outliers. We can see that robustPca is hardly affected by
     ## the outliers.
     plot(resSvd@loadings[,1], resSvdOut@loadings[,1])
     plot(resSvd@loadings[,1], resRobPca@loadings[,1])

