normalize          package:diffGeneAnalysis          R Documentation

_N_o_r_m_a_l_i_z_a_t_i_o_n _o_f _m_i_c_r_o_a_r_r_a_y _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     Normalization of data utilizing information obtained from
     background fluorescence. Background fluorescence intensity values
     are used to determine a Gaussian distribution of lowly expressed
     genes, yielding the background estimates (mean and standard
     deviation).

_U_s_a_g_e:

     normalize(rawdata, numSlides, ctrl, expm, ctrlbg, expmbg)

_A_r_g_u_m_e_n_t_s:

 rawdata: rawdata is a matrix of microarray data. The first column
          consists of gene names and the first row consists of headers.

numSlides: numSlides represents the total number of chips/slides in the
          microarray dataset, including control and experiment. Control
          slides are always followed by experiment slides from left to
          right in the matrix.

    ctrl: ctrl represents the total number of control chips in the
          microarray dataset.

    expm: expm represents the total number of experiment chips in the
          microarray dataset.

  ctrlbg: ctrlbg represents the percent of data to pick for background
          computation of the control chips. 30 percent is the default.

  expmbg: expmbg represents the percent of data to pick for background
          computation of the experiment chips. 30 percent is the
          default.
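
     The column layout described above can be sketched as follows. This
     is only an illustration: split_slides is a hypothetical helper and
     is not part of diffGeneAnalysis, the toy data are random, and the
     sketch assumes that ctrl + expm equals numSlides and that the gene
     names sit in column 1.

```r
# Hypothetical helper (not part of diffGeneAnalysis): split the slide
# columns into control and experiment blocks.
split_slides <- function(rawdata, numSlides, ctrl, expm) {
  # Assumption: every slide is either a control or an experiment slide.
  stopifnot(ctrl + expm == numSlides)
  # Column 1 holds gene names; slides occupy the next numSlides columns.
  values <- rawdata[, seq_len(numSlides) + 1, drop = FALSE]
  list(
    control    = values[, seq_len(ctrl), drop = FALSE],         # leftmost slides
    experiment = values[, ctrl + seq_len(expm), drop = FALSE]   # remaining slides
  )
}

# Toy data: 5 genes, 3 control slides followed by 4 experiment slides.
set.seed(1)
toy <- data.frame(gene = paste0("g", 1:5),
                  matrix(runif(35, min = 50, max = 500), nrow = 5))
parts <- split_slides(toy, numSlides = 7, ctrl = 3, expm = 4)
dim(parts$control)
dim(parts$experiment)
```

     With numSlides = 7, ctrl = 3 and expm = 4 (the values used in the
     Examples section), the control block has 3 columns and the
     experiment block has 4.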

_D_e_t_a_i_l_s:

     The normalization algorithm trims the data based on initial
     empirical estimates of the mean and standard deviation. All data
     beyond +/- 2 SD of the mean are cut, and the procedure is repeated
     until no more cuts can be made. The trimmed data are then
     subjected to a non-linear curve fitting procedure. The user is
     presented with six different pictures obtained using bars 2, 3, 4,
     4.5, 5, and 5.5 as mean, and is free to select the best visual
     estimate of background. The user-selected parameters are used to
     perform a z-transformation on the data.

     The percent of data selected to compute background depends on the
     data obtained. The default is 30 percent. A normally distributed
     histogram should confirm that choice; otherwise the user may pick
     a different percent and make changes until the histogram appears
     normally distributed.

     Upon running normalize, the user is presented with a set of 6
     histograms. If the user is not happy with the default 30 percent,
     the user should select a mean and confirm the curve fit, then
     select 'no' when asked to confirm the histogram distribution. The
     user will then be presented with a new set of 6 histograms. This
     process is repeated until the user selects the best histogram
     distribution, and it is repeated for each individual chip.
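
     The trimming and z-transformation steps can be sketched as below.
     This is a minimal illustration under stated assumptions, not the
     package's implementation: trim_background and z_transform are
     hypothetical names, the data are simulated, and the non-linear
     curve fit and the interactive histogram selection are omitted.

```r
# Hypothetical helper: iteratively drop values beyond +/- k SD of the
# mean until a pass removes nothing.
trim_background <- function(x, k = 2, max_iter = 100) {
  x <- x[is.finite(x)]
  for (i in seq_len(max_iter)) {
    if (length(x) < 3) break          # too few points to estimate an SD
    m <- mean(x)
    s <- sd(x)
    keep <- abs(x - m) <= k * s
    if (all(keep)) break              # no more cuts can be made
    x <- x[keep]
  }
  list(mean = mean(x), sd = sd(x), n = length(x))
}

# z-transformation using the background mean and standard deviation.
z_transform <- function(x, bg_mean, bg_sd) (x - bg_mean) / bg_sd

# Simulated slide: mostly background plus a smaller set of expressed genes.
set.seed(42)
background <- rnorm(850, mean = 100, sd = 10)
signal     <- rnorm(150, mean = 400, sd = 80)
slide      <- c(background, signal)

est <- trim_background(slide)
z   <- z_transform(slide, est$mean, est$sd)
est$n < length(slide)   # TRUE: trimming removed some values
```

     Repeated +/- 2 SD trimming also clips the tails of the background
     itself, so the trimmed standard deviation tends to underestimate
     the true one; a subsequent curve fit, as described above, is one
     way to refine the estimates.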

_V_a_l_u_e:

     A matrix of normalized values of rawdata.

_A_u_t_h_o_r(_s):

     Choudary L Jagarlamudi

_R_e_f_e_r_e_n_c_e_s:

     Dozmorov I, Centola M. An associative analysis of gene expression
     array data. Bioinformatics. 2003 Jan 22;19(2):204-11.

     Knowlton N, Dozmorov I, Centola M. Microarray Data Analysis
     Toolbox (MDAT): for normalization, adjustment and analysis of gene
     expression data. Bioinformatics. 2004 Dec 12;20(18):3687-90.

_E_x_a_m_p_l_e_s:

     # rawdata is loaded in the package. Run the example as follows:
     # Read the description file for best results.
     # data(rawdata)
     # normalize(rawdata, 7, 3, 4, 0.15, 0.60)

