glad                  package:GLAD                  R Documentation

_A_n_a_l_y_s_i_s _o_f _a_r_r_a_y _C_G_H _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     This function detects breakpoints in genomic profiles obtained by
     array CGH technology and assigns a status (gain, normal or loss)
     to each BAC.

_U_s_a_g_e:

     glad.profileCGH(profileCGH, smoothfunc="aws", base=FALSE, sigma,
                        bandwidth=10, round=2, lambdabreak=8, lambdacluster=8,
                        lambdaclusterGen=40, type="tricubic",
                        param=c(d=6), alpha=0.001, method="centroid", nmax=8,
                        verbose=FALSE, ...)

_A_r_g_u_m_e_n_t_s:

profileCGH: Object of class 'profileCGH'

smoothfunc: Type of algorithm used to smooth 'LogRatio' by a piecewise
          constant function. Choose either 'aws' or 'laws'.

    base: If 'TRUE', the position of each BAC is its physical position
          on the chromosome; otherwise the rank position is used.

   sigma: Value to be passed to either the 'sigma2' argument of the
          'aws' function or the 'shape' argument of the 'laws'
          function. If 'NULL', sigma is calculated from the data.

bandwidth: Sets the maximal bandwidth 'hmax' in the 'aws' or 'laws'
          function. For example, if 'bandwidth=10' then the 'hmax'
          value is set to 10*X_N, where X_N is the position of the
          last BAC.

   round: The smoothing results of either 'aws' or 'laws' function are
          rounded or not depending on the 'round' argument. The 'round'
          value is passed to the argument 'digits' of the 'round'
          function.

lambdabreak: Penalty term (lambda') used during the "Optimization of
          the number of breakpoints" step.

lambdacluster: Penalty term (lambda*) used during the "MSHR clustering
          by chromosome" step.

lambdaclusterGen: Penalty term (lambda*) used during the "HCSR
          clustering throughout the genome" step.

    type: Type of kernel function used in the penalty term during the
          "Optimization of the number of breakpoints" step, the "MSHR
          clustering by chromosome" step and the "HCSR clustering
          throughout the genome" step.

   param: Parameter of kernel used in the penalty term.

   alpha: Risk alpha used for the "Outlier detection" step.

  method: The agglomeration method to be used during the "MSHR
          clustering by chromosome" and the "HCSR clustering throughout
          the genome" clustering steps.

    nmax: Maximum number of clusters (N*max) allowed during the "MSHR
          clustering by chromosome" and the "HCSR clustering
          throughout the genome" clustering steps.

 verbose: If 'TRUE', some information is printed.

     ...: Parameters to be passed to the 'chrBreakpoints' function.
          Typically, you will have to specify the following arguments:
          'lkern="exponential", model="Gaussian", qlambda=0.999'.
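
     As a rough illustration of the 'bandwidth' and 'round' arguments,
     the sketch below reproduces the arithmetic described above in
     base R. The vectors 'pos' and 'smoothed' are made-up values used
     only for illustration; they are not part of the package. Only the
     rule 'hmax = bandwidth * X_N' and the use of the 'digits'
     argument of 'round' come from the descriptions above.

```r
# Hypothetical rank positions of 200 BACs (rank positions, as with base=FALSE).
pos <- seq_len(200)

# bandwidth=10 sets hmax to 10 * X_N, where X_N is the position of the last BAC.
bandwidth <- 10
hmax <- bandwidth * pos[length(pos)]

# Hypothetical smoothed values; the 'round' argument is passed as 'digits'.
smoothed <- c(0.1234, -0.5678, 0.0049)
smoothed_rounded <- round(smoothed, digits = 2)

print(hmax)
print(smoothed_rounded)
```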

_D_e_t_a_i_l_s:

     The function 'glad' implements the methodology described in the
     article: Analysis of array CGH data: from signal ratio to gain
     and loss of DNA regions (Hupé et al., 2004, submitted).

     First, 'chrBreakpoints' detects breakpoints and 'detectOutliers'
     detects MAD outliers. Then, the number of breakpoints is
     optimized with 'removeBreakpoints'. The two-step clustering
     ("MSHR clustering by chromosome" and "HCSR clustering throughout
     the genome") is performed with 'findCluster'. Finally, the
     function 'affectationGNL' assigns a status to each BAC.
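
     To give an idea of what flagging MAD outliers at risk 'alpha'
     can look like, here is a minimal sketch in base R. It is not the
     code of 'detectOutliers': the helper 'flag_mad_outliers', the
     Gaussian quantile rule and the 'logratio' values are assumptions
     made for illustration. Values far below the median are flagged
     -1, values far above are flagged 1, and all others 0.

```r
# Sketch only: flag MAD outliers within one region.
# This is NOT the implementation used by 'detectOutliers'.
flag_mad_outliers <- function(x, alpha = 0.001) {
  centre <- median(x)
  spread <- mad(x)                       # scaled median absolute deviation
  z <- qnorm(1 - alpha / 2)              # two-sided Gaussian quantile for risk alpha
  flags <- integer(length(x))            # 0 = not an outlier
  flags[x < centre - z * spread] <- -1L  # lower alpha/2 tail
  flags[x > centre + z * spread] <- 1L   # upper alpha/2 tail
  flags
}

# Hypothetical log-ratios for one region, with two extreme values.
logratio <- c(0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 1.50, -1.20, 0.02, -0.03)
print(flag_mad_outliers(logratio))
```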

_V_a_l_u_e:

Smoothing: Smoothing results of either 'aws' or 'laws' function after
          being rounded or not depending on the 'round' argument.

Breakpoints: The last position of a region with an identical amount of
          DNA is flagged by 1; all other positions are flagged by 0.
          Note that during the "Optimization of the number of
          breakpoints" step, removed breakpoints are flagged by -1.

  Region: All positions between two breakpoints are labelled with the
          same integer value, starting from one. The label is
          incremented by one when a new breakpoint occurs or when
          moving to the next chromosome. The variable 'Region' is what
          we call MSHR.

   Level: All positions with an equal smoothing value are labelled
          with the same integer value, starting from one. The label
          is incremented by one when a new level occurs or when moving
          to the next chromosome.

OutliersAws: Each AWS outlier is flagged by -1 (if it is in the
          alpha/2 lower tail of the distribution) or 1 (if it is in the
          alpha/2 upper tail of the distribution); otherwise it is 0.

OutliersMad: Each MAD outlier is flagged by -1 (if it is in the
          alpha/2 lower tail of the distribution) or 1 (if it is in the
          alpha/2 upper tail of the distribution); otherwise it is 0.

OutliersTot: OutliersAws + OutliersMad.

 ZoneChr: Clusters identified after MSHR (i.e. 'Region') clustering by
          chromosome.

 ZoneGen: Clusters identified after HCSR clustering throughout the
          genome.

 ZoneGNL: Status of each BAC: gain is coded by 1, loss by -1 and
          normal by 0.
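
     The relation between the breakpoint flags, the region labels and
     the combined outlier flag can be sketched in base R as follows.
     The vectors 'chromosome', 'breakpoints', 'outliers_aws' and
     'outliers_mad' are made-up values for illustration, and the code
     only mimics the labelling rules stated above; it is not taken
     from the package.

```r
# Hypothetical per-BAC columns, ordered along the genome.
chromosome  <- c(1, 1, 1, 1, 1, 2, 2, 2, 2)
breakpoints <- c(0, 0, 1, 0, 0, 0, 1, 0, 0)  # 1 marks the last BAC of a region

# A new region starts just after a breakpoint or at the first BAC
# of the next chromosome; the label is then incremented by one.
n <- length(breakpoints)
after_breakpoint <- c(FALSE, breakpoints[-n] == 1)
new_chromosome   <- c(FALSE, chromosome[-1] != chromosome[-n])
region <- 1 + cumsum(after_breakpoint | new_chromosome)

# The combined outlier flag is the sum of the two individual flags.
outliers_aws <- c(0, 0, 0, 1, 0, 0, 0, 0, -1)
outliers_mad <- c(0, 0, 0, 1, 0, -1, 0, 0, 0)
outliers_tot <- outliers_aws + outliers_mad

print(region)
print(outliers_tot)
```

     Note that under this rule a BAC flagged by both methods gets a
     combined value of 2 or -2, so testing 'OutliersTot != 0' is the
     safe way to select all outliers.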

_A_u_t_h_o_r(_s):

     Philippe Hupé, Philippe.Hupe@curie.fr.

_S_e_e _A_l_s_o:

     'chrBreakpoints', 'removeBreakpoints', 'detectOutliers',
     'findCluster', 'affectationGNL'.

_E_x_a_m_p_l_e_s:

     data(snijders)
     profileCGH <- list(profileValues=gm13330)
     class(profileCGH) <- "profileCGH"

     res <- glad(profileCGH, smoothfunc="laws", base=FALSE,
                    bandwidth=10, round=2, lambdabreak=8, lambdacluster=8,
                    lambdaclusterGen=40, alpha=0.001, method="centroid",
                    nmax=8, lkern="exponential", model="Gaussian",
                    qlambda=0.999)

     # color code for region status

     col <- rep("yellow",length(res$profileValues$PosOrder))
     col[which(res$profileValues$ZoneGNL==-1)] <- "green"
     col[which(res$profileValues$ZoneGNL==1)] <- "red"

     # outliers

     outliers <- rep(20,length(res$profileValues$PosOrder))
     outliers[which(res$profileValues$OutliersTot!=0)] <- 13

     plot(LogRatio ~ PosOrder, data=res$profileValues, col=col, pch=outliers)

     # Limit between chromosomes

     LimitChr <- unique(res$profileValues$LimitChr)+0.5
     abline(v=LimitChr, col="grey", lty=2)

     lines(res$profileValues$Smoothing ~ res$profileValues$PosOrder, col="black")

     # Breakpoints identified

     indexBP <- which(res$profileValues$Breakpoints==1)
     BP <- res$profileValues$PosOrder[indexBP]+0.5
     abline(v=BP, col="red", lty=2)

