import                 package:MANOR                 R Documentation

_I_m_p_o_r_t _r_a_w _f_i_l_e _t_o _a_n _a_r_r_a_y_C_G_H _o_b_j_e_c_t

_D_e_s_c_r_i_p_t_i_o_n:

     Load raw data from a text file coming from image analysis and
     convert it to an 'arrayCGH' object, using additional information
     about the array design.

     Supported file types are GenePix Results files (.gpr), output
     files from SPOT, and any text file with appropriate fields "Row"
     and "Column".

_U_s_a_g_e:

       import(file, var.names=NULL, spot.names=NULL, clone.names=NULL,
              type=c("default", "gpr", "spot"), id.rep=1, design=NULL,
              add.lines=FALSE, ...)

_A_r_g_u_m_e_n_t_s:

    file: a connection or character string giving the name of the file
          to import.

var.names: a vector of variable names used to compute the array
          design. If the default is not overridden, it is set to
          c("Block", "Column", "Row", "X", "Y") for gpr files,
          c("Arr.colx", "Arr.rowy", "Spot.colx", "Spot.rowy") for SPOT
          files, and c("Col", "Row") for other text files.

spot.names: a list with spot-level variable names to be added to
          'arrayCGH$arrayValues'

clone.names: a list with clone-level variable names to be added to
          'arrayCGH$cloneValues' (only used in case of within-slide
          replicates)

    type: a character value specifying the type of input file:
          currently .gpr files ("gpr"), spot files ("spot") and other
          text files with fields 'Col' and 'Row' ("default") are
          supported

  id.rep: index of the replicate identifier (e.g. the name of the
          clone) in the vector 'clone.names'

  design: a numeric vector of length 4 specifying the array design as
          number of blocks per row, number of blocks per column, number
          of rows per block, number of columns per block. This argument
          is optional for "gpr" files and "default" text files, and is
          not used for "spot" files

add.lines: logical value to handle the case when the array design does
          not match the number of lines in the file. If TRUE, empty
          lines are added; if FALSE, execution is stopped

     ...: additional import parameters (e.g. 'sep=' or
          'comment.char=') to be passed to the read.delim function.
          Note that argument 'as.is=TRUE' is always passed to
          read.delim, in order to avoid inappropriate conversion of
          character vectors to factors
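
     The interplay of 'design' and 'add.lines' described above can be
     sketched as follows. This is only an illustration on toy data,
     not the code used inside 'import': the helper 'pad.lines', the
     toy data frame and the choice of NA-filled rows as "empty lines"
     are all assumptions made for the example.

```r
# Hypothetical design: 2 blocks per row, 2 blocks per column,
# 3 rows per block, 4 columns per block
design <- c(2, 2, 3, 4)

# Toy data with two lines missing (46 instead of prod(design) = 48)
dat <- data.frame(Row = rep(1:3, length.out = 46),
                  Col = rep(1:4, length.out = 46),
                  LogRatio = seq(-1, 1, length.out = 46))

# Hypothetical helper (not part of MANOR): compare the number of lines
# with the number of spots implied by the design, then pad or stop
pad.lines <- function(dat, design, add.lines = FALSE) {
  n.missing <- prod(design) - nrow(dat)
  if (n.missing == 0) return(dat)
  if (n.missing < 0 || !add.lines)
    stop("array design does not match number of lines")
  # indexing rows with NA yields rows filled with NA ("empty lines")
  filler <- dat[rep(NA_integer_, n.missing), , drop = FALSE]
  out <- rbind(dat, filler)
  rownames(out) <- NULL
  out
}

padded <- pad.lines(dat, design, add.lines = TRUE)
```

     With 'add.lines = FALSE' the same call stops with an error, which
     mirrors the behaviour documented for 'import'.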

_D_e_t_a_i_l_s:

     Mandatory elements of 'arrayCGH' objects are the array design and
     the x and y _absolute coordinates_ of each spot on the array.
     Output files from SPOT contain x and y relative coordinates of
     each spot within a block, as well as block coordinates on the
     array; one can therefore easily construct the corresponding
     'arrayCGH' object.
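
     As a rough illustration of this construction, the sketch below
     derives absolute spot coordinates from block and within-block
     coordinates. It assumes that 'Arr.colx' and 'Arr.rowy' hold the
     1-based block coordinates and 'Spot.colx' and 'Spot.rowy' the
     1-based coordinates within a block; the toy data and the output
     names 'Col' and 'Row' are made up for the example and this is not
     necessarily how 'import' proceeds internally.

```r
# Toy SPOT-like layout: 2 x 2 blocks, each with 2 rows and 3 columns
spots <- expand.grid(Spot.colx = 1:3, Spot.rowy = 1:2,
                     Arr.colx = 1:2, Arr.rowy = 1:2)

# Block dimensions, read off the within-block coordinates
n.col <- max(spots$Spot.colx)
n.row <- max(spots$Spot.rowy)

# Absolute coordinate = offset of the block + position inside the block
spots$Col <- (spots$Arr.colx - 1) * n.col + spots$Spot.colx
spots$Row <- (spots$Arr.rowy - 1) * n.row + spots$Spot.rowy
```

     Each spot then has a unique (Col, Row) pair covering the whole
     array.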

     .gpr files currently contain only the x and y relative coordinates
     of each spot within a block, and a block index with no
     specification of the spatial block design. If the block design is
     not specified by the user, it is computed from the real pixel
     locations of each spot ('X' and 'Y' variables in usual .gpr
     files).
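
     One way such a computation could work is sketched below on toy
     data. This is an assumption-laden illustration, not the algorithm
     implemented in MANOR: it groups blocks by the pixel position of
     their centres, and the helper 'count.groups', the tolerance of
     half a block extent and the simulated pixel values are all
     hypothetical.

```r
# Toy gpr-like data: 6 blocks laid out as 2 rows of 3 blocks,
# each block with 3 rows and 4 columns of spots
block.x0 <- c(0, 1000, 2000, 0, 1000, 2000)   # block origins (pixels)
block.y0 <- c(0, 0, 0, 1000, 1000, 1000)
gpr <- do.call(rbind, lapply(1:6, function(b) {
  g <- expand.grid(Column = 1:4, Row = 1:3)
  data.frame(Block = b, g,
             X = block.x0[b] + 100 * g$Column,
             Y = block.y0[b] + 100 * g$Row)
}))

# Pixel centre and extent of each block
cx <- as.numeric(tapply(gpr$X, gpr$Block, mean))
cy <- as.numeric(tapply(gpr$Y, gpr$Block, mean))
width  <- max(tapply(gpr$X, gpr$Block, function(x) diff(range(x))))
height <- max(tapply(gpr$Y, gpr$Block, function(y) diff(range(y))))

# Hypothetical helper: count groups of centres separated by more than tol
count.groups <- function(centres, tol) 1 + sum(diff(sort(centres)) > tol)

design <- c(count.groups(cx, width / 2),    # blocks per row
            count.groups(cy, height / 2),   # blocks per column
            max(gpr$Row),                   # rows per block
            max(gpr$Column))                # columns per block
```

     A useful sanity check on any inferred design is that
     'prod(design)' equals the number of spots in the file.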

     If clone.names is provided, an additional data frame is created
     with clone-level information (e.g. clone names, positions, 
     chromosomes, quality marks), aggregated from array-level
     information using the identifier specified by id.rep. This
     identifier is also added to the 'arrayCGH' object created, with
     name 'id.rep'.
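
     The aggregation from spot level to clone level can be pictured
     with the following sketch. The clone identifiers, the data and
     the extra replicate count 'n.rep' are invented for illustration,
     and the code only approximates what is described above; it does
     not call or reproduce MANOR internals.

```r
# Toy spot-level data with within-slide duplicates of each clone
spots <- data.frame(
  Clone      = c("cloneA", "cloneB", "cloneA", "cloneC", "cloneB", "cloneC"),
  Chromosome = c(1, 1, 1, 2, 1, 2),
  Position   = c(100, 250, 100, 40, 250, 40),
  LogRatio   = c(0.10, -0.20, 0.12, 0.55, -0.18, 0.60),
  stringsAsFactors = FALSE)

clone.names <- c("Clone", "Chromosome", "Position")
id.rep <- 1
id <- clone.names[id.rep]     # name of the replicate identifier

# Clone-level table: one row per distinct identifier
cloneValues <- spots[!duplicated(spots[[id]]), clone.names]
rownames(cloneValues) <- NULL

# Illustrative extra column: number of replicate spots per clone
cloneValues$n.rep <- as.integer(table(spots[[id]])[cloneValues[[id]]])
```

     Here 'cloneValues' has three rows, one per clone, while 'spots'
     keeps all six spot-level measurements.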

     Due to space limitations, only the first 100 lines of sample 'gpr'
     and 'spot' files are given in the standard distribution of
     'MANOR'. Complete files are available at <URL:
     http://bioinfo.curie.fr/projects/manor/index.html>

_V_a_l_u_e:

     an object of class 'arrayCGH'

_N_o_t_e:

     People interested in tools for array-CGH analysis can visit our
     web-page: <URL: http://bioinfo.curie.fr>.

_A_u_t_h_o_r(_s):

     Pierre Neuvial, manor@curie.fr.

_S_e_e _A_l_s_o:

     'arrayCGH'

_E_x_a_m_p_l_e_s:

     dir.in <- system.file("data", package="MANOR")

     ## import from 'spot' files
     spot.names <- c("LogRatio", "RefFore", "RefBack", "DapiFore",
                     "DapiBack", "SpotFlag", "ScaledLogRatio")
     clone.names <- c("PosOrder", "Chromosome")
     edge <- import(file.path(dir.in, "edge.txt"), type="spot",
                    spot.names=spot.names, clone.names=clone.names,
                    add.lines=TRUE)

     ## import from 'gpr' files
     spot.names <- c("Clone", "FLAG", "TEST_B_MEAN", "REF_B_MEAN",
                     "TEST_F_MEAN", "REF_F_MEAN", "ChromosomeArm")
     clone.names <- c("Clone", "Chromosome", "Position", "Validation")

     ## extra arguments (sep, comment.char) are passed on to read.delim
     ac <- import(file.path(dir.in, "gradient.gpr"), type="gpr",
                  spot.names=spot.names, clone.names=clone.names,
                  sep="\t", comment.char="@", add.lines=TRUE)

