readDesign               package:HELP               R Documentation

_R_e_a_d _N_i_m_b_l_e_G_e_n _d_e_s_i_g_n _f_i_l_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Function to extract array design information from corresponding
     files in the Nimblegen .ndf and .ngd formats.

_U_s_a_g_e:

     readDesign(x, y, z, ...)

_A_r_g_u_m_e_n_t_s:

       x: path to the Nimblegen design file (.ndf). Each line of the
          file is interpreted as a single spot on the array design. If
          it does not contain an absolute path, the file name is
          relative to the current working directory, 'getwd()'.
          Tilde-expansion is performed where supported. Alternatively,
          'x' can be a readable connection (which will be opened for
          reading if necessary, and if so closed at the end of the
          function call). 'x' can also be a complete URL.

       y: path to the Nimblegen gene descriptions file (.ngd). Each
          line of the file is interpreted as a single locus. If it does
          not contain an absolute path, the file name is relative to
          the current working directory, 'getwd()'. Tilde-expansion is
          performed where supported. Alternatively, 'y' can be a
          readable connection (which will be opened for reading if
          necessary, and if so closed at the end of the function call).
          'y' can also be a complete URL.

       z: object in which to store design information from files. Can
          be an 'ExpressionSet', in which case design information will
          be stored in 'featureData'.  

     ...: Arguments to be passed to methods (see 'readDesign-methods'):

          '_p_a_t_h' a character vector containing a single full path name
               to which filenames will be appended. If 'NULL',
               filenames ('x' and 'y') are treated as is. 

          '_c_o_m_m_e_n_t._c_h_a_r' character: a character vector of length one
               containing a single character or an empty string
               (default is '"#"'). Use '""' to turn off the
               interpretation of comments altogether. 

          '_s_e_p' the field separator character (default is '"\t"').
               Values on each line of the file are separated by this
               character. If 'sep = ""' the separator is "white space",
               that is one or more spaces, tabs, newlines or carriage
               returns. 

          '_q_u_o_t_e' the set of quoting characters (default is '"\""'). To
               disable quoting altogether, use 'quote = ""'. See 'scan'
               for the behavior on quotes embedded in quotes. Quoting
               is only considered for columns read as character, which
               is all of them unless 'colClasses' is specified. 

          '_e_S_e_t' 'ExpressionSet' input (default is
               'new("ExpressionSet")') in which to store design
               information in 'featureData' 

          '...' other arguments to be passed to 'read.table'. See
               'read.table'. 
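
     The parsing options above mirror those of 'read.table'. As a
     minimal sketch in base R only, the following shows the effect of
     'comment.char', 'sep' and 'quote' on a tab-delimited file that
     starts with a comment line. The two-probe fragment and the
     temporary file are made up for illustration, and this is not
     necessarily how 'readDesign' calls 'read.table' internally.

```r
# Hypothetical two-probe fragment in an .ndf-like layout (illustration only).
tmp <- tempfile(fileext = ".ndf")
cat("#COMMENT\nSEQ_ID\tX\tY\tPROBE_ID\n",
    "7\t1\t1\tP1\n",
    "7\t2\t1\tP2\n",
    file = tmp, sep = "")

# With comment.char = "#", the "#COMMENT" line is skipped, so the
# second line of the file is taken as the header.
ndf <- read.table(tmp, header = TRUE, sep = "\t", quote = "\"",
                  comment.char = "#", colClasses = "character")
print(ndf)

# With comment.char = "", comment interpretation is off and the
# "#COMMENT" line is read as an ordinary (short) row of data.
raw <- read.table(tmp, header = FALSE, sep = "\t", quote = "\"",
                  comment.char = "", fill = TRUE,
                  colClasses = "character")
print(raw)

unlink(tmp)
```

     Reading every column as character, as done here with
     'colClasses', keeps identifiers such as 'SEQ_ID' from being
     converted to numbers.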

_V_a_l_u_e:

     Returns an 'ExpressionSet' filled with 'featureData' containing
     the following 'featureColumns': 

'SEQ_ID': a vector of characters with container IDs, linking each probe
          to a parent identifier

'PROBE_ID': a vector of characters containing unique ID information for
          each probe

     'X': vector of numerical data determining x-coordinates of probe
          location on chip

     'Y': vector of numerical data determining y-coordinates of probe
          location on chip

  'TYPE': a vector of characters defining the type of probe, e.g.
          random background signals ('"RAND"') or usable data
          ('"DATA"').

   'CHR': a matrix of characters containing unique ID and chromosomal
          positions for each container

 'START': a matrix of characters containing unique ID and chromosomal
          positions for each container

  'STOP': a matrix of characters containing unique ID and chromosomal
          positions for each container

  'SIZE': a matrix of characters containing unique ID and chromosomal
          positions for each container

'SEQUENCE': a vector of characters containing sequence information for
          each probe

  'WELL': a vector of characters containing multiplex well location for
          each probe (if present in design files)
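
     To illustrate how the probe-level columns (from the .ndf file)
     relate to the locus-level columns (from the .ngd file) through
     'SEQ_ID', the sketch below joins two small tables in base R. The
     data values and the use of 'match' are assumptions made for
     illustration; the package's actual implementation may differ.

```r
# Made-up probe-level records, as they might appear in an .ndf file.
ndf <- data.frame(SEQ_ID   = c("A", "A", "B"),
                  PROBE_ID = c("P1", "P2", "P3"),
                  X        = c(1, 2, 3),
                  Y        = c(1, 1, 1),
                  stringsAsFactors = FALSE)

# Made-up locus-level records, as they might appear in an .ngd file.
ngd <- data.frame(SEQ_ID = c("A", "B"),
                  CHR    = c("chr1", "chr1"),
                  START  = c(200, 400),
                  STOP   = c(399, 599),
                  stringsAsFactors = FALSE)

# Look up each probe's parent container by SEQ_ID and attach the
# container's chromosomal position to the probe record.
idx <- match(ndf$SEQ_ID, ngd$SEQ_ID)
design <- cbind(ndf, ngd[idx, c("CHR", "START", "STOP")])
rownames(design) <- NULL
print(design)
```

     With a real 'ExpressionSet' returned by 'readDesign', the
     corresponding columns are reached through
     'pData(featureData(x))', as in the Examples section.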

_A_u_t_h_o_r(_s):

     Reid F. Thompson (rthompso@aecom.yu.edu)

_S_e_e _A_l_s_o:

     'readDesign-methods', 'read.table'

_E_x_a_m_p_l_e_s:

     #demo(pipeline, package="HELP")

     # Simulate 500 records on chr1, each spanning 200 bp
     chr <- rep("chr1", 500)
     start <- (1:500)*200
     stop <- start+199
     x <- 1:500
     # Assign each record to one of 50 container IDs at random
     seqids <- sample(1:50, size=500, replace=TRUE)
     # Write a mock .ngd file: comment line, header, then data rows
     cat("#COMMENT\nSEQ_ID\tCHROMOSOME\tSTART\tSTOP\n", file="./read.design.test.ngd")
     table.ngd <- cbind(seqids, chr, start, stop)
     write.table(table.ngd, file="./read.design.test.ngd", append=TRUE, col.names=FALSE, row.names=FALSE, quote=FALSE, sep="\t")
     # Write a mock .ndf file in the same way
     cat("#COMMENT\nSEQ_ID\tX\tY\tPROBE_ID\tCONTAINER\tPROBE_SEQUENCE\tPROBE_DESIGN_ID\n", file="./read.design.test.ndf")
     sequence <- rep("NNNNNNNN", 500)
     table.ndf <- cbind(seqids, x, x, x, x, sequence, x)
     write.table(table.ndf, file="./read.design.test.ndf", append=TRUE, col.names=FALSE, row.names=FALSE, quote=FALSE, sep="\t")
     # Read both files; 'x' now holds the returned ExpressionSet
     x <- readDesign("./read.design.test.ndf", "./read.design.test.ngd")
     # Compare the SEQ_IDs written with those read back
     seqids[1:10]
     pData(featureData(x))$"SEQ_ID"[1:10]

     #rm(table.ngd, table.ndf, chr, start, stop, x, seqids, sequence)
     #file.remove("./read.design.test.ngd")
     #file.remove("./read.design.test.ndf")

