readBeadSummaryData        package:beadarray        R Documentation

_R_e_a_d _B_e_a_d_S_t_u_d_i_o _g_e_n_e _e_x_p_r_e_s_s_i_o_n _o_u_t_p_u_t

_D_e_s_c_r_i_p_t_i_o_n:

     Function to read the gene expression output of Illumina's
     BeadStudio software into beadarray.

_U_s_a_g_e:

     readBeadSummaryData(dataFile, qcFile=NULL, sampleSheet=NULL,
                         sep="\t", skip=8, ProbeID="ProbeID",
                         columns = list(exprs = "AVG_Signal", se.exprs="BEAD_STDERR",
                             NoBeads = "Avg_NBEADS", Detection="Detection Pval"),
                         qc.sep="\t", qc.skip=8, controlID="ProbeID", 
                         qc.columns = list(exprs="AVG_Signal", se.exprs="BEAD_STDERR", 
                             NoBeads="Avg_NBEADS", Detection="Detection Pval"), 
                         annoPkg=NULL, dec=".", quote="")

_A_r_g_u_m_e_n_t_s:

dataFile: character string specifying the name of the file containing
          the  BeadStudio output for each probe on each array in an
          experiment (required).   Ideally this should be the
          'SampleProbeProfile' from BeadStudio.

  qcFile: character string giving the name of the file containing the 
          control probe intensities (optional).  This file should be
          either the  'ControlProbeProfile' or 'ControlGeneProfile'
          from BeadStudio.

sampleSheet: character string used to specify the file containing
          sample information (optional)

     sep: field separator character for the 'dataFile' ('"\t"' for  tab
          delimited or '","' for comma separated)

    skip: number of header lines to skip at the top of 'dataFile'.  
          Default value is 8.

 ProbeID: character string of the column in 'dataFile' that contains 
          identifiers that can be used to uniquely identify each probe

 columns: list defining the column headings in 'dataFile' which 
          correspond to the matrices stored in the 'assayData' slot of
          the final 'ExpressionSetIllumina' object

  qc.sep: field separator character for 'qcFile'

 qc.skip: number of header lines to skip at the top of 'qcFile'

controlID: character string specifying the column in 'qcFile' that
          contains  the identifiers that uniquely identify each control
          probe

qc.columns: list defining the column headings in 'qcFile' which 
          correspond to the matrices stored in the 'QCInfo' slot of the
          final 'ExpressionSetIllumina' object

 annoPkg: character string specifying the name of the annotation
          package  (only available for certain expression arrays at
          present)

     dec: the character used in the 'dataFile' and 'qcFile' for decimal
          points

   quote: the set of quoting characters (disabled by default)

_D_e_t_a_i_l_s:

     This function can be used to read gene expression data exported
     from versions 1, 2 and 3 of the Illumina BeadStudio application.
     The format of the BeadStudio output depends on the version
     number. For example, the file may be comma or tab separated, and
     may or may not have header information at the top of the file.
     The parameters 'sep' and 'skip' can be used to adapt the function
     as required (e.g. skip=7 is appropriate for data from earlier
     versions of BeadStudio, and skip=0 is required if header
     information hasn't been exported).
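
     If the number of header lines is not known in advance, it can be
     worked out from the file itself before calling
     readBeadSummaryData. The following is a minimal sketch in base R,
     not part of beadarray: the file contents are made up for
     illustration, and it assumes the column-heading row is the first
     line that starts with the probe identifier column name.

```r
# Hypothetical miniature export: three header lines, then the column
# headings, then one row per probe. Real BeadStudio headers will differ.
lines <- c("Example header line 1",
           "Example header line 2",
           "Example header line 3",
           "ProbeID\tAVG_Signal-ARRAY1\tBEAD_STDERR-ARRAY1",
           "10001\t152.3\t4.1",
           "10002\t98.7\t3.6")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Assumption: the heading row is the first line beginning with "ProbeID".
firstLines <- readLines(tmp, n = 20)
headerLine <- which(startsWith(firstLines, "ProbeID"))[1]

# Everything before the heading row has to be skipped.
skip <- headerLine - 1

# Check the value by reading the table directly with base R.
dat <- read.table(tmp, sep = "\t", skip = skip, header = TRUE,
                  quote = "", check.names = FALSE)
```

     The value of 'skip' found this way could then be passed on as
     'readBeadSummaryData(dataFile, skip=skip)'.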

     The format of the BeadStudio file is assumed to have one row for
     each probe sequence in the experiment and a set number of columns
     for each array. The columns which are exported for each array are
     chosen by the user when running BeadStudio. At a minimum,
     columns for average intensity, standard error, the number of beads
     and detection scores should be exported, along with a column
     which contains a unique identifier for each bead type (usually
     named "ProbeID").

     It is assumed that the average bead intensities for each array
     appear in  columns with headings of the form 'AVG_Signal-ARRAY1',
     'AVG_Signal-ARRAY2',...,'AVG_Signal-ARRAYN' for the N arrays found
     in the file.  All other column headings are matched in the same
     way using the character  strings specified in the 'columns'
     argument.
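
     The matching rule described above can be illustrated in base R.
     This is a sketch of the idea only, not the code used inside
     beadarray; the column headings and the "-" separator are
     assumptions taken from the example heading form given above.

```r
# Hypothetical column headings for an export containing two arrays.
headings <- c("ProbeID",
              "AVG_Signal-ARRAY1", "BEAD_STDERR-ARRAY1",
              "Avg_NBEADS-ARRAY1", "Detection Pval-ARRAY1",
              "AVG_Signal-ARRAY2", "BEAD_STDERR-ARRAY2",
              "Avg_NBEADS-ARRAY2", "Detection Pval-ARRAY2")

# Same structure as the default 'columns' argument.
columns <- list(exprs = "AVG_Signal", se.exprs = "BEAD_STDERR",
                NoBeads = "Avg_NBEADS", Detection = "Detection Pval")

# For each entry, find the positions of the headings containing its string.
# fixed = TRUE matches the string literally rather than as a regex.
matched <- lapply(columns, function(pattern) {
  grep(pattern, headings, fixed = TRUE)
})

# Array names are what remains once the prefix and separator are removed.
arrayNames <- sub("AVG_Signal-", "", headings[matched$exprs], fixed = TRUE)
```

     If an element of 'matched' is empty, the corresponding string in
     'columns' does not appear in the file and should be changed to
     match the headings that were actually exported.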

     NOTE:  With version 2 of BeadStudio it is possible to export
     annotation and sequence information along with the intensities. 
     We _don't_ recommend  exporting this information, as special
     characters found in the annotation  columns can cause problems
     when reading in the data.  This annotation information can be
     retrieved later on from other Bioconductor packages.

     The default object created by readBeadSummaryData is an
     'ExpressionSetIllumina' object.

     If the control intensities have been exported from BeadStudio
     ('ControlProbeProfile') this may be read into beadarray as well.
     The 'qc.skip', 'qc.sep' and 'qc.columns' parameters can be  used
     to adjust for the contents of the file.  If the
     'ControlGeneProfile'  is exported, you will need to set
     'controlID="TargetID"'.

     Sample sheet information can also be used. This is a file format
     used by Illumina to specify which sample has been hybridised to
     each array  in the experiment.

     Note that if the probe identifiers are non-unique, the duplicated 
     rows are removed.  This may occur if the 'SampleGeneProfile' is 
     exported from BeadStudio and/or 'ProbeID="TargetID"' is specified 
     (the "ProbeID" column has a unique identifier in the
     'SampleProbeProfile', whereas the "TargetID" may not, as multiple
     beads can target the same  transcript).
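
     Before choosing an identifier column, it may be worth checking
     how many rows would be lost. The sketch below uses base R on a
     made-up table; it assumes that the first row for each identifier
     is the one kept, which is not stated above and may differ from
     what readBeadSummaryData does.

```r
# Hypothetical summary table in which two beads share one TargetID.
dat <- data.frame(TargetID   = c("GeneA", "GeneB", "GeneA", "GeneC"),
                  ProbeID    = c(10001, 10002, 10003, 10004),
                  AVG_Signal = c(152.3, 98.7, 140.9, 77.2),
                  stringsAsFactors = FALSE)

# duplicated() flags every repeat of an identifier after its first use.
dupTarget <- duplicated(dat$TargetID)
nLost <- sum(dupTarget)

# Assumption: keep the first occurrence and drop the later repeats.
byTarget <- dat[!dupTarget, ]

# The ProbeID column has no repeats here, so no rows would be dropped.
probeIsUnique <- anyDuplicated(dat$ProbeID) == 0
```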

_V_a_l_u_e:

     An 'ExpressionSetIllumina' object.

_A_u_t_h_o_r(_s):

     Mark Dunning and Mike Smith

_S_e_e _A_l_s_o:

     'ExpressionSetIllumina'

_E_x_a_m_p_l_e_s:

     ## Code to read the example BeadStudio (version 2) output distributed with the package
     #dataFile = "SampleProbeProfile.txt"
     #sampleSheet = "SampleSheet.csv"
     #qcFile = "ControlGeneProfile.txt"
     #BSData = readBeadSummaryData(dataFile, qcFile=qcFile, sampleSheet=sampleSheet, controlID="TargetID")

