mergeSAGE             package:SAGElyzer             R Documentation

_F_u_n_c_t_i_o_n_s _t_o _m_e_r_g_e _S_A_G_E _l_i_b_r_a_r_i_e_s _b_a_s_e_d _o_n _u_n_i_q_u_e _S_A_G_E _t_a_g_s

_D_e_s_c_r_i_p_t_i_o_n:

     These functions merge individual SAGE libraries based on unique
     SAGE tags and write the merged data into a file and a table in a
     database with the unique SAGE tags as one column and counts from
     all the libraries as the others.

_U_s_a_g_e:

     mergeSAGE(libNames, isDir = TRUE,  skip = 1, pattern = ".sage")
     getLibInfo <- function(fileNames

_A_r_g_u_m_e_n_t_s:

libNames: 'libNames' - a vector of character strings for the names of
          the SAGE libraries to be merged. 'libNames' can also be the
          name of the directory containing the SAGE libraries to be
          merged.

   isDir: 'isDir' - a boolean that is TRUE if 'libNames' is the name of
          the directory that contains the SAGE libraries to be merged.

    skip: 'skip' - an integer for the number of lines to be skipped
          when the libraries are merged.

 pattern: 'pattern' - a character string for the pattern used to
          select the SAGE data files from the directory when
          'libNames' is a directory. Only files that match the
          pattern will be merged.
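
     As an illustration of how 'pattern' can select files from a
     directory, the sketch below uses base R's 'list.files'. Whether
     'mergeSAGE' calls 'list.files' internally is an assumption, and
     the directory and file names are hypothetical. Note that
     'list.files' treats the pattern as a regular expression, so
     ".sage" matches any character followed by "sage"; "\\.sage$" is a
     stricter alternative.

```r
# Hypothetical demo directory holding two SAGE files and one unrelated file.
demoDir <- file.path(tempdir(), "sage_pattern_demo")
dir.create(demoDir, showWarnings = FALSE)
file.create(file.path(demoDir, c("lib1.sage", "lib2.sage", "notes.txt")))

# Keep only the files whose names match the pattern (a regular expression).
sageFiles <- list.files(demoDir, pattern = ".sage", full.names = TRUE)

# A stricter pattern that anchors the literal extension at the end of the name.
strictFiles <- list.files(demoDir, pattern = "\\.sage$", full.names = TRUE)

basename(sageFiles)
```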

_D_e_t_a_i_l_s:

     Each SAGE library typically contains two columns, the first
     holding SAGE tags and the second their counts. 'mergeSAGE'
     merges library files based on the tags. A tag that is missing
     from a given library but present in others is assigned a count
     of 0 for that library.
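
     The merge rule described above can be sketched in base R as a
     full outer join on the tag column followed by zero filling. This
     is only an illustration of the idea, not the code used by
     'mergeSAGE'; the data frames and column names are hypothetical.

```r
# Two toy libraries: one column of tags and one column of counts each.
lib1 <- data.frame(tag = paste0("tag", 1:10), lib1 = 1:10,
                   stringsAsFactors = FALSE)
lib2 <- data.frame(tag = paste0("tag", 5:9), lib2 = 15:19,
                   stringsAsFactors = FALSE)

# A full outer join on the tag column keeps every unique tag.
merged <- merge(lib1, lib2, by = "tag", all = TRUE)

# Tags absent from a library come back as NA; replace them with 0.
merged[is.na(merged)] <- 0

merged
```

     Here tags 1 to 4 and tag 10 occur only in the first library, so
     their counts in the 'lib2' column are 0.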

     'mergeSAGE' generates two files. One contains the merged data.
     The other contains four columns: the first holds the column
     names of the database table that stores the SAGE counts, the
     second the original SAGE library names, the third the
     normalization factor used to normalize counts based on the
     library with the smallest number of tags, and the fourth the
     normalization factor based on the library with the largest
     number of tags.
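
     The four-column layout can be pictured with the sketch below. The
     exact formula used by the package is not stated here, so the
     sketch assumes that a factor is the size of the reference library
     divided by the size of each library. The totals, table column
     names, and variable names are all hypothetical.

```r
# Hypothetical library sizes (illustrative values only).
totals <- c(lib1.sage = 55, lib2.sage = 85)

# Assumed definition: scale each library to the size of a reference library.
factorMin <- min(totals) / totals   # reference is the smallest library
factorMax <- max(totals) / totals   # reference is the largest library

# Four columns: table column name, library name, "min" factor, "max" factor.
info <- data.frame(column = paste0("lib", seq_along(totals)),
                   library = names(totals),
                   min = unname(factorMin),
                   max = unname(factorMax),
                   stringsAsFactors = FALSE)

info
```

     Under this assumption the reference library always receives a
     factor of 1, and multiplying a library's counts by its factor
     brings its size to that of the reference library.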

     'getLibInfo' creates the file that contains the information about
     the data file.

     'calNormFact' calculates the normalization factor.

_V_a_l_u_e:

     'mergeSAGE' returns a list containing two file names:

    data: a character string for the name of the file containing the
          merged data

    info: a character string for the name of the file containing
          information about the merged data


     'getLibInfo' returns a matrix with four columns.

_N_o_t_e:

     The functions are part of the Bioconductor project at Dana-Farber
     Cancer Institute to provide bioinformatics functionality through
     R.

_A_u_t_h_o_r(_s):

     Jianhua Zhang

_R_e_f_e_r_e_n_c_e_s:

     <URL: http://www.ncbi.nlm.nih.gov/geo>

_S_e_e _A_l_s_o:

     'SAGELyzer'

_E_x_a_m_p_l_e_s:

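     path <- tempdir()
     # Create two small libraries (tag, count) that share tags 5 to 9
     lib1 <- cbind(paste("tag", 1:10, sep = ""), 1:10)
     lib2 <- cbind(paste("tag", 5:9, sep = ""), 15:19)
     # Write them as tab-delimited files without header lines
     write.table(lib1, file = file.path(path, "lib1.sage"), sep = "\t",
                 row.names = FALSE, col.names = FALSE)
     write.table(lib2, file = file.path(path, "lib2.sage"), sep = "\t",
                 row.names = FALSE, col.names = FALSE)
     # Gather the per-library information that calNormFact takes as input
     libNNum <- getLibNNum(c(file.path(path, "lib1.sage"),
                             file.path(path, "lib2.sage")))
     # Calculate the normalization factors using the "min" option
     normFact <- calNormFact("min", libNNum)
     # Collect the unique tags; skip = 0 because the files have no header
     uniqTag <- getUniqTags(c(file.path(path, "lib1.sage"),
                              file.path(path, "lib2.sage")), skip = 0)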

