SAGELyzer             package:SAGElyzer             R Documentation

_F_u_n_c_t_i_o_n _t_o _f_i_l_t_e_r _o_u_t _t_h_e _k _n_e_a_r_e_s_t _n_e_i_g_h_b_o_r_s _f_o_r _a _g_i_v_e_n _t_a_g

_D_e_s_c_r_i_p_t_i_o_n:

     This function finds the k nearest neighbors of a given SAGE tag
     based on the expression of SAGE tags across selected SAGE
     libraries. The calculations are based on data stored in a table in
     a database.

_U_s_a_g_e:

     SAGELyzer(dbArgs, targetSAGE, libs = "*", normalize = "min", tagColName
     = "tag", k = 500, dist = "euclidean", trans = "sqrt")
     getSAGESQL(dbArgs, conn, targetSAGE, libs, tagColName, chunk = FALSE,
     cursor = "sageRows", ignorZeros = TRUE, what = c("map", "counts",
     "info"))
     getTotalRNum(dbArgs, conn, tagColName, what = "counts")
     getKNN(dbArgs, targetSAGE, libs, tagColName, normalize, k,
                      dist, trans, max = 10000)
     noChunkKNN(dbArgs, conn, targetSAGE, libs, tagColName, normalize, k,
     dist, trans)
     chunkKNN(dbArgs, conn, targetSAGE, libs, tagColName, normalize, k, dist,
     trans, rowNum, max = 50000)
     findNeighborTags(targetRow, data, k, NF, dist, trans)
     getColNames(dbArgs, conn, what = "counts")

_A_r_g_u_m_e_n_t_s:

  dbArgs: 'dbArgs' a list containing the arguments needed to connect
          to a database and run queries against a table. The
          elements include a DSN under Windows, or a database name,
          user name, password, and host under Unix, plus the names of
          the three tables that will be used by SAGElyzer

targetSAGE: 'targetSAGE' a character string for the SAGE tag whose
          neighbors will be sought

    libs: 'libs' a vector of character strings for column names of
          database table where SAGE library data are stored

normalize: 'normalize' a character string for the method used to
          perform data normalization. Must be either "min", "max", or
          "none"

tagColName: 'tagColName' a character string for the column name of a
          database table where SAGE tags are stored

       k: 'k' an integer for the number of nearest neighbors to be
          sought

    dist: 'dist' a character string corresponding to an existing R
          object for calculating distances between two data sets

   trans: 'trans' a character string corresponding to an existing R
          object that will be used to transform the data

    conn: 'conn' a connection to a database

   chunk: 'chunk' a boolean indicating whether data will be processed
          in chunks to avoid running out of space

ignorZeros: 'ignorZeros' a boolean indicating whether data rows with
          all 0s will be ignored

    what: 'what' a character string for the type of database table to
          use for getting data. Must be either "map", "counts", or
          "info"

     max: 'max' an integer for the maximum number of data rows in a
          chunk to be processed

  rowNum: 'rowNum' an integer for row number

      NF: 'NF' a vector of numerical data that will be used as
          normalization factor for SAGE counts

targetRow: 'targetRow' a vector of character strings containing data
          for the target SAGE tag

    data: 'data' a matrix containing SAGE counts across selected
          libraries

  cursor: 'cursor' a character string for the name of a cursor used to
          retrieve data in chunks from a database table
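
     The 'chunk', 'max', and 'cursor' arguments describe processing
     rows in bounded batches. The sketch below illustrates that idea
     only: it is not the code of 'chunkKNN'. The helper
     'chunkDistances', the in-memory matrix 'allRows' (standing in for
     rows fetched through a database cursor), and the plain Euclidean
     distance are all assumptions made for illustration.

```r
# Hypothetical helper: Euclidean distance from 'target' to each row of 'chunk'.
chunkDistances <- function(target, chunk) {
  sqrt(rowSums(sweep(chunk, 2, target, "-")^2))
}

# Made-up counts for 20 tags across 3 libraries (not real SAGE data).
set.seed(1)
allRows <- matrix(rpois(60, lambda = 10), nrow = 20,
                  dimnames = list(paste0("tag", 1:20),
                                  c("lib1", "lib2", "lib3")))
target <- c(lib1 = 10, lib2 = 10, lib3 = 10)
k <- 3
maxRows <- 6   # plays the role of 'max': rows handled per chunk

best <- numeric(0)
for (start in seq(1, nrow(allRows), by = maxRows)) {
  end <- min(start + maxRows - 1, nrow(allRows))
  # In a database setting this slice would be one fetch from the cursor.
  chunk <- allRows[start:end, , drop = FALSE]
  # Merge the running best k with the new chunk and keep the k smallest.
  candidates <- c(best, chunkDistances(target, chunk))
  best <- sort(candidates)[seq_len(min(k, length(candidates)))]
}
print(best)
```

     Only the current chunk and the running k best distances are held
     at once, so memory use is bounded by the chunk size rather than
     by the size of the table.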

_D_e_t_a_i_l_s:

     Two database tables (default names "sagecounts" and "sageinfo")
     have to exist (tables can be created using other functions in
     this package). One table (sagecounts) contains counts for SAGE
     tags across libraries and the other (sageinfo) contains the
     mapping between the column names used in "sagecounts" and the
     SAGE libraries whose data they store.
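
     As a rough picture of the two tables, the sketch below builds
     small data frames in R. Only the table names and the "tag" column
     come from this page; the library column names, the 'sageinfo'
     column names ("column", "library"), the values, and the helper
     'libraryFor' are hypothetical.

```r
# Hypothetical layout of the counts table: one row per SAGE tag,
# one column per SAGE library.
sagecounts <- data.frame(
  tag  = c("AAAAAAAAAA", "AAAAAAAAAC"),
  lib1 = c(10L, 12L),
  lib2 = c(20L, 18L),
  stringsAsFactors = FALSE
)

# Hypothetical layout of the info table: maps each count column
# to the library it stores data for.
sageinfo <- data.frame(
  column  = c("lib1", "lib2"),
  library = c("Hypothetical library A", "Hypothetical library B"),
  stringsAsFactors = FALSE
)

# Look up which library a given count column belongs to.
libraryFor <- function(columnName) {
  sageinfo$library[match(columnName, sageinfo$column)]
}
print(libraryFor("lib2"))
```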

     Functions in this package are normally called by interactive
     interfaces that are invoked when the package is loaded.

_V_a_l_u_e:

     'SAGELyzer' returns a named vector with SAGE tags being the names
     and the corresponding calculated distances to a given tag being
     the values.
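
     The shape of that return value can be illustrated without a
     database. The sketch below is not the package's implementation:
     the count matrix is made up, "min" normalization is assumed to
     mean scaling every library to the smallest library total, and the
     distance is computed directly as Euclidean distance.

```r
# Made-up count matrix: rows are SAGE tags, columns are libraries.
counts <- matrix(
  c(10, 20, 30,
    12, 18, 33,
     0,  5, 90,
    11, 21, 29),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("AAAAAAAAAA", "AAAAAAAAAC", "AAAAAAAAAG", "AAAAAAAAAT"),
                  c("lib1", "lib2", "lib3"))
)

# Assumed meaning of normalize = "min": scale each library to the
# smallest library total.
libTotals <- colSums(counts)
NF <- min(libTotals) / libTotals
normalized <- sweep(counts, 2, NF, "*")

# Resolve the transformation from its name, as a 'trans' string would be.
transFun <- match.fun("sqrt")
transformed <- transFun(normalized)

# Euclidean distance from the target tag to every other tag.
target <- "AAAAAAAAAA"
others <- transformed[rownames(transformed) != target, , drop = FALSE]
distances <- sqrt(rowSums(sweep(others, 2, transformed[target, ], "-")^2))

# Keep the k closest tags as a named vector, nearest first.
k <- 2
neighbors <- sort(distances)[seq_len(min(k, length(distances)))]
print(neighbors)
```

     The names of 'neighbors' are the neighboring tags and the values
     are their distances to the target tag, ordered from nearest to
     farthest.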

     'getSAGESQL' returns a character string for a SQL statement to use
     to query a database.
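
     To give a sense of what such a string might look like, the sketch
     below assembles a simple SELECT statement. The helper
     'buildCountsQuery' is hypothetical and the statement is not
     necessarily the SQL that 'getSAGESQL' produces; only the table
     name "sagecounts" and the column name "tag" are taken from this
     page.

```r
# Hypothetical helper: build a SELECT statement for a counts table.
buildCountsQuery <- function(table, tagColName, libs = "*") {
  columns <- if (identical(libs, "*")) {
    "*"                                           # all columns
  } else {
    paste(c(tagColName, libs), collapse = ", ")   # tag column plus chosen libraries
  }
  paste("SELECT", columns, "FROM", table)
}

print(buildCountsQuery("sagecounts", "tag", c("lib1", "lib2")))
print(buildCountsQuery("sagecounts", "tag"))
```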

     'getTotalRNum' returns an integer for the total number of rows in
     a database table.

_N_o_t_e:

     This function is part of the Bioconductor project at Dana-Farber
     Cancer Institute to provide bioinformatics functionality through
     R.

_A_u_t_h_o_r(_s):

     Jianhua Zhang

_S_e_e _A_l_s_o:

     'SAGE4Unix'

_E_x_a_m_p_l_e_s:

     # No example is given as the code requires data with existing tables

