XStringAlign-class        package:Biostrings        R Documentation

_X_S_t_r_i_n_g_A_l_i_g_n _o_b_j_e_c_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     The 'XStringAlign' class is a container for storing an alignment
     between 2 XString objects of the same subtype.

_D_e_t_a_i_l_s:

     Before we define the notion of alignment, we introduce the notion
     of "filled-with-gaps supersequence". A "filled-with-gaps
     supersequence" of a string s1 is a string S1 that is obtained by
     inserting 0 or any number of gaps in s1. For example L-A-ND is a
     "filled-with-gaps supersequence" of LAND. An alignment between 2
     strings s1 and s2 is made of 2 strings (align1 and align2) that
     are "filled-with-gaps supersequences" of s1 and s2, and that have
     the same length. Note that this common length must be greater than
     or equal to the lengths of s1 and s2: nchar(align1) =
     nchar(align2) >= max(nchar(s1), nchar(s2))

     For example, this is an alignment between LAND and LEAVES:


         L-A--ND
         LEAVES-
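
     The two conditions above can be checked in plain R. This is only
     a minimal sketch on ordinary character strings:
     'is_gapped_supersequence' is a hypothetical helper introduced
     here for illustration, not a Biostrings function, and it assumes
     that gaps are written as "-".

```r
# Hypothetical helper (not part of Biostrings): TRUE if 'aligned' can be
# obtained from 's' by inserting zero or more gap characters.
is_gapped_supersequence <- function(aligned, s, gap = "-") {
  # Removing every gap from the aligned string must give back 's'.
  identical(gsub(gap, "", aligned, fixed = TRUE), s)
}

s1 <- "LAND"
s2 <- "LEAVES"
align1 <- "L-A--ND"
align2 <- "LEAVES-"

# Condition 1: each aligned string is a gapped version of its original.
ok_sequences <- is_gapped_supersequence(align1, s1) &&
  is_gapped_supersequence(align2, s2)

# Condition 2: both aligned strings have the same length, which is at
# least the length of the longer original string.
ok_length <- nchar(align1) == nchar(align2) &&
  nchar(align1) >= max(nchar(s1), nchar(s2))

ok_sequences && ok_length
```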

     An alignment can be seen as a compact representation of one set of
     basic operations that transforms s1 into s2. There are 3 different
     kinds of basic operations: "insertions" (gaps in align1),
     "deletions" (gaps in align2), and "replacements". The above
     alignment represents the following basic operations:


         insert E at pos 2
         insert V at pos 4
         insert E at pos 5
         replace by S at pos 6 (N is replaced by S)
         delete at pos 7 (D is deleted)

     Note that "insert X at pos i" means that all letters at a position
     >= i are moved 1 place to the right before X is actually inserted.
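
     The list of basic operations can be read off the two aligned
     strings column by column. The sketch below does this in plain R;
     'alignment_ops' is a hypothetical helper, not a Biostrings
     function. It reports alignment column numbers as positions, which
     agrees with the listing above for this example, where the only
     deletion is in the last column.

```r
# Hypothetical helper (not part of Biostrings): list the basic operations
# encoded by two aligned strings of equal length.
alignment_ops <- function(align1, align2, gap = "-") {
  a <- strsplit(align1, "")[[1]]
  b <- strsplit(align2, "")[[1]]
  stopifnot(length(a) == length(b))
  ops <- character(0)
  for (i in seq_along(a)) {
    if (a[i] == gap && b[i] != gap) {
      # Gap in align1: a letter of s2 is inserted.
      ops <- c(ops, sprintf("insert %s at pos %d", b[i], i))
    } else if (b[i] == gap && a[i] != gap) {
      # Gap in align2: a letter of s1 is deleted.
      ops <- c(ops, sprintf("delete at pos %d (%s is deleted)", i, a[i]))
    } else if (a[i] != b[i]) {
      # Two different letters: replacement.
      ops <- c(ops, sprintf("replace by %s at pos %d (%s is replaced by %s)",
                            b[i], i, a[i], b[i]))
    }
    # Identical letters need no operation.
  }
  ops
}

ops <- alignment_ops("L-A--ND", "LEAVES-")
ops
```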

     There are many possible alignments between 2 given strings s1 and
     s2, and a common problem is to find the one (or ones) with the
     highest score, i.e. with the lowest total cost in terms of basic
     operations.
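
     To illustrate how two alignments of the same strings can be
     compared, the sketch below scores them under a deliberately
     simple scheme. Both the 'simple_score' helper and its default
     values (+1 per match, -1 per replacement, -2 per gap column) are
     assumptions made for this illustration; they are not the
     substitution matrix or gap penalties used by 'pairwiseAlignment'.

```r
# Hypothetical scoring scheme (an assumption for illustration only):
# +1 per match, -1 per replacement, -2 per column containing a gap.
simple_score <- function(align1, align2, match = 1, mismatch = -1,
                         gap_score = -2, gap = "-") {
  a <- strsplit(align1, "")[[1]]
  b <- strsplit(align2, "")[[1]]
  stopifnot(length(a) == length(b))
  is_gap <- a == gap | b == gap
  is_match <- !is_gap & a == b
  sum(ifelse(is_gap, gap_score, ifelse(is_match, match, mismatch)))
}

# Two different alignments between LAND and LEAVES.
score_a <- simple_score("L-A--ND", "LEAVES-")
score_b <- simple_score("LAND--", "LEAVES")

# The alignment with the higher score is preferred under this scheme.
c(score_a = score_a, score_b = score_b)
```

     Under these assumed values the second alignment scores higher
     than the first. A different choice of values can change the
     ranking, which is why the scoring parameters matter.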

_A_c_c_e_s_o_r _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is an 'XStringAlign' object.


      'align1(x)' and 'align2(x)': The "filled-with-gaps
          supersequences" of the original strings to align. Note that
          'align1(x)' and 'align2(x)' are XString objects of the same
          subtype and length.

      'type(x)': The type of the alignment ('"global"', '"local"', or
          '"overlap"').

      'score(x)': The score of the alignment (integer).

      'length(x)' or 'nchar(x)': The length of the alignment i.e. the
          common length of 'align1(x)' and 'align2(x)'.

      'alphabet(x)': Equivalent to 'alphabet(align1(x))' (or
          'alphabet(align2(x))').

      'as.character(x)': Converts 'x' to a named character vector of
          length 2.


_A_u_t_h_o_r(_s):

     H. Pages

_S_e_e _A_l_s_o:

     'pairwiseAlignment', XString-class

_E_x_a_m_p_l_e_s:

       library(Biostrings)

       s1 <- AAString("LAND")
       s2 <- AAString("LEAVES")

       ## Alignment with nonzero gap penalties
       nw1 <- pairwiseAlignment(s1, s2, substitutionMatrix = "BLOSUM50",
                                gapOpening = -3, gapExtension = -1)
       nw1
       length(nw1)

       ## Alignment with no gap penalties
       nw0 <- pairwiseAlignment(s1, s2, substitutionMatrix = "BLOSUM50",
                                gapOpening = 0, gapExtension = 0)
       nw0
       length(nw0)
       ## Low gap penalties tend to produce longer alignments!

       as.character(nw0)

