XStringSet-class         package:Biostrings         R Documentation

_B_S_t_r_i_n_g_S_e_t, _D_N_A_S_t_r_i_n_g_S_e_t, _R_N_A_S_t_r_i_n_g_S_e_t _a_n_d _A_A_S_t_r_i_n_g_S_e_t _o_b_j_e_c_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     The BStringSet class is a container for storing a set of 'BString'
     objects and for making their manipulation easy and efficient.

     Similarly, the DNAStringSet (or RNAStringSet, or AAStringSet)
     class is a container for storing a set of 'DNAString' (or
     'RNAString', or 'AAString') objects.

     All those containers derive directly (and with no additional
     slots) from the XStringSet virtual class. They are also said to be
     XStringSet subtypes.
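
     As a minimal sketch of this class relationship (assuming the
     Biostrings package is installed; the variable names 'dna' and 'aa'
     are illustrative only), each constructor returns an object of its
     own subtype that is also an XStringSet:

```r
library(Biostrings)

dna <- DNAStringSet(c("ACGT", "TTGA"))
aa  <- AAStringSet(c("MKV", "LLP"))

class(dna)             # the concrete subtype
is(dna, "XStringSet")  # the subtype extends the virtual class
is(aa, "XStringSet")
```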

_U_s_a_g_e:

       ## Constructors:
       BStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
       DNAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
       RNAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)
       AAStringSet(x, start=NA, end=NA, width=NA, use.names=TRUE)

_A_r_g_u_m_e_n_t_s:

       x: Either a character vector, or an XString, XStringSet or
          XStringViews object. 

   start: Either 'NA', a single integer, or an integer vector of the
          same length as 'x' specifying how 'x' should be "narrowed"
          (see '?narrow' for the details). 

     end: Either 'NA', a single integer, or an integer vector of the
          same length as 'x' specifying how 'x' should be "narrowed"
          (see '?narrow' for the details). 

   width: Either 'NA', a single integer, or an integer vector of the
          same length as 'x' specifying how 'x' should be "narrowed"
          (see '?narrow' for the details). 

use.names: 'TRUE' or 'FALSE'. Should names be preserved? 

_D_e_t_a_i_l_s:

     The 'BStringSet', 'DNAStringSet', 'RNAStringSet' and 'AAStringSet'
     functions are constructors that can be used to "naturally" turn
     'x' into an XStringSet object of the desired subtype.

     They also allow the user to "narrow" the sequences contained in
     'x' via proper use of the 'start', 'end' and/or 'width' arguments.
     In this context, "narrowing" means dropping unwanted parts of 'x'
     located at the beginning (prefix) or end (suffix) of each sequence
     in 'x'.

     The 'narrow' function is a generic function (defined in the
     IRanges package) with a method for narrowing IRanges objects.
     Because an XStringSet object is a particular kind of IRanges
     object (the XStringSet class is a subclass of the IRanges class),
     an XStringSet object 'y' can be narrowed with 'narrow(y)'.
     Therefore the following two expressions are equivalent:

      'DNAStringSet(x, start=s, end=e, width=w)'

      'narrow(DNAStringSet(x), start=s, end=e, width=w)'


     but, besides being more convenient, the former is also more memory
     efficient on character vectors and would work even if the dropped
     parts contained letters that are not in the DNA alphabet (see
     '?DNA_ALPHABET').
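
     The equivalence above can be sketched as follows. This is an
     illustration only: the input vectors are made up, and the last
     step assumes that constructing a DNAStringSet from a string
     containing '#' raises an error, since '#' is not a DNA letter.

```r
library(Biostrings)

x <- c("ACGTAC", "TTGACA")

## Narrow while constructing: drop the first and the last letter.
a <- DNAStringSet(x, start=2, end=-2)

## Construct first, then narrow: same result for valid DNA input.
b <- narrow(DNAStringSet(x), start=2, end=-2)

as.character(a)
identical(as.character(a), as.character(b))

## When the dropped prefix holds an invalid letter, only the first
## form is expected to succeed.
x0 <- c("#TTGA", "#-CTC-N")
ok <- DNAStringSet(x0, start=2)
failed <- inherits(try(DNAStringSet(x0), silent=TRUE), "try-error")
failed
```

     A negative 'end' is counted from the end of each sequence, so
     'end=-2' drops exactly one trailing letter.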

_A_c_c_e_s_o_r _m_e_t_h_o_d_s:

     The XStringSet class derives from the IRanges class, hence all the
     accessor methods defined for an IRanges object can also be used on
     an XStringSet object. In particular, the following methods are
     available (in the code snippets below, 'x' is an XStringSet
     object):


      'length(x)': The number of sequences in 'x'.

      'width(x)': A vector of non-negative integers containing the
          number of letters for each element in 'x'.

      'nchar(x)': The same as 'width(x)'.

      'names(x)': 'NULL' or a character vector of the same length as
          'x' containing a short user-provided description or comment
          for each element in 'x'. These are the only data in an
          XStringSet object that can safely be changed by the user. All
          the other data are immutable! As a general recommendation,
          the user should never try to modify an object by accessing
          its slots directly.
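
     A short sketch of these accessors on a small, made-up set (the
     names "a" and "b" are arbitrary and are taken from the named
     character vector passed to the constructor):

```r
library(Biostrings)

x <- DNAStringSet(c(a="ACGT", b="TTGACA"))

length(x)   # number of sequences
width(x)    # number of letters in each sequence
nchar(x)    # same values as width(x)
names(x)

## Names are the only data meant to be changed by the user.
names(x)[2] <- "seqB"
names(x)

## use.names=FALSE drops the names at construction time.
y <- DNAStringSet(c(a="ACGT", b="TTGACA"), use.names=FALSE)
names(y)
```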


_S_u_b_s_e_t_t_i_n_g _a_n_d _a_p_p_e_n_d_i_n_g:

     In the code snippets below, 'x' and 'values' are XStringSet
     objects, and 'i' should be an index specifying the elements to
     extract.


      'x[i]': Return a new XStringSet object made of the selected
          elements.

      'x[[i]]': Extract the i-th 'XString' object from 'x'.

      'append(x, values, after=length(x))': Add sequences in 'values'
          to 'x'.
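
     The three operations above might be used as in the following
     sketch (made-up sequences; 'x' and 'values' are both DNAStringSet
     objects here so that they can be combined):

```r
library(Biostrings)

x      <- DNAStringSet(c("ACGT", "TTGACA", "CCAT"))
values <- DNAStringSet(c("GGG", "TATA"))

sub <- x[c(1, 3)]   # still a DNAStringSet, with 2 elements
one <- x[[2]]       # a single DNAString, not a set

## By default the new sequences are added after the last element.
all5 <- append(x, values)

class(sub)
class(one)
length(all5)
```

     Note the difference between 'x[2]' (a set of length 1) and
     'x[[2]]' (the sequence itself).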


_O_t_h_e_r _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is an XStringSet object.


      'as.character(x, use.names)': Convert 'x' to a character vector
          of the same length as 'x'. 'use.names' controls whether or
          not 'names(x)' should be used to set the names of the
          returned vector (default is 'TRUE').

      'as.matrix(x, use.names)': Return a character matrix containing
          the "exploded" representation of the strings. This can only
          be used on an XStringSet object with equal-width strings.
          'use.names' controls whether or not 'names(x)' should be used
          to set the row names of the returned matrix (default is
          'TRUE').

      'toString(x)': Equivalent to 'toString(as.character(x))'.
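
     A minimal sketch of these conversions, using two made-up
     sequences of equal width so that 'as.matrix' applies:

```r
library(Biostrings)

x <- DNAStringSet(c(s1="ACGT", s2="TTGA"))

v  <- as.character(x)                    # named character vector
v0 <- as.character(x, use.names=FALSE)   # same strings, no names

## One row per sequence, one column per letter position.
m <- as.matrix(x)

s <- toString(x)

v
dim(m)
s
```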


_O_r_d_e_r_i_n_g _a_n_d _r_e_l_a_t_e_d _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is an XStringSet object.


      'order(x)': Return a permutation which rearranges 'x' into
          ascending or descending order.

      'sort(x)': Sort 'x' into ascending order (equivalent to
          'x[order(x)]').
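
     The relationship between 'order' and 'sort' can be checked on a
     small, made-up set:

```r
library(Biostrings)

x <- DNAStringSet(c("TTGA", "ACGT", "CCAT"))

o <- order(x)    # integer permutation of seq_along(x)
s <- sort(x)     # the sequences in ascending order

o
s

## sort(x) should hold the same sequences as x[order(x)].
identical(as.character(s), as.character(x[o]))
```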


_A_u_t_h_o_r(_s):

     H. Pages

_S_e_e _A_l_s_o:

     BString-class, DNAString-class, RNAString-class, AAString-class,
     XStringViews-class, 'narrow', 'DNA_ALPHABET'

_E_x_a_m_p_l_e_s:

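       ## The leading '#' is not a DNA letter; start=2 drops it from each
       ## element before the remaining letters are stored.
       x0 <- c("#TTGA", "#-CTC-N")
       x1 <- DNAStringSet(x0, start=2)
       x1
       names(x1)                 # NULL: 'x0' carries no names
       names(x1)[2] <- "seqB"    # names can safely be changed by the user
       x1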

       library(drosophila2probe)
       x2 <- DNAStringSet(drosophila2probe$sequence)
       x2

       RNAStringSet(x2, start=2, end=-5)  # does NOT copy the sequence data!

