BiodbMassdbConn-class {biodb} | R Documentation
All mass spectra databases inherit from this class, which therefore defines methods specific to mass spectrometry.
collapseResultsDataFrame(results.df, mz.col = "mz", rt.col = "rt", sep = "|")
:
Collapses the rows of a results data frame, outputting a data frame with only one row for each MZ/RT value.
results.df: Results data frame.
mz.col: The name of the M/Z column in the results data frame.
rt.col: The name of the RT column in the results data frame.
sep: The separator used to concatenate values, when collapsing results data frame.
Returned value: A data frame with rows collapsed.
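The collapsing described above can be illustrated with a minimal base R sketch. The helper `collapse_rows()` and the sample data are hypothetical and are not part of biodb; the sketch only mirrors the documented idea of keeping one row per MZ/RT pair and joining the other values with `sep`. It assumes that missing values are simply skipped, which the documentation does not specify.

```r
# Hypothetical helper (not the biodb implementation): keep one row per
# M/Z and RT pair and concatenate the remaining columns with `sep`.
collapse_rows <- function(results.df, mz.col = "mz", rt.col = "rt", sep = "|") {
  key.cols <- intersect(c(mz.col, rt.col), colnames(results.df))
  other.cols <- setdiff(colnames(results.df), key.cols)
  # Build one grouping key per row from the M/Z (and RT) values.
  keys <- do.call(paste, c(unname(as.list(results.df[key.cols])), sep = "\t"))
  groups <- split(seq_len(nrow(results.df)), factor(keys, levels = unique(keys)))
  rows <- lapply(groups, function(idx) {
    row <- results.df[idx[1], key.cols, drop = FALSE]
    for (col in other.cols) {
      values <- results.df[[col]][idx]
      # Assumption: missing values are dropped before concatenation.
      row[[col]] <- paste(values[!is.na(values)], collapse = sep)
    }
    row
  })
  out <- do.call(rbind, unname(rows))
  rownames(out) <- NULL
  out
}

# Made-up results: two matches for the first MZ/RT pair, one for the second.
results <- data.frame(
  mz = c(80.04, 80.04, 82.05),
  rt = c(5.2, 5.2, 7.1),
  accession = c("A1", "A2", "B7"),
  stringsAsFactors = FALSE
)
collapsed <- collapse_rows(results)
print(collapsed)
```

In this sketch the two rows sharing mz 80.04 and rt 5.2 end up as a single row whose accession value is "A1|A2".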
filterEntriesOnRt(
entry.ids,
rt,
rt.unit,
rt.tol,
rt.tol.exp,
chrom.col.ids,
match.rt
)
:
Filters a list of entries on retention time values.
entry.ids: A character vector of entry IDs.
rt: A vector of retention times to match. The unit is specified by the rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes: input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
match.rt: If set to TRUE, filters on RT values, otherwise does not do any filtering.
Returned value: A character vector containing entry IDs after filtering.
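The two tolerance formulas given for rt.tol and rt.tol.exp can be combined into a single acceptance window. The following base R sketch computes that window; `rt_window()` is a hypothetical helper written for illustration, not a biodb function. It assumes that retention times given in minutes are converted to seconds before the tolerances are applied, in line with the statement that both tolerances work on seconds.

```r
# Hypothetical helper (not part of biodb): compute the accepted RT window,
# in seconds, from the inequalities documented for rt.tol and rt.tol.exp.
rt_window <- function(rt, rt.unit = c("s", "min"), rt.tol = NULL,
                      rt.tol.exp = NULL) {
  rt.unit <- match.arg(rt.unit)
  # Tolerances are expressed on seconds, so convert minutes first.
  rt.sec <- if (rt.unit == "min") rt * 60 else rt
  margin <- rep(0, length(rt.sec))
  if (!is.null(rt.tol))
    margin <- margin + rt.tol              # plain tolerance
  if (!is.null(rt.tol.exp))
    margin <- margin + rt.sec^rt.tol.exp   # exponent tolerance
  data.frame(rt.min = rt.sec - margin, rt.max = rt.sec + margin)
}

# 5 minutes = 300 s; the margin is 10 + 300^0.5 seconds.
w <- rt_window(rt = 5, rt.unit = "min", rt.tol = 10, rt.tol.exp = 0.5)
print(w)
```

A database retention time would then be accepted when it lies between rt.min and rt.max.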
getChromCol(ids = NULL)
:
Gets a list of chromatographic columns contained in this database.
ids: A character vector of entry identifiers (i.e., accession numbers). Used to restrict the set of entries on which to run the algorithm.
Returned value: A data frame with two columns: 'id' for the column ID and 'title' for the column title.
getMatchingMzField()
:
Gets the field to use for M/Z matching.
Returned value: The name of the field (one of peak.mztheo or peak.mzexp).
getMzValues(ms.mode = NULL, max.results = 0, precursor = FALSE, ms.level = 0)
:
Gets a list of M/Z values contained inside the database.
ms.mode: The MS mode. Set it to either 'neg' or 'pos' to limit the output to one mode.
max.results: If set, it is used to limit the size of the output.
precursor: If set to TRUE, then restrict the search to precursor peaks.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.
Returned value: A numeric vector containing M/Z values.
getNbPeaks(mode = NULL, ids = NULL)
:
Gets the number of peaks contained in the database.
mode: The MS mode. Set it to either 'neg' or 'pos' to limit the counting to one mode.
ids: A character vector of entry identifiers (i.e., accession numbers). Used to restrict the set of entries on which to run the algorithm.
Returned value: The number of peaks, as an integer.
msmsSearch(
spectrum,
precursor.mz,
mz.tol,
mz.tol.unit = c("plain", "ppm"),
ms.mode,
npmin = 2,
dist.fun = c("wcosine", "cosine", "pkernel", "pbachtttarya"),
msms.mz.tol = 3,
msms.mz.tol.min = 0.005,
max.results = 0
)
:
Searches MSMS spectra matching a template spectrum. The mz.tol parameter is applied on the precursor search.
spectrum: A template spectrum to match inside the database.
precursor.mz: The M/Z value of the precursor peak of the mass spectrum.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
npmin: The minimum number of matching peaks required to detect a match (2 is recommended).
dist.fun: The distance function used to compute the distance between two mass spectra.
msms.mz.tol: The M/Z tolerance, in ppm, to apply while matching MSMS spectra.
msms.mz.tol.min: Minimum of the M/Z tolerance (plain unit). If the M/Z tolerance computed with 'msms.mz.tol' is lower than 'msms.mz.tol.min', then 'msms.mz.tol.min' will be used.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
Returned value: A data frame with columns 'id', 'score' and 'peak.*'. Each 'peak.*' column corresponds to a peak in the input spectrum, in the same order, and gives the number of the peak matched with it inside the matched spectrum, whose ID is in the 'id' column.
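The interaction between msms.mz.tol (in ppm) and msms.mz.tol.min (plain unit) can be sketched in base R as follows. `msms_peak_tol()` is a hypothetical helper, not a biodb function, and it assumes the usual definition of a ppm tolerance, namely mz * ppm * 1e-6.

```r
# Hypothetical helper (not part of biodb): plain M/Z tolerance applied to a
# peak when matching MSMS spectra, following the rule documented for
# msms.mz.tol and msms.mz.tol.min.
msms_peak_tol <- function(mz, msms.mz.tol = 3, msms.mz.tol.min = 0.005) {
  # Assumption: a ppm tolerance converts to plain units as mz * ppm * 1e-6.
  tol.from.ppm <- mz * msms.mz.tol * 1e-6
  # The minimum tolerance replaces the ppm-based one whenever it is larger.
  pmax(tol.from.ppm, msms.mz.tol.min)
}

tol <- msms_peak_tol(c(100, 2000))
print(tol)
```

With the default values, a peak at M/Z 100 gets the minimum tolerance of 0.005, since 3 ppm of 100 is only 0.0003, while a peak at M/Z 2000 gets 0.006.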
searchForMassSpectra(
mz.min = NULL,
mz.max = NULL,
mz = NULL,
mz.tol = NULL,
mz.tol.unit = c("plain", "ppm"),
rt = NULL,
rt.unit = c("s", "min"),
rt.tol = NULL,
rt.tol.exp = NULL,
chrom.col.ids = NULL,
precursor = FALSE,
min.rel.int = 0,
ms.mode = NULL,
max.results = 0,
ms.level = 0
)
:
Searches for entries (i.e., spectra) that contain a peak around the given M/Z value. Entries can also be filtered on RT values. You can input either a list of M/Z values through the mz argument, with a tolerance set through the mz.tol argument, or two lists of minimum and maximum M/Z values through the mz.min and mz.max arguments.
mz: A vector of M/Z values.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
mz.min: A vector of minimum M/Z values.
mz.max: A vector of maximum M/Z values. Its length must be the same as 'mz.min'.
rt: A vector of retention times to match. The unit is specified by the rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes: input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
precursor: If set to TRUE, then restrict the search to precursor peaks.
min.rel.int: The minimum relative intensity, as a percentage (i.e., a number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
Returned value: A character vector of spectra IDs.
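The two ways of specifying the M/Z search window (mz with mz.tol, or mz.min with mz.max) can be related with a short base R sketch. `mz_range()` is a hypothetical helper and not part of biodb; it assumes a symmetric window and the usual ppm definition (mz * mz.tol * 1e-6), which is an assumption about how the tolerance is applied rather than a statement of the package's internals.

```r
# Hypothetical helper (not part of biodb): convert M/Z values plus a
# tolerance into explicit minimum and maximum M/Z bounds.
mz_range <- function(mz, mz.tol, mz.tol.unit = c("plain", "ppm")) {
  mz.tol.unit <- match.arg(mz.tol.unit)
  delta <- if (mz.tol.unit == "ppm") {
    mz * mz.tol * 1e-6           # tolerance relative to each M/Z value
  } else {
    rep(mz.tol, length(mz))      # same absolute tolerance for all values
  }
  data.frame(mz.min = mz - delta, mz.max = mz + delta)
}

r.ppm <- mz_range(c(100, 500), mz.tol = 10, mz.tol.unit = "ppm")
r.plain <- mz_range(c(100, 500), mz.tol = 0.1)
print(r.ppm)
print(r.plain)
```

A ppm tolerance widens with the M/Z value, whereas a plain tolerance gives the same absolute window everywhere.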
searchMsEntries(
mz.min = NULL,
mz.max = NULL,
mz = NULL,
mz.tol = NULL,
mz.tol.unit = c("plain", "ppm"),
rt = NULL,
rt.unit = c("s", "min"),
rt.tol = NULL,
rt.tol.exp = NULL,
chrom.col.ids = NULL,
precursor = FALSE,
min.rel.int = 0,
ms.mode = NULL,
max.results = 0,
ms.level = 0
)
:
This method is deprecated.
Use searchForMassSpectra() instead.
searchMsPeaks(
input.df = NULL,
mz = NULL,
mz.tol = NULL,
mz.tol.unit = c("plain", "ppm"),
min.rel.int = 0,
ms.mode = NULL,
ms.level = 0,
max.results = 0,
chrom.col.ids = NULL,
rt = NULL,
rt.unit = c("s", "min"),
rt.tol = NULL,
rt.tol.exp = NULL,
precursor = FALSE,
precursor.rt.tol = NULL,
insert.input.values = TRUE,
prefix = NULL,
compute = TRUE,
fields = NULL,
fieldsLimit = 0,
input.df.colnames = c(mz = "mz", rt = "rt"),
match.rt = FALSE
)
:
For each M/Z value, searches for matching MS spectra and returns the matching peaks.
input.df: A data frame taken as input for searchMsPeaks(). It must contain an 'mz' column, and optionally an 'rt' column.
mz: A vector of M/Z values to match. Used if input.df is not set.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
min.rel.int: The minimum relative intensity, as a percentage (i.e., a number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
rt: A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes: input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
precursor: If set to TRUE, then restrict the search to precursor peaks.
precursor.rt.tol: The RT tolerance used when matching the precursor.
insert.input.values: Insert input values at the beginning of the result data frame.
prefix: Add prefix on column names of result data frame.
compute: If set to TRUE, use the computed values when converting found entries to data frame.
fields: A character vector of field names to output. The data frame output will be restricted to this list of fields.
fieldsLimit: The maximum of values to output for fields with multiple values. Set it to 0 to get all values.
input.df.colnames: Names of the columns in the input data frame.
match.rt: If set to TRUE, match also RT values.
Returned value: A data frame with at least the input MZ and RT columns, and annotation columns prefixed with 'prefix' if set. One row is output for each match found. Thus, if n matches are found for M/Z value x, there will be n rows for x, one for each match. The number of matches returned for each M/Z value is limited to 'max.results'.
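The shape of the returned data frame (one row per match, input values first, annotation columns prefixed, matches capped by max.results) can be mimicked with a toy base R sketch. Everything here is hypothetical: the `db` table stands in for a database, and `match_peaks()` is an illustrative helper, not the biodb algorithm. Input values without any match produce no rows in this sketch; the behaviour of searchMsPeaks() in that case is not described above.

```r
# Toy peak table standing in for a database (made-up data).
db <- data.frame(
  accession = c("S1", "S2", "S3"),
  peak.mztheo = c(100.0004, 100.0009, 250.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of biodb): one output row per matching peak.
match_peaks <- function(input.df, db, mz.tol, max.results = 0, prefix = NULL) {
  rows <- lapply(seq_len(nrow(input.df)), function(i) {
    # Plain tolerance: keep peaks within mz.tol of the input M/Z value.
    hits <- db[abs(db$peak.mztheo - input.df$mz[i]) <= mz.tol, , drop = FALSE]
    # A max.results of 0 means no limit.
    if (max.results > 0) hits <- head(hits, max.results)
    if (!is.null(prefix)) colnames(hits) <- paste0(prefix, colnames(hits))
    # Repeat the input row once per match and put it in front.
    cbind(input.df[rep(i, nrow(hits)), , drop = FALSE], hits)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

input.df <- data.frame(mz = c(100.0005, 250.1))
all.matches <- match_peaks(input.df, db, mz.tol = 0.001, prefix = "db.")
first.only <- match_peaks(input.df, db, mz.tol = 0.001, max.results = 1,
                          prefix = "db.")
print(all.matches)
print(first.only)
```

Here the first M/Z value matches two peaks and therefore appears on two rows, unless max.results is set to 1.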
searchMzRange(
mz.min,
mz.max,
min.rel.int = 0,
ms.mode = NULL,
max.results = 0,
precursor = FALSE,
ms.level = 0
)
Find spectra in the given M/Z range. Returns a list of spectra IDs.
searchMzTol(
mz,
mz.tol,
mz.tol.unit = "plain",
min.rel.int = 0,
ms.mode = NULL,
max.results = 0,
precursor = FALSE,
ms.level = 0
)
Find spectra containing a peak around the given M/Z value. Returns a character vector of spectra IDs.
setMatchingMzField(field = c("peak.mztheo", "peak.mzexp"))
:
Sets the field to use for M/Z matching.
field: The field to use for matching.
Returned value: None.
Super class: BiodbConn.
# Create an instance with default settings:
mybiodb <- biodb::newInst()

# Get connector
conn <- mybiodb$getFactory()$createConn('mass.csv.file')

# Terminate instance.
mybiodb$terminate()