artMS is an R package that provides a set of tools for the analysis and integration of large-scale proteomics (mass-spectrometry-based) datasets obtained using the popular proteomics software package MaxQuant.
The functions available in artMS can be grouped into 4 major categories:
artMS performs the different analyses taking as input the following files:
artmsWriteConfigYamlFile()
We assume that you have both R and RStudio already installed on your system. Please ensure that your system is running an R version >= 3.5, otherwise nothing will work (Bioconductor requirement). You can check the R version currently running on your system by executing this command in RStudio:
getRversion()
If the output is >= 3.5.0, congratulations! You can move forward. If it is not, you need to install the latest version of R on your system. After updating to the latest R version, open RStudio and run getRversion() again to make sure it worked.
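The version check above can also be done programmatically. The following is a minimal sketch using only base R; the variable names (required, version_ok) are illustrative and not part of artMS.

```r
# Minimum R version required by artMS, as stated above
required <- "3.5.0"

# getRversion() returns a 'numeric_version' object, which can be
# compared directly against a version string
current <- getRversion()
version_ok <- current >= required

if (version_ok) {
  message("R ", as.character(current), " satisfies the requirement (>= ", required, ")")
} else {
  message("R ", as.character(current), " is too old; please install R >= ", required)
}
```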
Two options to install artMS
artMS is available on Bioconductor.
install.packages("BiocManager")
BiocManager::install("artMS")
Why Bioconductor? Here you can find a nice summary of good reasons.
(Warning: not stable, but it has the latest)
Assuming that R (>= 3.5) is running on your system, follow these steps:
install.packages("devtools")
library(devtools)
install_github("biodavidjm/artMS")
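Check that the installation worked by loading the package and opening the help page of the function artmsQualityControlEvidenceBasic:
library(artMS)
?artmsQualityControlEvidenceBasic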
Once installed, we suggest a quick test: run the quality control functions using the “evidence” (artms_data_ph_evidence) and “keys” (artms_data_ph_keys) files included in artMS as test datasets.
library(artMS)
suppressWarnings(
artmsQualityControlEvidenceBasic(evidence_file = artms_data_ph_evidence,
keys_file = artms_data_ph_keys,
prot_exp = "PH",
plotINTDIST = FALSE,
plotREPRO = TRUE,
plotCORMAT = FALSE,
plotINTMISC = FALSE,
plotPTMSTATS = FALSE,
printPDF = FALSE,
verbose = FALSE))
(To learn more about these testing datasets, check the documentation by running ?artms_data_ph_keys or ?artms_data_ph_evidence on the R console)
Once the QC is done, go to the folder "/path/to/your/working/directory/"
and check out all the generated QC (pdf) files.
Three basic (tab-delimited) files are required to perform the full pack of operations:
evidence.txt
The output of the quantitative proteomics software package MaxQuant. It combines all the information about the identified peptides.
keys.txt
Tab-delimited file generated by the user. It summarizes the experimental design of the evidence file. When using artMS, the keys.txt file will be merged with the evidence.txt by the “RawFile” column. Each RawFile corresponds to a unique individual experimental technical replicate / biological replicate / Condition / Run.
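To illustrate what merging by the “RawFile” column means, here is a minimal sketch in base R. It is not the artMS implementation: the two data frames are tiny made-up stand-ins (the protein ids and intensities are invented), and only the RawFile values and keys columns follow the examples in this document.

```r
# Toy stand-in for a few evidence.txt columns (values are made up)
evidence <- data.frame(
  RawFile   = c("qx006145", "qx006145", "qx006151"),
  Proteins  = c("P00001", "P00002", "P00001"),
  Intensity = c(1.2e6, 3.4e5, 9.8e5),
  stringsAsFactors = FALSE
)

# Toy keys table describing the experimental design of each RawFile
keys <- data.frame(
  RawFile          = c("qx006145", "qx006151"),
  IsotopeLabelType = "L",
  Condition        = c("Cal33", "HSC6"),
  BioReplicate     = c("Cal33-1", "HSC6-2"),
  Run              = c(1, 6),
  stringsAsFactors = FALSE
)

# Every RawFile in the evidence should be described in the keys
missing_in_keys <- setdiff(unique(evidence$RawFile), keys$RawFile)
stopifnot(length(missing_in_keys) == 0)

# Attach the experimental design to every evidence row
merged <- merge(evidence, keys, by = "RawFile")
print(merged)
```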
For any basic label-free proteomics experiment, the keys file must contain the following columns (see the example below):
- IsotopeLabelType: 'L' for label-free experiments ('H' will be used for SILAC experiments, see below).
- BioReplicate: use the corresponding Condition name as prefix, and add as suffix a dash (-) and the biological replicate number. For example, if condition H1N1_06H has two biological replicates, name them H1N1_06H-1 and H1N1_06H-2.
Example of keys file: check the data object artms_data_ph_keys
RawFile | IsotopeLabelType | Condition | BioReplicate | Run |
---|---|---|---|---|
qx006145 | L | Cal33 | Cal33-1 | 1 |
qx006148 | L | Cal33 | Cal33-4 | 4 |
qx006151 | L | HSC6 | HSC6-2 | 6 |
qx006152 | L | HSC6 | HSC6-3 | 7 |
Tip: it is recommended to use Microsoft Excel (OpenOffice Calc or similar) to generate the keys file. Do not forget to choose the format Tab Delimited Text (.txt) when saving the file (use the "Save As" option).
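As an alternative to a spreadsheet, the keys file can be written directly from R. The sketch below uses only base R and reproduces the example table above; the output location (a temporary folder) and the helper variable replicate_number are illustrative choices.

```r
# Experimental design, one row per raw file (values from the example table)
keys <- data.frame(
  RawFile          = c("qx006145", "qx006148", "qx006151", "qx006152"),
  IsotopeLabelType = "L",                       # label-free experiment
  Condition        = c("Cal33", "Cal33", "HSC6", "HSC6"),
  Run              = c(1, 4, 6, 7),
  stringsAsFactors = FALSE
)

# BioReplicate = Condition name + dash + biological replicate number
replicate_number <- c(1, 4, 2, 3)
keys$BioReplicate <- paste(keys$Condition, replicate_number, sep = "-")

# Put the columns in the same order as the example table
keys <- keys[, c("RawFile", "IsotopeLabelType", "Condition", "BioReplicate", "Run")]

# Save as tab-delimited text, without quotes or row names
keys_path <- file.path(tempdir(), "keys.txt")
write.table(keys, keys_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the file is well formed
keys_check <- read.delim(keys_path, stringsAsFactors = FALSE)
print(keys_check)
```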
contrast.txt
The comparisons between conditions that the user wants to quantify. For example, to quantify changes in protein abundance between wild type WT_A549 relative to two additional experimental conditions with drugs WT_DRUG_A and WT_DRUG_B, but also changes in protein abundance between WT_DRUG_A and WT_DRUG_B, the contrast file would look like this:
WT_DRUG_A-WT_A549
WT_DRUG_B-WT_A549
WT_DRUG_A-WT_DRUG_B
Requirements:
- The two conditions to be compared must be separated by a dash symbol (-), and only one dash symbol is allowed, i.e., only one comparison per line.
As a result of the quantification, the condition on the left will take the positive log2FC sign (if the protein is more abundant in condition WT_DRUG_A), and the condition on the right the negative log2FC (if a protein is more abundant in condition WT_A549).
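The contrast rules can be checked before running the quantification. The sketch below, in base R, writes the example contrast file, verifies the one-dash rule, and illustrates the log2FC sign convention with made-up abundances. The check that every name exists among the keys conditions is an additional assumption of this sketch, and conditions_in_keys is a hypothetical vector.

```r
contrasts <- c("WT_DRUG_A-WT_A549", "WT_DRUG_B-WT_A549", "WT_DRUG_A-WT_DRUG_B")

# Hypothetical Condition names, as they would appear in the keys file
conditions_in_keys <- c("WT_A549", "WT_DRUG_A", "WT_DRUG_B")

# Exactly one dash per line, i.e., one comparison per line
n_dashes <- lengths(regmatches(contrasts, gregexpr("-", contrasts, fixed = TRUE)))
stopifnot(all(n_dashes == 1))

# Split each comparison into its left and right condition
parts <- strsplit(contrasts, "-", fixed = TRUE)
left  <- vapply(parts, `[`, character(1), 1)
right <- vapply(parts, `[`, character(1), 2)
stopifnot(all(c(left, right) %in% conditions_in_keys))

# Write the contrast file, one comparison per line
contrast_path <- file.path(tempdir(), "contrast.txt")
writeLines(contrasts, contrast_path)

# Sign convention with made-up abundances: log2(left / right) is positive
# when the protein is more abundant in the condition on the left
abundance <- c(WT_A549 = 100, WT_DRUG_A = 400, WT_DRUG_B = 50)
log2fc <- log2(abundance[left] / abundance[right])
names(log2fc) <- contrasts
print(log2fc)
```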
Configuration file (.yaml)
The configuration file (in yaml format) contains the configuration details for the quantification performed by artMS using MSstats.
To generate a sample configuration file, go to the project folder (setwd("/path/to/your/working/folder/")) and execute:
artmsWriteConfigYamlFile(config_file_name = "config.yaml", verbose = FALSE)
Open the config.yaml file with your favorite editor (RStudio works well too). Although it might look complex, the default options work very well.
The configuration (yaml) file contains the following sections:
files
files :
  evidence : /path/to/the/evidence.txt
  keys : /path/to/the/keys.txt
  contrasts : /path/to/the/contrast.txt
  output : /path/to/the/results_folder/ph-results.txt
The file path/name of the required files. It is recommended to create a new folder in your project folder (for example, results_folder). The results file name (e.g. -results.txt) will be used as prefix for the several files (txt and pdf) that will be generated.
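As a rough illustration of how the output name acts as a prefix, the base R sketch below derives two of the file names mentioned elsewhere in this document (-results_ModelQC.txt and -results-annotated.txt) from the output path. It only mimics the naming pattern and is not the code artMS uses internally.

```r
# Output path as given in the files section of the configuration file
output <- "/path/to/the/results_folder/ph-results.txt"

# Strip the extension to obtain the common prefix
prefix <- tools::file_path_sans_ext(output)

# File names following the pattern described in this document
derived <- c(
  results   = output,
  model_qc  = paste0(prefix, "_ModelQC.txt"),
  annotated = paste0(prefix, "-annotated.txt")
)
print(derived)
```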
qc
qc:
  basic: 1 # 1 = yes; 0 = no
  extended: 1 # 1 = yes; 0 = no
Select to perform both ‘basic’ and ‘extended’ quality control. Read below to find out more about the details of each type of analysis.
data
data:
  enabled : 1 # 1 = yes; 0 = no
  fractions:
    enabled : 0 # 1 for protein fractionation
  silac:
    enabled : 0 # 1 for SILAC experiments
  filters:
    enabled : 1
    contaminants : 1
    protein_groups : remove # remove, keep
    modifications : AB # PH, UB, AB, APMS
  sample_plots : 1 # correlation plots
Let’s break down the data section:
- enabled:
  - 1: pre-process the data provided in the files section.
  - 0: do not process the data (a pre-generated MSstats file will be expected).
- fractions: multiple fractionation or separation methods are often combined in proteomics to improve signal-to-noise and proteome coverage and to reduce interference between peptides in quantitative proteomics.
  - enabled : 1 for a fractionation dataset. See Special case: Protein Fractionation below for details.
  - enabled : 0 for no fractions.
- silac:
  - enabled : 1 if the files belong to a SILAC experiment. See Special case: SILAC below for details.
  - enabled : 0 if they do not.
- filters:
  - enabled : 1 enables filtering.
  - contaminants : 1 removes contaminants (CON__ and REV__ labeled by MaxQuant).
  - protein_groups : remove chooses whether to remove or keep protein groups.
  - modifications : AB selects the type of proteomics experiment: PH, UB, or AC for posttranslational modifications, AB or APMS otherwise.
- sample_plots:
  - 1: generate correlation plots.
  - 0: otherwise.
MSstats
msstats :
  enabled : 1
  msstats_input :
  profilePlots : none
  normalization_method : equalizeMedians
  normalization_reference :
  summaryMethod : TMP
  censoredInt : NA
  cutoffCensored : minFeature
  MBimpute : 1
  feature_subset: all
Let’s break it down:
- enabled: choose 1 to run MSstats, 0 otherwise.
- msstats_input: leave blank if MSstats is going to be run (enabled : 1). Otherwise (enabled : 0), provide the path to the previously generated evidence-mss.txt file.
- profilePlots: several profile plots are available:
  - before: plots only before normalization
  - after: plots only after normalization
  - before-after: recommended, although computationally expensive (time consuming)
  - none: no normalization plots (convenient under time limitations)
- normalization_method: available options:
  - equalizeMedians
  - quantile
  - 0: no normalization (not recommended)
  - globalStandards: if selected, specify the reference protein in normalization_reference (next)
- normalization_reference: a UniProt id if globalStandards is chosen as the normalization_method (above)
- summaryMethod: “TMP” (default) means Tukey’s median polish, which is a robust estimation method. “linear” uses a linear mixed model. “logOfSum” conducts log2 (sum of intensities) per run.
- censoredInt: whether missing values are censored or at random.
  - ‘NA’ (default) assumes that all ‘NA’s in the ‘Intensity’ column are censored.
  - 0 uses zero intensities as censored intensity. In this case, NA intensities are missing at random. The output from Skyline should use 0.
  - Null assumes that all NA intensities are randomly missing.
- cutoffCensored: cutoff value for censoring. Only with censoredInt='NA' or 0.
  - minFeature (default) uses the minimum value for each feature.
  - minFeatureNRun uses the smallest between the minimum value of the corresponding feature and the minimum value of the corresponding run.
  - minRun uses the minimum value for each run.
- MBimpute: only for summaryMethod="TMP" and censoredInt='NA' or 0.
  - TRUE (default) imputes ‘NA’ or ‘0’ (depending on the censoredInt option) by Accelerated failure model.
  - FALSE uses the values assigned by cutoffCensored.
- feature_subset:
  - all: default
  - highQuality: this option seems to be buggy right now
Check MSstats documentation to find out more about every option.
output_extras
output_extras :
  enabled : 1 # if 0, won't process anything on this section
  annotate :
    enabled: 1 # if 1, will generate a `-results-annotated.txt` file including Gene and Protein.Name
    species : HUMAN
  plots:
    volcano: 1
    heatmap: 1
    LFC : -1.5 1.5 # Range of minimal log2fc
    FDR : 0.05
    heatmap_cluster_cols : 0
    heatmap_display : log2FC # log2FC or pvalue
To handle protein fractionation experiments, two options need to be activated.
The keys file must contain an additional column named “FractionKey” with the information about fractions. For example:
Raw.file | IsotopeLabelType | Condition | BioReplicate | Run | FractionKey |
---|---|---|---|---|---|
S9524_Fx1 | L | AB | AB-1 | 1 | 1 |
S9524_Fx2 | L | AB | AB-1 | 1 | 2 |
S9524_Fx3 | L | AB | AB-1 | 1 | 3 |
S9524_Fx4 | L | AB | AB-1 | 1 | 4 |
S9524_Fx5 | L | AB | AB-1 | 1 | 5 |
S9524_Fx6 | L | AB | AB-1 | 1 | 6 |
S9524_Fx7 | L | AB | AB-1 | 1 | 7 |
S9524_Fx8 | L | AB | AB-1 | 1 | 8 |
S9524_Fx9 | L | AB | AB-1 | 1 | 9 |
S9524_Fx10 | L | AB | AB-1 | 1 | 10 |
S9525_Fx1 | L | AB | AB-2 | 2 | 1 |
S9525_Fx2 | L | AB | AB-2 | 2 | 2 |
S9525_Fx3 | L | AB | AB-2 | 2 | 3 |
S9525_Fx4 | L | AB | AB-2 | 2 | 4 |
S9525_Fx5 | L | AB | AB-2 | 2 | 5 |
S9525_Fx6 | L | AB | AB-2 | 2 | 6 |
S9525_Fx7 | L | AB | AB-2 | 2 | 7 |
S9525_Fx8 | L | AB | AB-2 | 2 | 8 |
S9525_Fx9 | L | AB | AB-2 | 2 | 9 |
S9525_Fx10 | L | AB | AB-2 | 2 | 10 |
S9526_Fx1 | L | AB | AB-3 | 3 | 1 |
S9526_Fx2 | L | AB | AB-3 | 3 | 2 |
S9526_Fx3 | L | AB | AB-3 | 3 | 3 |
S9526_Fx4 | L | AB | AB-3 | 3 | 4 |
S9526_Fx5 | L | AB | AB-3 | 3 | 5 |
S9526_Fx6 | L | AB | AB-3 | 3 | 6 |
S9526_Fx7 | L | AB | AB-3 | 3 | 7 |
S9526_Fx8 | L | AB | AB-3 | 3 | 8 |
S9526_Fx9 | L | AB | AB-3 | 3 | 9 |
S9526_Fx10 | L | AB | AB-3 | 3 | 10 |
The fractions option must also be enabled in the configuration (yaml) file:
fractions:
  enabled : 1 # 1 for protein fractions, 0 otherwise
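The fractionation keys table shown above follows a regular pattern (three samples, ten fractions each), so it can be generated rather than typed by hand. The following base R sketch reproduces that table; the variable names (samples, n_fractions, keys_fx) are illustrative.

```r
# One entry per biological replicate (raw file prefixes from the table above)
samples <- c("S9524", "S9525", "S9526")
n_fractions <- 10

# Build one block of rows per sample and stack them
keys_fx <- do.call(rbind, lapply(seq_along(samples), function(i) {
  data.frame(
    Raw.file         = paste0(samples[i], "_Fx", seq_len(n_fractions)),
    IsotopeLabelType = "L",
    Condition        = "AB",
    BioReplicate     = paste0("AB-", i),
    Run              = i,                     # all fractions share the same Run
    FractionKey      = seq_len(n_fractions),  # fraction number within the run
    stringsAsFactors = FALSE
  )
}))

head(keys_fx, 3)
```

Note that all fractions of one sample share the same BioReplicate and Run, and only FractionKey distinguishes them.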
One of the most widely used techniques that enable relative protein quantification is stable isotope labeling by amino acids in cell culture (SILAC). The keys file can capture the typical SILAC experiment. The following example shows a SILAC experiment with two conditions, two biological replicates, and two technical replicates:
RawFile | IsotopeLabelType | Condition | BioReplicate | Run |
---|---|---|---|---|
QE20140321-01 | H | iso | iso-1 | 1 |
QE20140321-02 | H | iso | iso-1 | 2 |
QE20140321-04 | L | iso | iso-2 | 3 |
QE20140321-05 | L | iso | iso-2 | 4 |
QE20140321-01 | L | iso_M | iso_M-1 | 1 |
QE20140321-02 | L | iso_M | iso_M-1 | 2 |
QE20140321-04 | H | iso_M | iso_M-2 | 3 |
QE20140321-05 | H | iso_M | iso_M-2 | 4 |
It is also required to activate the silac option in the yaml file as follows:
silac:
  enabled : 1 # 1 for SILAC experiments
Three functions are available to perform QC analyses. For illustrative purposes, artMS includes an example dataset consisting of a reduced version of an experiment with two head and neck cancer cell lines (conditions "Cal33" and "HSC6"), 2 biological replicates each. The number of peptides was reduced to 1/5 due to Bioconductor limitations on data size.
artms_data_ph_evidence
artms_data_ph_keys
The full data set (2 conditions, 4 biological replicates) can be found at the following urls:
url_evidence <- 'http://kroganlab.ucsf.edu/artms/ph/evidence.txt'
url_keys <- 'http://kroganlab.ucsf.edu/artms/ph/keys.txt'
Basic quality control (evidence.txt-based)
The basic quality control analysis takes as input both the evidence.txt and keys.txt.
# But it is recommended to get the full pdf package of QC plots by running:
# artmsQualityControlEvidenceBasic(evidence_file = artms_data_ph_evidence,
# keys_file = artms_data_ph_keys,
# prot_exp = "PH")
# But for illustration purposes printing only INTDIST plot:
artmsQualityControlEvidenceBasic(evidence_file = artms_data_ph_evidence,
keys_file = artms_data_ph_keys,
prot_exp = "PH",
plotINTDIST = TRUE,
plotREPRO = FALSE,
plotCORMAT = FALSE,
plotINTMISC = FALSE,
plotPTMSTATS = FALSE,
printPDF = FALSE,
verbose = FALSE)
Running artmsQualityControlEvidenceBasic() generates the following pdf files:
- ... (CON: contaminants, PROT: peptides, REV: reversed sequences used by MaxQuant to estimate the FDR); box plots of MS intensity values per biological replicate and condition; bar plots of total intensity (excluding contaminants) by bioreplicate and condition; bar plots of total feature counts by bioreplicate and condition.
- For PTM experiments (PH, UB, AC), an extra pdf file will be generated with stats related to the selected modification, including: bar plots of peptide counts and intensities, broken down by PTM/other categories; bar plots of the total sum of MS intensity values by other/PTM categories.
Check ?artmsQualityControlEvidenceBasic() to find out more options about this function.
Extended quality control (evidence.txt-based)
It takes as input the evidence.txt and keys.txt files as follows:
artmsQualityControlEvidenceExtended(evidence_file = artms_data_ph_evidence,
keys_file = artms_data_ph_keys)
It generates the following QC files:
Extended quality control (summary.txt-based)
It requires two files:
- keys.txt
- summary.txt file. As described by MaxQuant’s table.pdf, the summary file contains summary information for all the raw files processed with a single MaxQuant run, including statistics on the peak detection.
artmsQualityControlSummaryExtended() gathers a quick overview of the quality of every RawFile based on this summary.txt.
Run it as follows:
artmsQualityControlSummaryExtended(summary_file = "summary.txt",
keys_file = artms_data_ph_keys)
It generates the following pdf plots:
plotMS1SCANS: generates MS1 scan counts plot:
Page 1 shows the number of MS1 scans in each BioReplicate. If replicates are present, Page 2 shows the mean number of MS1 scans per condition with error bars showing the standard error of the mean. If isFractions is TRUE, each fraction is a stack on the individual bar graphs.
plotMS2: generates MS2 scan counts plot:
Page 1 shows the number of MS2 scans in each BioReplicate. If replicates are present, Page 2 shows the mean number of MS2 scans per condition with error bars showing the standard error of the mean. If isFractions is TRUE, each fraction is a stack on the individual bar graphs.
plotMSMS: generates MS2 identification rate (%) plot:
Page 1 shows the fraction of MS2 scans confidently identified in each BioReplicate. If replicates are present, Page 2 shows the mean rate of MS2 scans confidently identified per condition with error bars showing the standard error of the mean. If isFractions is TRUE, each fraction is a stack on the individual bar graphs.
plotISOTOPE: generates Isotope Pattern counts plot:
Page 1 shows the number of Isotope Patterns with charge greater than 1 in each BioReplicate. If replicates are present, Page 2 shows the mean number of Isotope Patterns with charge greater than 1 per condition with error bars showing the standard error of the mean. If isFractions is TRUE, each fraction is a stack on the individual bar graphs.
The relative quantification is the core of this package. All the information required to run a relative quantification analysis using MSstats is provided through a configuration file (.yaml format). Check the configuration file section above to find out more about its different sections. A template of the configuration file can be generated by running artmsWriteConfigYamlFile().
Different types of proteomics experiments can be analyzed, such as protein abundance (ab), affinity purification mass spectrometry (apms), and different types of posttranslational modifications, including phosphorylation (ph), ubiquitination (ub), and acetylation (ac).
It quantifies changes in protein abundance between two different conditions. These are the specific sections that the user has to fill in:
files:
evidence : /path/to/the/evidence.txt
keys : /path/to/the/keys.txt
contrasts : /path/to/the/contrast.txt
output : /path/to/the/output/results_ptmGlobal/results.txt
.
.
.
data:
.
.
.
filters:
modifications : AB
Make sure that the filter modifications is labeled as AB.
Finally, run the following artMS function:
artmsQuantification(
yaml_config_file = '/path/to/config/file/artms_ab_config.yaml')
The global phosphorylation / ubiquitination quantification analysis calculates changes in phosphorylation/ubiquitination at the protein level. This means that all the modified peptides are used to quantify changes in protein phosphorylation/ubiquitination between different conditions. The site-specific (explained next) quantifies changes at the peptide level, i.e., each modified peptide independently between the different conditions.
Only two sections need to be filled in on the default configuration (yaml) file:
files:
evidence : /path/to/the/evidence.txt
keys : /path/to/the/keys.txt
contrasts : /path/to/the/contrast.txt
output : /path/to/the/output/results_ptmGlobal/results.txt
.
.
.
data:
.
.
.
filters:
modifications : PH # Use UB for ubiquitination
The remaining options can be left unmodified.
Once the configuration yaml file is ready, run the following command:
artmsQuantification(
yaml_config_file = '/path/to/config/file/artms_phglobal_config.yaml')
The site-specific analysis quantifies changes at the modified peptide level. This means that changes in every modified (ph/ub) peptide of a given protein will be quantified individually. The caveat is that the proportion of missing values should increase relative to the global analysis. Both site and global PTM analyses are highly correlated because only one or two peptides drive the overall changes in PTMs for every protein.
To run a site specific analysis follow these steps:
For phosphorylation
artmsProtein2SiteConversion(
evidence_file = "/path/to/the/evidence.txt",
ref_proteome_file = "/path/to/the/reference_proteome.fasta",
output_file = "/path/to/the/output/ph-sites-evidence.txt",
mod_type = "PH")
For ubiquitination
artmsProtein2SiteConversion(
evidence_file = "/path/to/the/evidence.txt",
ref_proteome_file = "/path/to/the/reference_proteome.fasta",
output_file = "/path/to/the/output/ub-sites-evidence.txt",
mod_type = "UB")
Generate a new configuration file (phsites_config.yaml or ubsites_config.yaml) as explained above, but using the “new” ph-sites-evidence.txt / ub-sites-evidence.txt file instead of the original evidence.txt file. Only two sections need to be filled in on the default configuration (yaml) file:
files:
evidence : /path/to/the/evidence-site.txt
keys : /path/to/the/keys.txt
contrasts : /path/to/the/contrast.txt
output : /path/to/the/output/results_ptmSITES/sites-results.txt
.
.
.
data:
.
.
.
filters:
modifications : PH # Use UB for ubiquitination
Once the new yaml file has been created, execute:
artmsQuantification(
yaml_config_file = '/path/to/config/file/phsites_config.yaml')
Comprehensive analysis of the quantification obtained by running artmsQuantification(). It includes:
It takes as input two files generated in the previous quantification step (artmsQuantification()):
- -results.txt: MSstats quantification results
- -results_ModelQC.txt: MSstats normalized abundance values
To run this analysis, first set the working directory to the folder containing the results obtained from artmsQuantification():
setwd('~/path/to/the/results_quantification/')
And then run the following function (for an “AB” experiment)
artmsAnalysisQuantifications(log2fc_file = "ab-results.txt",
modelqc_file = "ab-results_ModelQC.txt",
species = "human",
output_dir = "AnalysisQuantifications")
A few comments on the available options for artmsAnalysisQuantifications:
- isPTM: for protein abundance (AB), Affinity Purification-Mass Spectrometry (APMS), and global analysis of posttranslational modifications (PH and UB), use the option "noptm". For a site-specific PTM analysis use "ptmsites".
- species: this downstream analysis supports (for now) "human" and "mouse".
- enrich: if TRUE, it will perform enrichment analysis using gProfileR.
- isBackground: if enrich = TRUE, the user can provide a background gene list (add the file path as well).
- mnbr: Minimal Number of Biological Replicates for imputation. Missing values will be imputed, and this argument specifies the minimal number of biological replicates in which a protein must be found in at least one of the conditions. For example, mnbr = 2 means that only proteins found in at least two biological replicates of at least one condition will be imputed; any other protein will be removed. That is, with mnbr = 2, a protein found in two conditions but in only one biological replicate of each will be removed.
- l2fc_thres: the log2fc cutoff (absolute value) for enrichment analysis, i.e., if it is set to 1, log2fc > +1 and log2fc < -1 will be considered significant.
- ipval: specify whether pvalue or adjpvalue should be used for the analysis. The default option is adjpvalue (multiple testing correction). But if the number of biological replicates for a given experiment is too low (for example n = 2), then pvalue is recommended.
artMS also provides a number of very handy functions.
also provides a number of very handy functions.
Annotate gene name and symbol based on UniProt ids. It takes the column of your data.frame specified by the columnid argument, searches for the gene symbol, name, and Entrez id based on the species (species argument), and merges the information back into the input data.frame.
# This example adds annotations to the evidence file available in
# artMS, based on the column 'Proteins'.
evidence_anno <- artmsAnnotationUniprot(x = artms_data_ph_evidence,
columnid = 'Proteins',
species = 'human')
Taking as input the evidence file location, it will summarize and report back the average intensity, average retention time, and the average calibrated retention time for each protein. If a list of proteins is provided, then only those proteins will be summarized and returned. Check ?artmsAvgIntensityRT() to find out more options.
artmsAvgIntensityRT(evidence_file = '/path/to/the/evidence.txt')
Change a specific column name in a given data.frame
artms_data_ph_evidence <- artmsChangeColumnName(
dataset = artms_data_ph_evidence,
oldname = "Phospho..STY.",
newname = "PH_STY")
Protein abundance dot plots for each unique uniprot id. It can take a long time
artmsDataPlots(input_file = "results/ab-results-mss-normalized.txt",
output_file = "results/ab-results-mss-normalized.pdf")
Enrichment analysis based on a data.frame with Gene and Comparison/Label protein (i.e., typical MSstats results)
# The data must be annotated (Protein and Gene columns)
data_annotated <- artmsAnnotationUniprot(
x = artms_data_ph_msstats_results,
columnid = "Protein",
species = "human")
# And then the enrichment
enrich_set <- artmsEnrichLog2fc(
dataset = data_annotated,
species = "human",
background = unique(data_annotated$Gene), verbose = FALSE)
## --- No significant results from the enrichment analysis
Function that simplifies enrichment analysis using gProfileR
# annotate the MSstats results to get the Gene name
data_annotated <- artmsAnnotationUniprot(
x = artms_data_ph_msstats_results,
columnid = "Protein",
species = "human")
# Filter the list of genes with a log2fc > 2
filtered_data <-
unique(data_annotated$Gene[which(data_annotated$log2FC > 2)])
# And perform enrichment analysis
data_annotated_enrich <- artmsEnrichProfiler(
x = filtered_data,
categorySource = c('KEGG'),
species = "hsapiens",
background = unique(data_annotated$Gene))
## ---+ Enrichment analysis using gProfiler...done!
Converts the MaxQuant evidence file to the 3 required files by SAINTexpress. Choose one of the following quantitative MS metrics:
artmsEvidenceToSaintExpress(evidence_file = "/path/to/evidence.txt",
keys_file = "/path/to/keys.txt",
ref_proteome_file = "/path/to/org.proteome.fasta")
Converts the MaxQuant evidence file to the files required by SAINTq. The user can filter based on either peptides with spectral counts (use msspc) or all the peptides (use all) for the analysis. The quantitative metric can also be chosen (either MS intensity or spectral counts).
artmsEvidenceToSAINTq(evidence_file = "/path/to/evidence.txt",
keys_file = "/path/to/keys.txt",
output_dir = "saintq_input_files")
It generates the Phosfate input file from the imputedL2fcExtended.txt file resulting from running artmsAnalysisQuantifications() on a ph-site quantification (see above). Notice that the only species supported by Phosfate is human.
artmsPhosfateOutput(inputFile = "your-imputedL2fcExtended.txt")
It generates the Photon input file from the imputedL2fcExtended.txt file resulting from running artmsAnalysisQuantifications() on a ph-site quantification (see above). Please notice that the only species supported by PHOTON is human.
artmsPhotonOutput(inputFile = "your-imputedL2fcExtended.txt")
Remove contaminants and erroneously identified ‘reverse’ sequences by MaxQuant, in addition to empty protein ids
evidencefiltered <- artmsFilterEvidenceContaminants(x = artms_data_ph_evidence)
Generate an extended, detailed ph-site file, where every line is a ph site instead of a peptide. Therefore, if one peptide has multiple ph sites, it will be broken down into multiple lines, one for each site.
artmsGeneratePhSiteExtended(df = dfobject,
species = "mouse",
ptmType = "ptmsites",
output_name = log2fc_file)
artMS enables the relative quantification of untargeted polar metabolites using the alignment table generated by MarkerView. MarkerView is an ABSciex software that supports the files generated by Analyst software (.wiff) used to run our specific mass spectrometer (ABSciex Triple TOF 5600+). It also supports .t2d files generated by the Applied Biosystems 4700/4800 MALDI-TOF.
MarkerView software is used to align mass spectrometry data from several samples for comparison. Using the import feature in the software, .wiff files (also .t2d MALDI-TOF files and tab-delimited .txt mass spectra data in mass-intensity format) are loaded for retention time alignment.
Once the data files are selected, a series of windows will appear wherein
peak finding, alignment, and filtering options can be entered and selected.
These options include minimum spectral peak width, minimum retention time
peak width, retention time and mass tolerance, and the ability to filter
out peaks that do not appear in more than a user selected number of samples.
The alignment file is further processed and formatted to perform QC and relative quantification using the following artMS functions:
Pre-process the MarkerView .txt file to generate an “evidence-like” file by running:
artmsConvertMetabolomics(input_file = "markview-output.txt",
out_file = "metabolomics-evidence.txt")
Perform quality control analysis on the metabolomics data by running:
artmsQualityControlMetabolomics(evidence_file = "metabolomics-evidence.txt",
keys_file = "metabolomics-keys.txt")
It generates the following plots:
The relative quantification is performed using MSstats. It requires a configuration file (yaml format, please check above). A template can be generated by running:
artmsWriteConfigYamlFile(config_file_name = "metab_config.yaml")
The relative quantification is performed by running:
artmsQuantification(yaml_config_file = "metab_config.yaml")
The artMS package provides the following testing datasets:
artms_data_ph_evidence
artms_data_ph_keys
artms_data_ph_msstats_results
artms_data_corum_mito_database
artms_data_pathogen_LPN
artms_data_pathogen_TB
Check the individual help pages (e.g., ?artms_data_ph_evidence) to find out more about them.