calculate_features {PrInCE}    R Documentation

Calculate the default features used to predict interactions in PrInCE

Description

Calculate the six features that are used to discriminate interacting and non-interacting protein pairs based on co-elution profiles in PrInCE, namely: raw Pearson R value, cleaned Pearson R value, raw Pearson P-value, Euclidean distance, co-peak, and co-apex. Optionally, one or more of these can be disabled.
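
As a rough illustration of what three of these features measure, the base R sketch below computes a Pearson correlation, its P-value, and a Euclidean distance for a single pair of toy profiles. The values and variable names are made up for illustration, the "cleaning" step applied by PrInCE is not reproduced, and this is not the package's internal implementation.

```r
# Toy chromatograms for two proteins across seven fractions (made-up values)
profile_a <- c(0.1, 0.8, 2.5, 4.0, 2.2, 0.6, 0.2)
profile_b <- c(0.2, 0.9, 2.1, 3.7, 2.6, 0.5, 0.1)

# Pearson correlation (R) and its P-value, taken from one test object
pearson_test <- cor.test(profile_a, profile_b, method = "pearson")
pearson_r <- unname(pearson_test$estimate)
pearson_p <- pearson_test$p.value

# Euclidean distance between the two profiles
euclidean <- sqrt(sum((profile_a - profile_b)^2))
```

Profiles that rise and fall together, as these do, give a correlation close to 1 and a small Euclidean distance.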

Usage

calculate_features(
  profile_matrix,
  gaussians,
  min_pairs = 0,
  pearson_R_raw = TRUE,
  pearson_R_cleaned = TRUE,
  pearson_P = TRUE,
  euclidean_distance = TRUE,
  co_peak = TRUE,
  co_apex = TRUE,
  n_pairs = FALSE,
  max_euclidean_quantile = 0.9
)

Arguments

profile_matrix

a numeric matrix of co-elution profiles, with proteins in rows, or an MSnSet object

gaussians

a list of Gaussian mixture models fit to the profile matrix by build_gaussians

min_pairs

minimum number of overlapping fractions that a protein pair must share to be considered a potential interaction (default: 0)

pearson_R_raw

if true, include the Pearson correlation (R) between raw profiles as a feature

pearson_R_cleaned

if true, include the Pearson correlation (R) between cleaned profiles as a feature

pearson_P

if true, include the P-value of the Pearson correlation between raw profiles as a feature

euclidean_distance

if true, include the Euclidean distance between cleaned profiles as a feature

co_peak

if true, include the 'co-peak score' (that is, the distance, in fractions, between the single highest value of each profile) as a feature

co_apex

if true, include the 'co-apex score' (that is, the minimum Euclidean distance between any pair of fit Gaussians) as a feature

max_euclidean_quantile

very high Euclidean distance values are trimmed to avoid numerical precision issues; values above this quantile will be replaced with the value at this quantile (default: 0.9)
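
The trimming described for max_euclidean_quantile can be pictured with the base R sketch below. The distance values are invented, and the sketch assumes the cap comes from R's default quantile method, which may differ from what PrInCE does internally.

```r
# Made-up Euclidean distances, with one extreme value at the end
distances <- c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 100)

# Value at the 0.9 quantile (assumption: default quantile type)
cap <- unname(quantile(distances, probs = 0.9, na.rm = TRUE))

# Replace anything above the cap with the cap itself
trimmed <- pmin(distances, cap)
```

Only the extreme value is changed here; the nine smaller distances pass through untouched.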

Value

a data frame containing the calculated features for all possible protein pairs
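
To give a sense of the shape of such a result, the base R sketch below enumerates every pair from a small toy matrix and attaches a co-peak distance to each row. The column names protein_A, protein_B and co_peak are chosen for illustration and are not guaranteed to match the columns returned by calculate_features.

```r
# Toy profile matrix: four proteins (rows) by five fractions (columns)
profiles <- rbind(
  P1 = c(0.1, 1.0, 3.0, 1.0, 0.1),
  P2 = c(0.2, 1.1, 2.8, 0.9, 0.2),
  P3 = c(3.0, 1.0, 0.2, 0.1, 0.1),
  P4 = c(0.1, 0.1, 0.3, 1.2, 2.9)
)

# All unordered protein pairs, one per row
pairs <- t(combn(rownames(profiles), 2))
features <- data.frame(
  protein_A = pairs[, 1],
  protein_B = pairs[, 2],
  stringsAsFactors = FALSE
)

# Co-peak: distance, in fractions, between the highest point of each profile
features$co_peak <- mapply(
  function(a, b) abs(which.max(profiles[a, ]) - which.max(profiles[b, ])),
  features$protein_A, features$protein_B,
  USE.NAMES = FALSE
)
```

With n proteins the table has n * (n - 1) / 2 rows, so it grows quadratically with the number of proteins in the profile matrix.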


[Package PrInCE version 1.8.0 Index]