Main function for normalizing quantitative data in a tidyproteomics data-object

normalize() is the main function for normalizing quantitative data in a tidyproteomics data-object. It is a passthrough function: it returns the original tidyproteomics data-object with an additional quantitative column labeled with the normalization method(s) used.

This function can apply multiple normalization methods in a single pass, which is useful for examining the effects of normalization on the data. It is often advantageous to select an optimal normalization method based on performance.
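To make the idea of choosing a method by performance concrete, the following base-R sketch applies two simple normalizations to a toy abundance matrix and compares them by median coefficient of variation (CV). It is an illustration only, not the tidyproteomics implementation: the data are simulated, the helper names (median_shift, sum_scale, median_cv) are hypothetical, sum_scale is a generic total-sum scaling that is not claimed to match the package's "scaled" method, and using the lowest CV as the selection criterion is an assumption.

```r
# Toy data (simulated): 50 proteins measured in 4 samples that differ only by
# a per-sample loading factor plus a small amount of multiplicative noise.
set.seed(42)
n       <- 50
truth   <- 2^rnorm(n, mean = 20, sd = 2)
loading <- c(ctl_1 = 1.0, ctl_2 = 1.4, kd_1 = 0.6, kd_2 = 1.9)
noise   <- 2^matrix(rnorm(n * length(loading), sd = 0.05), nrow = n)
raw     <- outer(truth, loading) * noise

# Median shift (assumed form): align the per-sample medians in log2 space.
median_shift <- function(x) {
  log_x          <- log2(x)
  sample_medians <- apply(log_x, 2, median, na.rm = TRUE)
  2^sweep(log_x, 2, sample_medians - mean(sample_medians))
}

# Total-sum scaling (hypothetical stand-in): equalize the per-sample totals.
sum_scale <- function(x) {
  totals <- colSums(x, na.rm = TRUE)
  sweep(x, 2, totals / mean(totals), FUN = "/")
}

# Performance metric: median per-protein CV across samples. For simplicity,
# this toy treats all four columns as replicates of a single condition.
median_cv <- function(x) median(apply(x, 1, function(v) sd(v) / mean(v)))

candidates <- list(raw = raw, median = median_shift(raw), sum = sum_scale(raw))
scores     <- vapply(candidates, median_cv, numeric(1))
best       <- names(which.min(scores))

print(round(scores, 3))
print(best)
```

In this simulation both normalizations remove most of the loading differences, so either should score well below the raw data. With real data the methods can diverge, which is why comparing several in one pass is informative.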

Usage

normalize(
  data,
  ...,
  .method = c("scaled", "median", "linear", "limma", "loess", "svm", "randomforest"),
  .cores = 1
)

Arguments

data: tidyproteomics data object
...: an expression selecting a subset of the data to use for normalization; see subset(). This is useful when normalizing against a spike-in set of proteins.
.method: character vector of normalization method(s) to use
.cores: number of CPU cores to use for multi-threading
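The subset behavior of the ... argument can be pictured as estimating the per-sample correction from the selected proteins only and then applying that correction to every value. The base-R sketch below illustrates this under stated assumptions: the data are simulated, the first six proteins are arbitrarily designated as spike-ins, and subset_median_shift is a hypothetical helper rather than a tidyproteomics function.

```r
# Toy data (simulated): 30 proteins in 3 samples with different loading.
set.seed(7)
n_proteins <- 30
loading    <- c(s1 = 1.0, s2 = 0.7, s3 = 1.6)
truth      <- 2^rnorm(n_proteins, mean = 20, sd = 2)
noise      <- 2^matrix(rnorm(n_proteins * length(loading), sd = 0.05),
                       nrow = n_proteins)
abund      <- outer(truth, loading) * noise
rownames(abund) <- paste0("P", seq_len(n_proteins))

# Hypothetical designation: the first six proteins are the spike-in set.
is_spike <- seq_len(n_proteins) <= 6

# Estimate the log2 median shift from the reference rows only,
# then apply the same per-sample shift to all rows.
subset_median_shift <- function(x, keep) {
  log_x       <- log2(x)
  ref_medians <- apply(log_x[keep, , drop = FALSE], 2, median, na.rm = TRUE)
  2^sweep(log_x, 2, ref_medians - mean(ref_medians))
}

adjusted <- subset_median_shift(abund, is_spike)

# After adjustment the spike-in medians agree across samples (in log2 space),
# while all other proteins have been moved by the same per-sample factor.
spike_medians <- apply(log2(adjusted[is_spike, , drop = FALSE]), 2, median)
print(round(spike_medians, 3))
```

The design point is that the reference subset defines the correction while the full data set receives it, so proteins outside the subset keep their relative differences between samples.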

Value

a tidyproteomics data-object

Examples

library(dplyr, warn.conflicts = FALSE)
library(tidyproteomics)
hela_proteins %>%
     normalize(.method = c("scaled", "median")) %>%
     summary("sample")
#> ℹ Normalizing quantitative data
#> ℹ ... using scaled shift
#> ✔ ... using scaled shift [140ms]
#> 
#> ℹ ... using median shift
#> ✔ ... using median shift [155ms]
#> 
#> ℹ Selecting best normalization method
#> ✔ Selecting best normalization method ... done
#> 
#> ℹ  ... selected scaled
#> 
#> ── Summary: sample ──
#> 
#>     sample proteins peptides peptides_unique quantifiable  CVs
#>    control     7055    66329           58706        0.908 0.15
#>  knockdown     7055    66329           58706        0.909 0.13
#> 

# normalize between samples according to a subset, then apply to all values
#   this is recommended for a pull-down experiment in which a conserved
#   protein complex makes up the majority of the content and the individual
#   interactors are the proteins of quantitative interest
hela_proteins %>%
     normalize(!description %like% "Ribosome", .method = c("scaled", "median")) %>%
     summary("sample")
#> !   normalization based on 5329 of 5346 identifiers
#> ℹ Normalizing quantitative data
#> ℹ ... using scaled shift
#> ✔ ... using scaled shift [149ms]
#> 
#> ℹ ... using median shift
#> ✔ ... using median shift [154ms]
#> 
#> ℹ Selecting best normalization method
#> ✔ Selecting best normalization method ... done
#> 
#> ℹ  ... selected scaled
#> 
#> ── Summary: sample ──
#> 
#>     sample proteins peptides peptides_unique quantifiable  CVs
#>    control     7055    66329           58706        0.908 0.15
#>  knockdown     7055    66329           58706        0.909 0.13
#>