normalize()

Main function for normalizing quantitative data from a tidyproteomics data-object. This is a pass-through function: it returns the original tidyproteomics data-object with an additional quantitative column labeled with the normalization method(s) used.

This function can apply multiple normalization methods in a single pass, which is useful for examining the effects of normalization on the data. It is often advantageous to select an optimal normalization method based on performance.

Usage

normalize(
  data,
  ...,
  .method = c("scaled", "median", "linear", "limma", "loess", "svm", "randomforest"),
  .cores = 1
)

Arguments

data

tidyproteomics data object

...

use a subset of the data for normalization (see subset()). This is useful when normalizing against a spike-in set of proteins

.method

character vector of normalization methods to use

.cores

number of CPU cores to use for multi-threading

Value

a tidyproteomics data-object

Examples

library(dplyr, warn.conflicts = FALSE)
library(tidyproteomics)
hela_proteins %>%
     normalize(.method = c("scaled", "median")) %>%
     summary("sample")
#>  Normalizing quantitative data
#>  ... using scaled shift
#>  ... using scaled shift [131ms]
#> 
#>  ... using median shift
#>  ... using median shift [150ms]
#> 
#>  Selecting best normalization method
#>  Selecting best normalization method ... done
#> 
#>   ... selected scaled
#> 
#> ── Summary: sample ──
#> 
#>     sample proteins peptides peptides_unique quantifiable  CVs
#>    control     7055    66329           58706        0.908 0.15
#>  knockdown     7055    66329           58706        0.909 0.13
#> 

# normalize between samples according to a subset, then apply to all values
#   this is recommended for a pull-down experiment in which a conserved
#   protein complex makes up the majority of the content and the individual
#   interactors are the proteins of quantitative interest
hela_proteins %>%
     normalize(!description %like% "Ribosome", .method = c("scaled", "median")) %>%
     summary("sample")
#> !   normalization based on 5329 of 5346 identifiers
#>  Normalizing quantitative data
#>  ... using scaled shift
#>  ... using scaled shift [131ms]
#> 
#>  ... using median shift
#>  ... using median shift [120ms]
#> 
#>  Selecting best normalization method
#>  Selecting best normalization method ... done
#> 
#>   ... selected scaled
#> 
#> ── Summary: sample ──
#> 
#>     sample proteins peptides peptides_unique quantifiable  CVs
#>    control     7055    66329           58706        0.908 0.15
#>  knockdown     7055    66329           58706        0.909 0.13
#>
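
To make the idea of a median shift concrete, the following is a minimal base-R sketch on hypothetical toy data. It is not the tidyproteomics implementation: the data frame `abund`, its column names, and the choice of the median of the per-sample medians as the common target are all assumptions made for illustration. Like normalize(), it leaves the original values in place and writes the normalized values to an additional column.

```r
# Hypothetical toy data: log2 abundances for 5 proteins in 2 samples,
# where sample "b" carries a constant +1 offset (e.g. a loading difference).
abund <- data.frame(
  protein = rep(paste0("P", 1:5), times = 2),
  sample  = rep(c("a", "b"), each = 5),
  log2_abundance = c(10, 12, 14, 16, 18,
                     11, 13, 15, 17, 19)
)

# Median of each sample, and a common target (assumed here to be the
# median of the per-sample medians).
sample_medians <- tapply(abund$log2_abundance, abund$sample, median)
target_median  <- median(sample_medians)

# Shift every value so that its sample's median lands on the common target.
# The original column is kept; the normalized values go in a new column.
shift <- target_median - sample_medians[abund$sample]
abund$log2_abundance_median <- abund$log2_abundance + unname(shift)

# After the shift, both samples have a median of 14.5.
tapply(abund$log2_abundance_median, abund$sample, median)
```

Because the only difference between the two toy samples is a constant offset, the shift makes their values identical; real data would retain protein-level differences after the medians are aligned.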