Compute protein enrichment
enrichment.Rd
enrichment() is an analysis function that computes annotation-term enrichment
statistics for a two-sample comparison in a given tidyproteomics data object.
Arguments
- data
tidyproteomics data object
- ...
a two-sample comparison, e.g. experimental/control
- .pairs
a list of character vectors, each naming two sample groups to compare, e.g. list(c("knockdown", "control"))
- .term
a character string naming a term in the annotations table, e.g. "biological_process"
- .method
a character string naming the enrichment method; the examples below use the default GSEA method and "wilcoxon"
- .score_type
a character string. From the fgsea manual: "This parameter defines the GSEA score type. Possible options are ("std", "pos", "neg"). By default ("std") the enrichment score is computed as in the original GSEA. The "pos" and "neg" score types are intended to be used for one-tailed tests (i.e. when one is interested only in positive ("pos") or negative ("neg") enrichment)."
- .cpu_cores
the number of threads used to speed up the calculation
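The .pairs argument takes a plain list of two-element character vectors, so it can be built programmatically. The following is a minimal base R sketch, not part of the package: the group names are hypothetical, and reading the first element as the numerator is an assumption taken from the console output in the examples, where c("control","knockdown") is reported as control / knockdown.

```r
# Hypothetical sample group names; replace with the groups in your data object.
groups <- c("control", "knockdown", "rescue")

# combn() enumerates each unordered pair once and, with simplify = FALSE,
# returns a list of two-element character vectors.
comps <- combn(groups, 2, simplify = FALSE)

# Reverse each pair so the later group comes first,
# e.g. c("knockdown", "control") instead of c("control", "knockdown").
comps <- lapply(comps, rev)

str(comps)
```

The resulting comps list has the same shape as the hand-written list in the .pairs example and could be passed to both expression() and enrichment().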
Examples
library(dplyr, warn.conflicts = FALSE)
library(tidyproteomics)
# using the default GSEA method
hela_proteins %>%
  expression(knockdown/control) %>%
  enrichment(knockdown/control, .term = "biological_process") %>%
  export_analysis(knockdown/control, .analysis = "enrichment", .term = "biological_process")
#> ℹ .. expression::t_test testing knockdown / control
#> ✔ .. expression::t_test testing knockdown / control [3.2s]
#>
#> ℹ .. enrichment::gsea testing knockdown / control by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#> ✔ .. enrichment::gsea testing knockdown / control by term biological_process [1…
#>
#> # A tibble: 13 × 7
#> annotation p_value adj_p_value enrichment enrichment_normalized log2err size
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 cell prol… 0.0221 0.287 0.316 1.18 0.352 301
#> 2 cell diff… 0.0949 1 0.283 1.08 0.141 551
#> 3 cellular … 0.211 1 0.270 1.04 0.0884 1015
#> 4 coagulati… 0.437 1 0.261 1.01 0.0518 961
#> 5 cell comm… 0.630 1 0.265 0.961 0.0349 157
#> 6 cell death 0.735 1 0.255 0.934 0.0274 169
#> 7 cellular … 0.807 1 0.246 0.957 0.0223 2944
#> 8 cell orga… 0.872 1 0.244 0.944 0.0175 1373
#> 9 defense r… 0.947 1 0.235 0.908 0.0108 848
#> 10 developme… 0.986 1 0.229 0.882 0.00543 879
#> 11 metabolic… 0.986 1 0.231 0.898 0.00543 3179
#> 12 cell grow… 0.999 1 0.221 0.863 0.00144 1782
#> 13 conjugati… 1 1 0.206 0.803 0 1227
# using a Wilcoxon Rank Sum method
hela_proteins %>%
  expression(knockdown/control) %>%
  enrichment(knockdown/control, .term = "biological_process", .method = "wilcoxon") %>%
  export_analysis(knockdown/control, .analysis = "enrichment", .term = "biological_process")
#> ℹ .. expression::t_test testing knockdown / control
#> ✔ .. expression::t_test testing knockdown / control [3.1s]
#>
#> ℹ .. enrichment::wilcoxon testing knockdown / control by term biological_process
#> ℹ annotation other had issues, not reported
#> ℹ .. enrichment::wilcoxon testing knockdown / control by term biological_process
#> ✔ .. enrichment::wilcoxon testing knockdown / control by term biological_proces…
#>
#> # A tibble: 13 × 5
#> annotation p_value adj_p_value enrichment size
#> <chr> <dbl> <dbl> <dbl> <int>
#> 1 conjugation 0.00000228 0.0000296 1.12 1227
#> 2 cell proliferation 0.000598 0.00717 0.812 301
#> 3 cell organization and biogenesis 0.00580 0.0638 1.06 1373
#> 4 development 0.00902 0.0902 1.10 879
#> 5 cellular component movement 0.00958 0.0902 0.933 1015
#> 6 defense response 0.0219 0.175 1.08 848
#> 7 metabolic process 0.0789 0.552 1.03 3179
#> 8 coagulation 0.139 0.833 1.08 961
#> 9 cell differentiation 0.149 0.833 0.953 551
#> 10 cell growth 0.163 0.833 1.06 1782
#> 11 cell communication 0.431 1 1.07 157
#> 12 cellular homeostasis 0.617 1 1.00 2944
#> 13 cell death 0.798 1 1.05 169
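The adj_p_value column in the Wilcoxon table above appears consistent with a Holm correction, which is the default method of base R's p.adjust(). This is an inference from the printed values, not something the documentation states. The sketch below recomputes the adjustment from the rounded p_value column as a check.

```r
# p_value column copied from the Wilcoxon table above (rounded as printed).
p_value <- c(0.00000228, 0.000598, 0.00580, 0.00902, 0.00958, 0.0219,
             0.0789, 0.139, 0.149, 0.163, 0.431, 0.617, 0.798)

# adj_p_value column as printed in the same table.
printed <- c(0.0000296, 0.00717, 0.0638, 0.0902, 0.0902, 0.175,
             0.552, 0.833, 0.833, 0.833, 1, 1, 1)

# Holm step-down adjustment, the default method of p.adjust().
holm <- p.adjust(p_value, method = "holm")

# Compare side by side; small differences are expected because
# the inputs are rounded to three significant digits.
data.frame(p_value, printed, holm = signif(holm, 3))
```

If the two columns agree to within rounding, the adjusted values can be read as family-wise error controlled across the 13 annotation terms tested.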
# using the .pairs argument when multiple comparisons are needed
comps <- list(c("control", "knockdown"),
              c("knockdown", "control"))
hela_proteins %>%
  expression(.pairs = comps) %>%
  enrichment(.pairs = comps, .term = "biological_process")
#> Using the supplied 2 sample pairs ...
#> ℹ .. expression::t_test testing control / knockdown
#> ✔ .. expression::t_test testing control / knockdown [3s]
#>
#> ℹ .. expression::t_test testing knockdown / control
#> ✔ .. expression::t_test testing knockdown / control [3.1s]
#>
#> Using the supplied 2 sample pairs ...
#> ℹ .. enrichment::gsea testing control / knockdown by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#> ✔ .. enrichment::gsea testing control / knockdown by term biological_process [1…
#>
#> ℹ .. enrichment::gsea testing knockdown / control by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#> ✔ .. enrichment::gsea testing knockdown / control by term biological_process [1…
#>
#>
#> ── Quantitative Proteomics Data Object ──
#>
#> Origin ProteomeDiscoverer
#> proteins (11.97 MB)
#> Composition 6 files
#> 2 samples (control, knockdown)
#> Quantitation 7055 proteins
#> 4 log10 dynamic range
#> 28.8% missing values
#> *imputed
#> Accounting (4) num_peptides num_psms num_unique_peptides imputed
#> Annotations (9) description biological_process cellular_component molecular_function
#> gene_id_entrez gene_name wiki_pathway reactome_pathway
#> gene_id_ensemble
#> Analyses (2)
#> control/knockdown -> expression & enrichment (biological_process)
#> knockdown/control -> expression & enrichment (biological_process)
#>