enrichment() is an analysis function that computes annotation term enrichment for a two-sample comparison in a given tidyproteomics data object, using either GSEA or a Wilcoxon rank sum test.

Usage

enrichment(
  data = NULL,
  ...,
  .pairs = NULL,
  .term = NULL,
  .method = c("gsea", "wilcoxon"),
  .score_type = c("std", "pos", "neg"),
  .cpu_cores = 1
)

Arguments

data

tidyproteomics data object

...

a two-sample comparison, e.g. experimental/control

.pairs

a list of vectors each containing two named sample groups

.term

a character string naming the annotation term to test (e.g. "biological_process"), as found in the annotations table

.method

a character string selecting the enrichment method, either "gsea" (the default) or "wilcoxon"

.score_type

a character string. From the fgsea manual: "This parameter defines the GSEA score type. Possible options are ("std", "pos", "neg"). By default ("std") the enrichment score is computed as in the original GSEA. The "pos" and "neg" score types are intended to be used for one-tailed tests (i.e. when one is interested only in positive ("pos") or negative ("neg") enrichment)."

.cpu_cores

the number of threads used to speed up the calculation
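
The Usage signature gives vector defaults for .method and .score_type. In R this pattern is conventionally resolved with match.arg(), which returns the first element when the argument is omitted and rejects values outside the listed choices. The sketch below only illustrates that convention; it assumes, without confirmation from the source, that enrichment() behaves the same way, and pick_method() is a hypothetical helper that is not part of tidyproteomics.

```r
# Hypothetical helper illustrating the match.arg() convention.
# It is NOT a tidyproteomics function.
pick_method <- function(.method = c("gsea", "wilcoxon")) {
  # With no argument supplied, match.arg() returns the first choice.
  match.arg(.method)
}

pick_method()
#> [1] "gsea"

pick_method("wilcoxon")
#> [1] "wilcoxon"

# A value outside the listed choices raises an error.
bad <- try(pick_method("ttest"), silent = TRUE)
inherits(bad, "try-error")
#> [1] TRUE
```

Under this assumption, omitting .method would select "gsea" and omitting .score_type would select "std", which matches the log lines and warnings shown in the Examples.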

Value

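a tidyproteomics data object with the enrichment results stored as an analysis; use export_analysis() to retrieve the results as a tibble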

Examples

library(dplyr, warn.conflicts = FALSE)
library(tidyproteomics)

# using the default GSEA method
hela_proteins %>%
   expression(knockdown/control) %>%
   enrichment(knockdown/control, .term = "biological_process") %>%
   export_analysis(knockdown/control, .analysis = "enrichment", .term = "biological_process")
#>  .. expression::t_test testing knockdown / control
#>  .. expression::t_test testing knockdown / control [3.2s]
#> 
#>  .. enrichment::gsea testing knockdown / control by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#>  .. enrichment::gsea testing knockdown / control by term biological_process [1
#> 
#> # A tibble: 13 × 7
#>    annotation p_value adj_p_value enrichment enrichment_normalized log2err  size
#>    <chr>        <dbl>       <dbl>      <dbl>                 <dbl>   <dbl> <int>
#>  1 cell prol…  0.0221       0.287      0.316                 1.18  0.352     301
#>  2 cell diff…  0.0949       1          0.283                 1.08  0.141     551
#>  3 cellular …  0.211        1          0.270                 1.04  0.0884   1015
#>  4 coagulati…  0.437        1          0.261                 1.01  0.0518    961
#>  5 cell comm…  0.630        1          0.265                 0.961 0.0349    157
#>  6 cell death  0.735        1          0.255                 0.934 0.0274    169
#>  7 cellular …  0.807        1          0.246                 0.957 0.0223   2944
#>  8 cell orga…  0.872        1          0.244                 0.944 0.0175   1373
#>  9 defense r…  0.947        1          0.235                 0.908 0.0108    848
#> 10 developme…  0.986        1          0.229                 0.882 0.00543   879
#> 11 metabolic…  0.986        1          0.231                 0.898 0.00543  3179
#> 12 cell grow…  0.999        1          0.221                 0.863 0.00144  1782
#> 13 conjugati…  1            1          0.206                 0.803 0        1227

# using a Wilcoxon Rank Sum method
hela_proteins %>%
   expression(knockdown/control) %>%
   enrichment(knockdown/control, .term = "biological_process", .method = "wilcoxon") %>%
   export_analysis(knockdown/control, .analysis = "enrichment", .term = "biological_process")
#>  .. expression::t_test testing knockdown / control
#>  .. expression::t_test testing knockdown / control [3.1s]
#> 
#>  .. enrichment::wilcoxon testing knockdown / control by term biological_process
#>  annotation other had issues, not reported
#>  .. enrichment::wilcoxon testing knockdown / control by term biological_proces
#> 
#> # A tibble: 13 × 5
#>    annotation                          p_value adj_p_value enrichment  size
#>    <chr>                                 <dbl>       <dbl>      <dbl> <int>
#>  1 conjugation                      0.00000228   0.0000296      1.12   1227
#>  2 cell proliferation               0.000598     0.00717        0.812   301
#>  3 cell organization and biogenesis 0.00580      0.0638         1.06   1373
#>  4 development                      0.00902      0.0902         1.10    879
#>  5 cellular component movement      0.00958      0.0902         0.933  1015
#>  6 defense response                 0.0219       0.175          1.08    848
#>  7 metabolic process                0.0789       0.552          1.03   3179
#>  8 coagulation                      0.139        0.833          1.08    961
#>  9 cell differentiation             0.149        0.833          0.953   551
#> 10 cell growth                      0.163        0.833          1.06   1782
#> 11 cell communication               0.431        1              1.07    157
#> 12 cellular homeostasis             0.617        1              1.00   2944
#> 13 cell death                       0.798        1              1.05    169
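
The adj_p_value column in the Wilcoxon table above appears consistent with a Holm correction of the p_value column across the 13 terms. This is an inference from the printed numbers, not something the source states, so the adjustment method is an assumption. The sketch below re-derives the column with base R's p.adjust() from the rounded values as printed, so small differences in the last digit are expected.

```r
# p_value and adj_p_value columns copied from the printed Wilcoxon table
# (rounded to three significant digits as displayed).
printed_p <- c(0.00000228, 0.000598, 0.00580, 0.00902, 0.00958, 0.0219,
               0.0789, 0.139, 0.149, 0.163, 0.431, 0.617, 0.798)
printed_adj <- c(0.0000296, 0.00717, 0.0638, 0.0902, 0.0902, 0.175,
                 0.552, 0.833, 0.833, 0.833, 1, 1, 1)

# Assumption: Holm step-down adjustment over all 13 terms.
holm <- p.adjust(printed_p, method = "holm")

# Largest relative difference between the re-derived and printed values;
# anything below 1% is explained by rounding of the inputs.
max_rel_diff <- max(abs(holm - printed_adj) / printed_adj)
max_rel_diff < 0.01
#> [1] TRUE
```

Holm's procedure multiplies the smallest p-value by the number of tests, the next by one fewer, and so on, then enforces a non-decreasing sequence capped at 1, which is why several rows share the same adjusted value.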

# using the .pairs argument when multiple comparisons are needed
comps <- list(c("control", "knockdown"),
              c("knockdown", "control"))

hela_proteins %>%
   expression(.pairs = comps) %>%
   enrichment(.pairs = comps, .term = "biological_process")
#> Using the supplied 2 sample pairs ...
#>  .. expression::t_test testing control / knockdown
#>  .. expression::t_test testing control / knockdown [3s]
#> 
#>  .. expression::t_test testing knockdown / control
#>  .. expression::t_test testing knockdown / control [3.1s]
#> 
#> Using the supplied 2 sample pairs ...
#>  .. enrichment::gsea testing control / knockdown by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#>  .. enrichment::gsea testing control / knockdown by term biological_process [1
#> 
#>  .. enrichment::gsea testing knockdown / control by term biological_process
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#>  .. enrichment::gsea testing knockdown / control by term biological_process [1
#> 
#> 
#> ── Quantitative Proteomics Data Object ──
#> 
#> Origin          ProteomeDiscoverer 
#>                 proteins (11.97 MB) 
#> Composition     6 files 
#>                 2 samples (control, knockdown) 
#> Quantitation    7055 proteins 
#>                 4 log10 dynamic range 
#>                 28.8% missing values 
#>  *imputed        
#> Accounting      (4) num_peptides num_psms num_unique_peptides imputed 
#> Annotations     (9) description biological_process cellular_component molecular_function
#>                 gene_id_entrez gene_name wiki_pathway reactome_pathway
#>                 gene_id_ensemble 
#> Analyses        (2) 
#>                 control/knockdown -> expression & enrichment (biological_process) 
#>                 knockdown/control -> expression & enrichment (biological_process) 
#>
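
When a data set has more than two sample groups, the list passed to .pairs can be generated rather than typed out. The sketch below uses base R's combn() to build every unordered pair of sample names in the same list-of-vectors shape as comps above. The third group name "treated" is hypothetical and is not present in hela_proteins. Judging from the log output above, c("control", "knockdown") is tested as control / knockdown, so the order within each vector presumably sets the direction of the ratio; reverse a vector with rev() to flip it.

```r
# Sample group names; "treated" is a hypothetical third group for illustration.
samples <- c("control", "knockdown", "treated")

# Every unordered pair, as a list of two-element character vectors.
comps_all <- combn(samples, 2, simplify = FALSE)
length(comps_all)
#> [1] 3

comps_all[[1]]
#> [1] "control"   "knockdown"

# Flip the direction of every comparison (second / first becomes first / second).
comps_rev <- lapply(comps_all, rev)
comps_rev[[1]]
#> [1] "knockdown" "control"
```

The resulting list could then be supplied as .pairs = comps_all to both expression() and enrichment(), as in the example above.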