Importing Demo • tidyproteomics

The following is a demonstration workflow for importing data from a public repository. For an additional importing demonstration see vignette("subsetting"). All of the examples presented here come from previously published data, where the raw data were analyzed on each platform with basic settings for mass tolerance and PTMs.

ProteomeDiscoverer

Exporting the data from ProteomeDiscoverer requires some setup so that the correct columns are included. This is explained in vignette("importing").

library(tidyverse)
library(tidyproteomics)

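# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/ProteomeDiscoverer_2.5_p97KD_HCT116_proteins.xlsx?download=1"
# make sure the destination folder exists
dir.create("./data", showWarnings = FALSE)
# mode = "wb" keeps the binary .xlsx file intact on Windows
download.file(url, destfile = "./data/pd_proteins.xlsx", mode = "wb")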

# import the data
data_prot <- "./data/pd_proteins.xlsx" %>% import('ProteomeDiscoverer', 'proteins')

#> Origin          ProteomeDiscoverer 
#>                 proteins (10.67 MB) 
#> Composition     6 files 
#>                 2 samples (ctl, p97_kd) 
#> Quantitation    7055 proteins 
#>                 4 log10 dynamic range 
#>                 28.8% missing values 
#>  *imputed        
#> Accounting      (4) num_peptides num_psms num_unique_peptides imputed 
#> Annotations     (9) description biological_process cellular_component molecular_function
#>                 gene_id_entrez gene_name wiki_pathway reactome_pathway
#>                 gene_id_ensemble 
#>
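
Re-running the demo downloads each file again. The following is a minimal sketch of a download step that skips files already on disk. The helper name fetch_once is hypothetical and not part of tidyproteomics; it uses base R only.

```r
# Hypothetical helper (not part of tidyproteomics): download a file only if
# it is not already present, creating the destination directory first.
fetch_once <- function(url, destfile) {
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  if (!file.exists(destfile)) {
    # mode = "wb" keeps binary files such as .xlsx intact on Windows
    download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  }
  # return the local path invisibly so it can be piped into import()
  invisible(destfile)
}
```

With this in place, a call such as fetch_once(url, "./data/pd_proteins.xlsx") could stand in for the download.file() line. Note that the sketch only checks that the file exists, so a partially downloaded file would need to be deleted by hand.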

MaxQuant

Exporting the data from MaxQuant requires some setup so that the correct columns are included. This is explained in vignette("importing").

library(tidyverse)
library(tidyproteomics)

# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/MaxQuant_1.6.10.43_proteinGroups.txt?download=1"
# make sure the destination folder exists
dir.create("./data", showWarnings = FALSE)
download.file(url, destfile = "./data/mq_proteinGroups.txt")

# import the data
data_prot <- "./data/mq_proteinGroups.txt" %>% 
  import('MaxQuant', 'proteins')

#> Origin          MaxQuant 
#>                 proteins (4.78 MB) 
#> Composition     6 files 
#>                 6 samples (ctrl_1, ctrl_2, ctrl_3, p97_kd_1, p97_kd_2, p97_kd_3) 
#> Quantitation    6452 proteins 
#>                 4.3 log10 dynamic range 
#>                 10.4% missing values 
#>  *imputed        
#> Accounting      (4) num_psms num_peptides num_unique_peptides imputed 
#>
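
In the summary above, the MaxQuant import lists each of the 6 files as its own sample, with the replicate number kept as a suffix on the name. The following sketch, in base R only, shows how those names map back to two conditions. It works on a plain character vector copied from the printed output; how to reassign sample names inside the imported object is not shown here.

```r
# Sample names copied from the MaxQuant import summary above
samples <- c("ctrl_1", "ctrl_2", "ctrl_3", "p97_kd_1", "p97_kd_2", "p97_kd_3")

# Drop a trailing "_<digits>" replicate suffix to recover the condition name
conditions <- sub("_[0-9]+$", "", samples)

# Count the files that belong to each condition
table(conditions)
```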

FragPipe

Exporting the data from FragPipe requires some setup so that the correct columns are included. This is explained in vignette("importing").

library(tidyverse)
library(tidyproteomics)

# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/FragPipe_19.1_combined_protein.tsv?download=1"
# make sure the destination folder exists
dir.create("./data", showWarnings = FALSE)
download.file(url, destfile = "./data/fp_combined_protein.tsv")

# import the data
data_prot <- "./data/fp_combined_protein.tsv" %>% 
  import('FragPipe', 'proteins')

#> Origin          FragPipe 
#>                 proteins (6.33 MB) 
#> Composition     6 files 
#>                 2 samples (control, knockdown) 
#> Quantitation    6670 proteins 
#>                 3 log10 dynamic range 
#>                 27.3% missing values 
#>  *imputed        
#> Accounting      (3) num_psms num_psms_unique imputed 
#> Annotations     (2) gene_name description 
#>
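
The three import summaries can be placed side by side for a quick comparison. The sketch below uses base R only, and the values are copied by hand from the printed output in this demo rather than extracted from the imported objects. The column names are illustrative choices, and because the sample grouping differs between the imports, the figures are not strictly like for like.

```r
# Values copied by hand from the three import summaries in this demo
import_summary <- data.frame(
  platform      = c("ProteomeDiscoverer", "MaxQuant", "FragPipe"),
  proteins      = c(7055, 6452, 6670),
  dynamic_range = c(4, 4.3, 3),          # log10 dynamic range
  missing_pct   = c(28.8, 10.4, 27.3)    # percent missing values
)

# Order by the number of proteins reported, largest first
import_summary[order(-import_summary$proteins), ]
```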