Importing Demo
The following is a demonstration workflow for importing data from a
public repository. For an additional importing demonstration, see
vignette("subsetting"). All of the examples presented here are from
previously published data, where the raw data were analyzed on each
platform with basic settings for mass tolerance and PTMs.
ProteomeDiscoverer
Exporting the data requires some setup to obtain the correct columns,
as explained in vignette("importing").
library(tidyverse)
library(tidyproteomics)
# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/ProteomeDiscoverer_2.5_p97KD_HCT116_proteins.xlsx?download=1"
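# create the destination folder if it does not exist yet
dir.create("./data", showWarnings = FALSE)
# mode = "wb" writes the .xlsx in binary mode so it is not corrupted on Windows
download.file(url, destfile = "./data/pd_proteins.xlsx", mode = "wb")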
# import the data
data_prot <- "./data/pd_proteins.xlsx" %>% import('ProteomeDiscoverer', 'proteins')
#> Origin          ProteomeDiscoverer
#>                 proteins (10.67 MB)
#> Composition     6 files
#>                 2 samples (ctl, p97_kd)
#> Quantitation    7055 proteins
#>                 4 log10 dynamic range
#>                 28.8% missing values
#> *imputed
#> Accounting      (4) num_peptides num_psms num_unique_peptides imputed
#> Annotations     (9) description biological_process cellular_component molecular_function
#>                 gene_id_entrez gene_name wiki_pathway reactome_pathway
#>                 gene_id_ensemble
#>
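The three downloads in this demo follow the same pattern, so they could be wrapped in a small helper. The following is a minimal sketch in base R, not part of tidyproteomics: the function name `fetch_demo_file` and its `overwrite` argument are hypothetical. It creates the destination folder when it is missing, skips the download when the file is already present, and writes in binary mode so that files such as .xlsx survive intact on Windows.

```r
# Hypothetical helper (not part of tidyproteomics): download a file once,
# creating the destination directory when it is missing.
fetch_demo_file <- function(url, destfile, overwrite = FALSE) {
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  if (overwrite || !file.exists(destfile)) {
    # mode = "wb" keeps binary files such as .xlsx intact on Windows
    status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
    if (status != 0) stop("download failed for ", url)
  }
  # return the local path invisibly so the call can be piped onward
  invisible(destfile)
}

# Usage with the ProteomeDiscoverer file from this demo (needs network access):
# data_prot <- fetch_demo_file(url, "./data/pd_proteins.xlsx") %>%
#   import('ProteomeDiscoverer', 'proteins')
```

Because the helper returns the local path, it can feed directly into the import call used throughout this demo.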
MaxQuant
Exporting the data requires some setup to obtain the correct columns,
as explained in vignette("importing").
library(tidyverse)
library(tidyproteomics)
# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/MaxQuant_1.6.10.43_proteinGroups.txt?download=1"
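# create the destination folder if it does not exist yet
dir.create("./data", showWarnings = FALSE)
download.file(url, destfile = "./data/mq_proteinGroups.txt")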
# import the data
data_prot <- "./data/mq_proteinGroups.txt" %>%
  import('MaxQuant', 'proteins')
#> Origin          MaxQuant
#>                 proteins (4.78 MB)
#> Composition     6 files
#>                 6 samples (ctrl_1, ctrl_2, ctrl_3, p97_kd_1, p97_kd_2, p97_kd_3)
#> Quantitation    6452 proteins
#>                 4.3 log10 dynamic range
#>                 10.4% missing values
#> *imputed
#> Accounting      (4) num_psms num_peptides num_unique_peptides imputed
#>
FragPipe
Exporting the data requires some setup to obtain the correct columns,
as explained in vignette("importing").
library(tidyverse)
library(tidyproteomics)
# download the data
url <- "https://data.caltech.edu/records/aevwq-2ps50/files/FragPipe_19.1_combined_protein.tsv?download=1"
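# create the destination folder if it does not exist yet
dir.create("./data", showWarnings = FALSE)
download.file(url, destfile = "./data/fp_combined_protein.tsv")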
# import the data
data_prot <- "./data/fp_combined_protein.tsv" %>%
  import('FragPipe', 'proteins')
#> Origin          FragPipe
#>                 proteins (6.33 MB)
#> Composition     6 files
#>                 2 samples (control, knockdown)
#> Quantitation    6670 proteins
#>                 3 log10 dynamic range
#>                 27.3% missing values
#> *imputed
#> Accounting      (3) num_psms num_psms_unique imputed
#> Annotations     (2) gene_name description
#>
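The import summaries printed in this demo can be gathered into a small table for a side-by-side view. The sketch below uses base R only. The values are transcribed by hand from the printed output rather than computed from the imported objects, and the data frame name `import_summary` and its column names are illustrative.

```r
# Values copied by hand from the import summaries printed in this demo
import_summary <- data.frame(
  platform      = c("ProteomeDiscoverer", "MaxQuant", "FragPipe"),
  files         = c(6, 6, 6),
  samples       = c(2, 6, 2),
  proteins      = c(7055, 6452, 6670),
  dynamic_range = c(4, 4.3, 3),         # log10
  missing_pct   = c(28.8, 10.4, 27.3)   # percent missing values
)

# order the platforms by the number of quantified proteins, largest first
import_summary[order(-import_summary$proteins), ]
```

Judging from the sample names in the printed output, the MaxQuant import appears to list six samples because each replicate carries its own name, whereas the other two imports group the six files into two samples.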