Main method for imputing missing values
Arguments
- data
a tidyproteomics list data-object
- .function
summary statistic function. Default is base::min; other options include max, mean, and sum. Note that NAs are removed in the function call.
- method
a character string indicating the imputation method (row, column, or matrix). Consider a data matrix with peptides/proteins as rows and datasets as columns. The 'row' method imputes across samples, using the values observed for a given peptide/protein, while the 'column' method imputes within a single dataset of values. The 'randomforest' function imputes using data from all rows and columns (the "matrix"), without bias toward sample groups. If the imputation is biased toward sample groups, expression differences will be biased toward those groups as well. If sample groups should be biased (as with a gene deletion), impute using the min function and the 'within' method.
- group_by_sample
a boolean indicating whether the data should be grouped by sample name, which biases the imputation to within each sample.
- cores
the number of threads used to speed up the calculation
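The difference between the 'row' and 'column' methods described under method can be sketched in base R on a toy matrix. This is a minimal illustration of the idea only, not the package's implementation: the abundance matrix and the fill_na helper are hypothetical and are not part of tidyproteomics.

```r
# Toy abundance matrix: proteins in rows, samples (datasets) in columns.
abundance <- matrix(
  c(10, NA, 14,
    NA, 20, 22,
    30, 31, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("s1", "s2", "s3"))
)

# Hypothetical helper (not a tidyproteomics function): replace the NAs in a
# vector with a summary statistic computed from its observed values.
fill_na <- function(x, .function = base::min) {
  x[is.na(x)] <- .function(x[!is.na(x)])
  x
}

# 'row': each protein is imputed from its own values across samples.
# apply() over rows returns the results as columns, so transpose back.
by_row <- t(apply(abundance, 1, fill_na))

# 'column': each sample is imputed from the values within that sample.
by_column <- apply(abundance, 2, fill_na)
```

With min as the summary function, the missing P1 value in sample s2 becomes 10 under the row approach (the smallest observed P1 value) and 20 under the column approach (the smallest value observed in s2).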
Examples
library(dplyr, warn.conflicts = FALSE)
library(tidyproteomics)
hela_proteins %>% summary("sample")
#>
#> ── Summary: sample ──
#>
#> sample proteins peptides peptides_unique quantifiable CVs
#> control 7055 66329 58706 0.908 0.16
#> knockdown 7055 66329 58706 0.909 0.21
#>
hela_proteins %>% impute(.function = stats::median) %>% summary("sample")
#> ℹ Imputing by row using the function base::quote function (x, na.rm = FALSE, ..…
#> ✔ Imputing by row using the function base::quote function (x, na.rm = FALSE, ..…
#>
#> ℹ ... 1919 values imputed
#>
#> ── Summary: sample ──
#>
#> sample proteins peptides peptides_unique quantifiable CVs
#> control 7055 66329 58706 0.931 0.16
#> knockdown 7055 66329 58706 0.931 0.20
#>
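# impute.randomforest expects a matrix; under the default 'row' method it
# receives a vector, so this call fails with the error shown below
hela_proteins %>% impute(.function = impute.randomforest) %>% summary("sample")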
#> ℹ Imputing by row using the function base::quote function (matrix = NULL, cores…
#> Error in .f(as.vector(stats::na.omit(x))): imput data must be a matrix
#> ✖ Imputing by row using the function base::quote function (matrix = NULL, cores…
#>