R/mutate_variable_na.R
mutate_variable_na_freq.Rd
Adds a new column, na_freq, to the variable_info slot of a mass_dataset object. The column holds the frequency (proportion, between 0 and 1) of NA (Not Available) values for each variable, computed across the samples specified in according_to_samples.
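The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper na_freq_sketch() and the toy data are hypothetical, and the sketch assumes that variables are rows, samples are columns, and "all" means every sample column.

```r
# Toy expression data (hypothetical): rows are variables, columns are samples.
expression_data <- data.frame(
  QC_1 = c(1.2, NA, 3.1),
  QC_2 = c(NA, NA, 2.9),
  S_1  = c(0.8, 4.0, NA),
  S_2  = c(1.1, 3.7, 3.0),
  row.names = c("V1", "V2", "V3")
)

# Hypothetical helper, not part of massdataset:
# proportion of NA values per variable across the chosen samples.
na_freq_sketch <- function(expression_data, according_to_samples = "all") {
  # Assumption: "all" selects every sample column.
  if (identical(according_to_samples, "all")) {
    according_to_samples <- colnames(expression_data)
  }
  # is.na() gives a logical matrix; the row mean is the NA proportion.
  rowMeans(is.na(expression_data[, according_to_samples, drop = FALSE]))
}

na_all <- na_freq_sketch(expression_data)
na_qc  <- na_freq_sketch(expression_data, c("QC_1", "QC_2"))
```

Restricting the calculation to a subset of samples only changes which columns enter the row mean, which is why the same variable can have different frequencies for all samples and for QC samples alone.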
mutate_variable_na_freq(object, according_to_samples = "all")
A modified mass_dataset object with an updated variable_info slot.
library(massdataset)

data("expression_data")
data("sample_info")
data("variable_info")

object =
  create_mass_dataset(
    expression_data = expression_data,
    sample_info = sample_info,
    variable_info = variable_info
  )
object
#> --------------------
#> massdataset version: 1.0.33
#> --------------------
#> 1.expression_data:[ 1000 x 8 data.frame]
#> 2.sample_info:[ 8 x 4 data.frame]
#> 8 samples:Blank_3 Blank_4 QC_1 ... PS4P3 PS4P4
#> 3.variable_info:[ 1000 x 3 data.frame]
#> 1000 variables:M136T55_2_POS M79T35_POS M307T548_POS ... M232T937_POS M301T277_POS
#> 4.sample_info_note:[ 4 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> --------------------
#> Processing information
#> 1 processings in total
#> create_mass_dataset ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2024-09-06 08:49:54
## calculate NA frequency according to all the samples
object2 =
  mutate_variable_na_freq(object = object)
head(extract_variable_info(object))
#> variable_id mz rt
#> 1 M136T55_2_POS 136.06140 54.97902
#> 2 M79T35_POS 79.05394 35.36550
#> 3 M307T548_POS 307.14035 547.56641
#> 4 M183T224_POS 183.06209 224.32777
#> 5 M349T47_POS 349.01584 47.00262
#> 6 M182T828_POS 181.99775 828.35712
head(extract_variable_info(object2))
#> variable_id mz rt na_freq
#> 1 M136T55_2_POS 136.06140 54.97902 0.250
#> 2 M79T35_POS 79.05394 35.36550 0.250
#> 3 M307T548_POS 307.14035 547.56641 0.375
#> 4 M183T224_POS 183.06209 224.32777 0.750
#> 5 M349T47_POS 349.01584 47.00262 0.250
#> 6 M182T828_POS 181.99775 828.35712 0.125
## calculate NA frequency according to only QC samples
object3 =
  mutate_variable_na_freq(
    object = object2,
    according_to_samples =
      get_sample_id(object)[extract_sample_info(object)$class == "QC"]
  )
object3
#> --------------------
#> massdataset version: 1.0.33
#> --------------------
#> 1.expression_data:[ 1000 x 8 data.frame]
#> 2.sample_info:[ 8 x 4 data.frame]
#> 8 samples:Blank_3 Blank_4 QC_1 ... PS4P3 PS4P4
#> 3.variable_info:[ 1000 x 5 data.frame]
#> 1000 variables:M136T55_2_POS M79T35_POS M307T548_POS ... M232T937_POS M301T277_POS
#> 4.sample_info_note:[ 4 x 2 data.frame]
#> 5.variable_info_note:[ 5 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> --------------------
#> Processing information
#> 2 processings in total
#> create_mass_dataset ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2024-09-06 08:49:54
#> mutate_variable_na_freq ----------
#> Package Function.used Time
#> 1 massdataset mutate_variable_na_freq() 2024-09-06 08:49:54.626769
#> 2 massdataset mutate_variable_na_freq() 2024-09-06 08:49:54.629563
head(extract_variable_info(object3))
#> variable_id mz rt na_freq na_freq.1
#> 1 M136T55_2_POS 136.06140 54.97902 0.250 0.0
#> 2 M79T35_POS 79.05394 35.36550 0.250 0.0
#> 3 M307T548_POS 307.14035 547.56641 0.375 0.0
#> 4 M183T224_POS 183.06209 224.32777 0.750 1.0
#> 5 M349T47_POS 349.01584 47.00262 0.250 0.0
#> 6 M182T828_POS 181.99775 828.35712 0.125 0.5
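A common follow-up is to use the two frequency columns to decide which variables to keep. The sketch below works on a plain data frame that copies the six rows printed above, so it runs without the package. The 0.5 cutoff and the rule of requiring both columns to pass are assumptions chosen for illustration, not recommendations from the package.

```r
# First six rows of the variable_info printed above, as a plain data frame.
variable_info <- data.frame(
  variable_id = c("M136T55_2_POS", "M79T35_POS", "M307T548_POS",
                  "M183T224_POS", "M349T47_POS", "M182T828_POS"),
  na_freq   = c(0.250, 0.250, 0.375, 0.750, 0.250, 0.125),  # all samples
  na_freq.1 = c(0.0, 0.0, 0.0, 1.0, 0.0, 0.5)               # QC samples only
)

# Assumed cutoff: keep variables missing in fewer than half of the samples,
# both overall and within the QC samples.
cutoff <- 0.5
keep <- variable_info$na_freq < cutoff & variable_info$na_freq.1 < cutoff

kept_ids <- variable_info$variable_id[keep]
kept_ids
```

With these rows, M183T224_POS fails both conditions and M182T828_POS fails the QC condition because 0.5 is not strictly below the cutoff.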