A customized read.table() function that checks the conformity of the dataset format and loads the dataset only if all checks pass.
Arguments
- data
the name of the file which the data are to be read from.
- sep
the field separator character.
- na.strings
a character vector of strings which are to be interpreted as NA values.
- labelled_data
a boolean that specifies whether the combiroc data to be loaded is labelled (with 'Class' column) or not.
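As a rough illustration of what sep and na.strings control, the sketch below calls base R's read.table() directly on a small in-memory table. It assumes that load_data() hands these two arguments to read.table() unchanged, which the description suggests but this page does not state. The toy table and its column names are invented for the example.

```r
# Illustrative only: plain read.table(), not load_data().
# The toy table below is made up for this example.
txt <- "Patient.ID;Class;Marker1;Marker2
P1;A;10;
P2;B;;7"

df <- read.table(text = txt, sep = ";", header = TRUE,
                 na.strings = "", stringsAsFactors = FALSE)

# Fields split on ';' and empty fields are read as NA
str(df)
anyNA(df$Marker1)
anyNA(df$Marker2)
```

With sep = ";" each line is split into four fields, and with na.strings = "" the empty fields become NA rather than empty strings.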
Details
The dataset to be analysed should be in text format, which can be comma, tab or semicolon separated:
The 1st column must contain patient/sample IDs as characters.
If the dataset is labelled, the 2nd column must contain the class to which each sample belongs.
The classes must be exactly 2 and they must be written in character format.
From the 3rd column on (2nd if the dataset is unlabelled), the dataset must contain numerical values that represent the signal corresponding to marker abundance in each sample (marker-related columns).
Marker-related columns can be named 'Marker1, Marker2, Marker3, ...' or directly with the gene/protein name, but "-" is not allowed in column names.
Only if all the checks are passed, the function reorders the marker-related columns alphabetically by marker name (necessary for a proper computation of combinations) and forces "Class" as the 2nd column name.
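The rules above can be sketched in base R as follows. This is a minimal, hypothetical helper (check_format is not a combiroc function and is not the package's actual implementation); it only mirrors the documented requirements on an already loaded data.frame, and the toy data are invented.

```r
# Hypothetical helper, NOT part of combiroc: mirrors the documented rules.
check_format <- function(df, labelled_data = TRUE) {
  first_marker <- if (labelled_data) 3 else 2
  if (ncol(df) < first_marker) stop("no marker columns found")
  if (!is.character(df[[1]])) stop("the 1st column must contain IDs as characters")

  if (labelled_data) {
    classes <- unique(na.omit(df[[2]]))
    if (!is.character(df[[2]]) || length(classes) != 2)
      stop("the 2nd column must contain exactly 2 classes in character format")
  }

  markers <- df[, first_marker:ncol(df), drop = FALSE]
  if (!all(vapply(markers, is.numeric, logical(1))))
    stop("marker columns must be numeric")
  if (any(grepl("-", names(markers), fixed = TRUE)))
    stop("'-' is not allowed in marker column names")

  # All checks passed: reorder marker columns alphabetically by name
  markers <- markers[, order(names(markers)), drop = FALSE]
  out <- cbind(df[, seq_len(first_marker - 1), drop = FALSE], markers)
  # Force 'Class' as the 2nd column name for labelled data
  if (labelled_data) names(out)[2] <- "Class"
  out
}

# Toy labelled dataset (invented values)
toy <- data.frame(ID = c("s1", "s2", "s3"),
                  Group = c("A", "B", "A"),
                  TP53 = c(1.2, 3.4, 2.2),
                  BRCA1 = c(0.5, 0.7, 0.1),
                  stringsAsFactors = FALSE)
checked <- check_format(toy)
names(checked)
```

For this toy input the returned column names are ID, Class, BRCA1, TP53: the marker columns are sorted alphabetically and the second column is renamed to "Class". A marker column named with a "-" (for example "TP-53") makes the helper stop with an error instead of returning data.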
Examples
if (FALSE) {
demo_data # combiroc built-in demo data (proteomics data from Zingaretti et al. 2012 - PMC3518104)
# save a data.frame as a csv file to be loaded by the combiroc package
file <- tempfile()
write.csv2(demo_data, file = file, row.names = FALSE)
# load a csv file, provided it is correctly formatted
demo_data <- load_data(data = file, sep = ';', na.strings = "")
demo_unclassified_data # combiroc built-in unclassified demo data
# save a data.frame as a csv file to be loaded by the combiroc package
file <- tempfile()
write.csv2(demo_unclassified_data, file = file, row.names = FALSE)
# load an unclassified dataset
demo_unclassified_data <- load_data(data = file, sep = ';', na.strings = "", labelled_data = FALSE)
}