A customized read.table() function that checks whether the dataset format conforms to the requirements described below and loads the data only if all checks pass.

Usage

load_data(data, sep = ";", na.strings = "", labelled_data = TRUE)

Arguments

data

the name of the file from which the data are to be read.

sep

the field separator character.

na.strings

a character vector of strings which are to be interpreted as NA values.

labelled_data

a boolean that specifies whether the combiroc data to be loaded are labelled (i.e. include a 'Class' column) or not.

Value

a data frame (data.frame) containing a representation of the data in the file.

Details

The dataset to be analysed should be in text format, which can be comma, tab or semicolon separated:

  • The 1st column must contain patient/sample IDs as characters.

  • If the dataset is labelled, the 2nd column must contain the class to which each sample belongs.

  • There must be exactly 2 classes, and they must be written in character format.

  • From the 3rd column on (2nd if the dataset is unlabelled), the dataset must contain numerical values that represent the signal corresponding to each marker's abundance in each sample (marker-related columns).

  • Marker-related columns can be named 'Marker1, Marker2, Marker3, ...' or directly after the gene/protein name, but "-" is not allowed in column names.

Only if all the checks are passed, the function reorders the marker-related columns alphabetically by marker name (necessary for a proper computation of combinations) and forces "Class" as the name of the 2nd column.
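The format rules listed above can be approximated in base R. The sketch below is not combiroc's implementation: `check_format()` is a hypothetical helper written only for illustration, the toy data are invented, and the final reordering step merely mimics the behaviour described in this section.

```r
# Hypothetical helper (not part of combiroc): returns TRUE if `df` follows
# the layout described above, otherwise stops with an explanatory error.
check_format <- function(df, labelled_data = TRUE) {
  first_marker <- if (labelled_data) 3 else 2
  if (ncol(df) < first_marker) stop("no marker columns found")

  # 1st column: sample IDs stored as characters
  if (!is.character(df[[1]])) stop("1st column must contain character IDs")

  if (labelled_data) {
    # 2nd column: exactly two classes, stored as characters
    if (!is.character(df[[2]])) stop("2nd column must contain character classes")
    if (length(unique(stats::na.omit(df[[2]]))) != 2) stop("classes must be exactly 2")
  }

  markers <- df[first_marker:ncol(df)]
  # Marker columns: numeric values only, and no "-" in their names
  if (!all(vapply(markers, is.numeric, logical(1)))) stop("marker columns must be numeric")
  if (any(grepl("-", names(markers), fixed = TRUE))) stop('"-" is not allowed in marker names')

  TRUE
}

# Toy data, invented for illustration only
toy <- data.frame(
  Patient.ID = c("P1", "P2", "P3", "P4"),
  Class      = c("A", "A", "B", "B"),
  Marker2    = c(1.5, 2.0, 8.1, 7.4),
  Marker1    = c(10, 12, 3, 4),
  stringsAsFactors = FALSE
)
ok <- check_format(toy)

# Mimic the post-check step: sort marker columns by name, force "Class" as 2nd name
toy <- toy[c(1, 2, 2 + order(names(toy)[-(1:2)]))]
names(toy)[2] <- "Class"
```

Stopping at the first failed check keeps the sketch short; a real loader may well report every problem at once.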

Examples

if (FALSE) {
demo_data # combiroc built-in demo data (proteomics data from Zingaretti et al. 2012 - PMC3518104)

# save a data.frame as a csv to be loaded by the combiroc package
file <- tempfile()
write.csv2(demo_data, file = file, row.names = FALSE)


# To load a correctly formatted csv file

demo_data <- load_data(data = file, sep = ';', na.strings = "")


demo_unclassified_data # combiroc built-in unclassified demo data

# save a data.frame as a csv to be loaded by the combiroc package
file <- tempfile()
write.csv2(demo_unclassified_data, file = file, row.names = FALSE)

# To load an unclassified dataset.

demo_unclassified_data <- load_data(data = file, sep = ';', na.strings = "", labelled_data = FALSE)
}
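For a dataset that is not built in, the expected file layout can be illustrated by writing a small table by hand. The names and values below are invented, and the read-back uses base `read.table()` as a stand-in so that the snippet runs without combiroc installed; with the package loaded, the same path could presumably be passed to `load_data()` with `sep = ";"`.

```r
# Invented toy file: IDs, Class, then numeric marker columns, semicolon-separated
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Patient.ID;Class;Marker1;Marker2",
  "P1;A;10;1.5",
  "P2;A;12;",        # empty field: read as NA because na.strings = ""
  "P3;B;3;8.1",
  "P4;B;4;7.4"
), path)

# Base R stand-in for the reading step (assumes the file has a header row)
raw <- read.table(path, header = TRUE, sep = ";", na.strings = "",
                  stringsAsFactors = FALSE)
str(raw)
```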