
A function that takes data in long format as input and shows how the signal intensity values of the markers are distributed.

Usage

markers_distribution(
  data_long,
  min_SE = 0,
  min_SP = 0,
  x_lim = NULL,
  y_lim = NULL,
  boxplot_lim = NULL,
  signalthr_prediction = FALSE,
  case_class
)

Arguments

data_long

a data.frame in long format returned by combiroc_long()

min_SE

a numeric that specifies the minimum sensitivity (SE) that a threshold must have to be included in $Coord.

min_SP

a numeric that specifies the minimum specificity (SP) that a threshold must have to be included in $Coord.

x_lim

a numeric setting the max value of x that will be visualized in the density plot (zoom only, no data loss).

y_lim

a numeric setting the max value of y that will be visualized in the density plot (zoom only, no data loss).

boxplot_lim

a numeric setting the max value of y that will be visualized in the boxplot (zoom only, no data loss).

signalthr_prediction

a boolean that specifies whether the density plot will also show the "suggested signal threshold".

case_class

a character that specifies which of the two classes of the dataset is the case class.

Value

a named list containing 'Coord' and 'Density_summary' data.frames, and 'ROC', 'Boxplot' and 'Density_plot' plot objects.

Details

This function returns a named list containing the following objects:

  • "Density_plot": a density plot showing the distribution of the signal intensity values for both classes.

  • "Density_summary": a data.frame showing summary statistics of the distributions.

  • "ROC": a ROC curve showing how many real positive samples would be found positive (SE) and how many real negative samples would be found negative (SP) as a function of the signal threshold. NB: these SE and SP refer to the signal intensity threshold considering all the markers together; they are NOT equal to the SE/SP of a single marker/combination found with se_sp().

  • "Coord": a data.frame that contains the coordinates of the "ROC" described above (threshold, SP and SE) that have at least a min SE (min_SE, 0 by default) and a min SP (min_SP, 0 by default).

  • "Boxplot": a boxplot showing the distribution of the signal intensity values of each marker individually, for both classes.
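
To make the "ROC" and "Coord" items more concrete, the following base R sketch computes SE and SP at a few signal thresholds with all markers pooled together. It is only an illustration under stated assumptions: the toy data, the column names (Class, Markers, Values), the helper se_sp_at() and the "value >= threshold means positive" rule are all hypothetical, and the package's own computation may differ.

```r
# Toy long-format data. Column names are assumed for illustration only.
toy_long <- data.frame(
  Class   = rep(c("A", "B"), each = 4),
  Markers = rep(c("M1", "M2"), times = 4),
  Values  = c(900, 700, 500, 300, 400, 200, 100, 50)
)
case_class <- "A"

# Hypothetical helper: SE and SP (in percent) at one signal threshold,
# pooling the values of all markers together.
se_sp_at <- function(data, thr, case_class) {
  is_case  <- data$Class == case_class
  positive <- data$Values >= thr          # assumed decision rule
  c(threshold = thr,
    SE = 100 * mean(positive[is_case]),   # cases called positive
    SP = 100 * mean(!positive[!is_case])) # controls called negative
}

# One row per candidate threshold, similar in spirit to "Coord".
roc_coords <- as.data.frame(t(sapply(c(150, 350, 600), se_sp_at,
                                     data = toy_long,
                                     case_class = case_class)))
roc_coords
```

Raising the threshold lowers SE and raises SP in this toy data, which is the trade-off the "ROC" object visualizes.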

If no threshold is known a priori, the user can set signalthr_prediction = TRUE. The function then provides a "suggested signal threshold", which corresponds to the median of the signal threshold values (in "Coord") at which SE and SP are greater than or equal to their set minimum values (min_SE and min_SP), and adds this threshold to the "Density_plot" object as a dashed black line. Using the median picks a threshold whose SE/SP are not too close to the limits (min_SE and min_SP), but it is recommended to always inspect "Coord" and choose the most appropriate signal threshold by considering SP, SE and the Youden index.
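
The selection rule described above can be sketched in base R as follows. This is a minimal illustration, not the package's code: the coord table, its column names and its values are made up, and the Youden index is computed on the percent scale as SE + SP - 100.

```r
# Hypothetical "Coord"-like table. Names and values are illustrative only.
coord <- data.frame(
  threshold   = c(100, 200, 300, 400, 500),
  specificity = c(60, 82, 88, 93, 97),
  sensitivity = c(95, 90, 75, 55, 35)
)
min_SE <- 40
min_SP <- 80

# Keep only the thresholds that satisfy both minima.
kept <- coord[coord$sensitivity >= min_SE & coord$specificity >= min_SP, ]

# Suggested signal threshold: median of the retained thresholds.
suggested_thr <- median(kept$threshold)

# Youden index on the percent scale, as an alternative selection criterion.
kept$youden <- kept$sensitivity + kept$specificity - 100
best_by_youden <- kept$threshold[which.max(kept$youden)]

suggested_thr
best_by_youden
```

In this made-up table the median and the Youden-optimal threshold do not coincide, which is why inspecting "Coord" directly is still worthwhile.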

Examples

if (FALSE) {
demo_data # combiroc built-in demo data (proteomics data from Zingaretti et al. 2012 - PMC3518104)

demo_data_long <- combiroc_long(data = demo_data) # long format data




# To visualize the distribution of the expression of each marker.

distributions <- markers_distribution(data_long = demo_data_long,
                                       boxplot_lim = 1500, y_lim = 0.001,
                                       x_lim = 3000, signalthr_prediction = FALSE,
                                       case_class = 'A', min_SE = 40, min_SP = 80)

distributions$Density_plot # density plot
distributions$Density_summary # summary statistics of density plot
distributions$ROC # ROC showing signal threshold range ensuring min SE and/or SP
distributions$Coord # ROC values
distributions$Boxplot # Boxplot
}