caROC is an R package devoted to the assessment of continuous biomarkers.
The metrics considered include specificity at a controlled sensitivity
level, sensitivity at a controlled specificity level, and the receiver
operating characteristic (ROC) curve. If evaluation in a specific
sub-population is of interest, all of these statistics can also be
computed in a sub-population-specific analysis. Both categorical and
continuous covariates can be adjusted for when computing these metrics.
Any caROC questions should be posted to the GitHub Issues section of the caROC homepage at https://github.com/ziyili20/caROC/issues.
library(caROC)
## get specificity at controlled sensitivity levels 0.2, 0.8, 0.9
caROC(diseaseData,controlData,formula,
control_sensitivity = c(0.2,0.8, 0.9),
control_specificity = NULL)
## get covariate-adjusted ROC curve with curve-based monotonizing method
curveROC <- caROC(diseaseData,controlData,formula,
mono_resp_method = "ROC",
verbose = FALSE)
The tutorial is based on a simulated dataset:
library(caROC)
### n1: number of cases
### n0: number of controls
n1 = n0 = 1000
## Z_D and Z_C are the covariates in the disease and control groups
Z_D1 <- rbinom(n1, size = 1, prob = 0.3)
Z_D2 <- rnorm(n1, 0.8, 1)
Z_C1 <- rbinom(n0, size = 1, prob = 0.7)
Z_C2 <- rnorm(n0, 0.8, 1)
Y_C_Z0 <- rnorm(n0, 0.1, 1)
Y_D_Z0 <- rnorm(n1, 1.1, 1)
Y_C_Z1 <- rnorm(n0, 0.2, 1)
Y_D_Z1 <- rnorm(n1, 0.9, 1)
## M0 and M1 are the outcome of interest (biomarker to be evaluated) in the control and disease groups
M0 <- Y_C_Z0 * (Z_C1 == 0) + Y_C_Z1 * (Z_C1 == 1) + Z_C2
M1 <- Y_D_Z0 * (Z_D1 == 0) + Y_D_Z1 * (Z_D1 == 1) + 1.5 * Z_D2
diseaseData <- data.frame(M = M1, Z1 = Z_D1, Z2 = Z_D2)
controlData <- data.frame(M = M0, Z1 = Z_C1, Z2 = Z_C2)
## we are interested in evaluating biomarker M while adjusting for covariate Z
userFormula = "M~Z1+Z2"
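As a point of reference for the covariate-adjusted estimates, the unadjusted version of "specificity at controlled sensitivity" can be computed from empirical quantiles. The sketch below uses base R only and its own small simulated dataset. The helper `spec_at_sens()` is hypothetical and is not part of caROC, and the code assumes that larger marker values indicate disease.

```r
# Unadjusted, empirical specificity at controlled sensitivity levels.
# spec_at_sens() is a hypothetical helper, not a caROC function.
spec_at_sens <- function(cases, controls, sens) {
  # Threshold above which a fraction `sens` of the cases falls
  thr <- quantile(cases, probs = 1 - sens, names = FALSE)
  # Specificity: fraction of controls at or below each threshold
  vapply(thr, function(t) mean(controls <= t), numeric(1))
}

set.seed(2024)
cases    <- rnorm(1000, mean = 1, sd = 1)  # marker values in the disease group
controls <- rnorm(1000, mean = 0, sd = 1)  # marker values in the control group

sens_levels <- c(0.2, 0.8, 0.9)
spec <- spec_at_sens(cases, controls, sens_levels)
names(spec) <- sens_levels
spec
```

Raising the controlled sensitivity lowers the threshold, so the resulting specificity can only stay the same or decrease.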
One can easily compute covariate-adjusted specificity at controlled
sensitivity levels by specifying control_sensitivity and leaving
control_specificity as NULL. mono_resp_method chooses which monotonicity
restoration method to use, “none” or “ROC”. whichSE chooses how the
standard error is computed: “bootstrap” or “sample”, i.e. bootstrap-based
or sample-based SE. Try ?caROC to see more details of these arguments.
caROC(diseaseData,controlData,userFormula,
control_sensitivity = c(0.2,0.8, 0.9),
control_specificity = NULL,
mono_resp_method = "ROC",
whichSE = "bootstrap",nbootstrap = 100,
CI_alpha = 0.95, logit_CI = TRUE)
#> $Estimate
#> [1] 0.954000 0.625000 0.476256
#>
#> $SE
#> [1] 0.009147915 0.027733989 0.026602383
#>
#> $ConfidenceInterval
#> 0.2 0.8 0.9
#> LogitCI_Lower 0.9323559 0.5692765 0.4245610
#> LogitCI_Upper 0.9689493 0.6775972 0.5284649
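The interval printed above is numerically consistent with a delta-method confidence interval built on the logit scale and then transformed back. The sketch below reproduces it from the printed `$Estimate` and `$SE` values. This is a reconstruction under that assumption, not code taken from the package, and `logit_ci()` is a hypothetical helper name.

```r
# Delta-method confidence interval on the logit scale (illustrative sketch).
# logit_ci() is a hypothetical helper, not a caROC function.
logit_ci <- function(est, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  # SE of logit(est) by the delta method: se / (est * (1 - est))
  se_logit <- se / (est * (1 - est))
  rbind(LogitCI_Lower = plogis(qlogis(est) - z * se_logit),
        LogitCI_Upper = plogis(qlogis(est) + z * se_logit))
}

# Values copied from the output above
est <- c(0.954000, 0.625000, 0.476256)
se  <- c(0.009147915, 0.027733989, 0.026602383)

ci <- logit_ci(est, se)
colnames(ci) <- c(0.2, 0.8, 0.9)
round(ci, 4)
```

Working on the logit scale keeps both limits inside (0, 1), which matters for estimates close to 1 such as the first one here.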
To compute covariate-adjusted sensitivity at controlled specificity
levels, specify control_specificity and leave control_sensitivity as NULL.
caROC(diseaseData,controlData,userFormula,
control_sensitivity = NULL,
control_specificity = c(0.7,0.8, 0.9),
mono_resp_method = "none",
whichSE = "sample",nbootstrap = 100,
CI_alpha = 0.95, logit_CI = TRUE)
#> $Estimate
#> [1] 0.716 0.614 0.488
#>
#> $SE
#> [1] 0.02129188 0.02116610 0.02865058
#>
#> $ConfidenceInterval
#> 0.7 0.8 0.9
#> LogitCI_Lower 0.6724927 0.5717805 0.4322308
#> LogitCI_Upper 0.7558262 0.6545717 0.5440695
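The whichSE and nbootstrap arguments refer to bootstrap standard errors. For readers unfamiliar with the idea, the sketch below shows a plain nonparametric bootstrap SE for an unadjusted sensitivity at controlled specificity, in base R on its own simulated data. It makes no claim about the resampling scheme caROC uses internally, and `sens_at_spec()` is a hypothetical helper.

```r
# Nonparametric bootstrap SE for unadjusted sensitivity at controlled specificity.
# sens_at_spec() is a hypothetical helper, not a caROC function.
sens_at_spec <- function(cases, controls, spec) {
  # Threshold below which a fraction `spec` of the controls falls
  thr <- quantile(controls, probs = spec, names = FALSE)
  # Sensitivity: fraction of cases above each threshold
  vapply(thr, function(t) mean(cases > t), numeric(1))
}

set.seed(11)
cases    <- rnorm(500, mean = 1, sd = 1)
controls <- rnorm(500, mean = 0, sd = 1)
spec_levels <- c(0.7, 0.8, 0.9)
nbootstrap  <- 200

# Resample cases and controls separately, with replacement, and re-estimate
boot_est <- replicate(nbootstrap, {
  sens_at_spec(sample(cases, replace = TRUE),
               sample(controls, replace = TRUE),
               spec_levels)
})

# One row per specificity level; the SD across replicates is the bootstrap SE
boot_se <- apply(boot_est, 1, sd)
names(boot_se) <- spec_levels
boot_se
```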
Given the covariates of a subpopulation, we can also compute specificity at controlled sensitivity levels for that subpopulation.
target_covariates = c(1, 0.7, 0.9)
sscaROC(diseaseData,controlData,
userFormula = userFormula,
control_sensitivity = c(0.2,0.8, 0.9),
target_covariates = target_covariates,
control_specificity = NULL,
mono_resp_method = "none",
whichSE = "sample",nbootstrap = 100,
CI_alpha = 0.95, logit_CI = TRUE)
#> $Estimate
#> [1] 0.9776987 0.6563959 0.4826266
#>
#> $SE
#> [1] 0.005668176 0.030349725 0.037339967
#>
#> $ConfidenceInterval
#> 0.2 0.8 0.9
#> LogitCI_Lower 0.9634219 0.5947248 0.4103266
#> LogitCI_Upper 0.9864813 0.7132080 0.5556614
You can also specify covariates for multiple subpopulations:
target_covariates = matrix(c(1, 0.7, 0.9,
1, 0.8, 0.8), 2, 3, byrow = TRUE)
sscaROC(diseaseData,controlData,
userFormula = userFormula,
control_sensitivity = c(0.2,0.8, 0.9),
target_covariates = target_covariates,
control_specificity = NULL,
mono_resp_method = "none",
whichSE = "sample",nbootstrap = 100,
CI_alpha = 0.95, logit_CI = TRUE)
#> $Estimate
#> $Estimate$`1_0.7_0.9`
#> [1] 0.9730656 0.6206365 0.4418030
#>
#> $Estimate$`1_0.8_0.8`
#> [1] 0.9730656 0.6206365 0.4418030
#>
#>
#> $SE
#> $SE$`1_0.7_0.9`
#> [1] 0.005700831 0.035914054 0.042258141
#>
#> $SE$`1_0.8_0.8`
#> [1] 0.005700831 0.035914054 0.042258141
#>
#>
#> $ConfidenceInterval
#> $ConfidenceInterval$`1_0.7_0.9`
#> 0.2 0.8 0.9
#> LogitCI_Lower 0.9593293 0.5481717 0.3613071
#> LogitCI_Upper 0.9822484 0.6880922 0.5254779
#>
#> $ConfidenceInterval$`1_0.8_0.8`
#> 0.2 0.8 0.9
#> LogitCI_Lower 0.9593293 0.5481717 0.3613071
#> LogitCI_Upper 0.9822484 0.6880922 0.5254779
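Each row of target_covariates in the call above starts with 1, which suggests that a row is a design-matrix row (intercept, Z1, Z2) for userFormula. Under that assumption, the matrix can be built from a data frame of subpopulation covariates with `model.matrix()`, and the result names seen in the output match the row values joined by underscores. This is a base R sketch, and the `subpops` data frame is introduced here only for illustration.

```r
# Build a target_covariates matrix from subpopulation covariates (sketch).
# Assumes the leading 1 in each row is the model intercept.
subpops <- data.frame(Z1 = c(0.7, 0.8),
                      Z2 = c(0.9, 0.8))

target_covariates <- unname(model.matrix(~ Z1 + Z2, data = subpops))
attr(target_covariates, "assign") <- NULL  # drop model.matrix bookkeeping
target_covariates

# Labels of the same form as the list names in the output above
subpop_labels <- apply(target_covariates, 1, paste, collapse = "_")
subpop_labels
```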
The covariate-adjusted ROC curve, with sensitivity controlled across the
whole spectrum, is easy to obtain. When constructing the ROC curve, you
can choose whether or not to restore monotonicity through the argument
mono_resp_method. It can be “none” (no monotonicity restoration) or
“ROC” (curve-based monotonicity restoration).
### ROC with curve-based monotonicity restoration
curveROC <- caROC(diseaseData,controlData,userFormula,
mono_resp_method = "ROC",
verbose = FALSE)
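To see why monotonicity restoration can be needed, note that pointwise estimates of specificity along a grid of sensitivity levels are not guaranteed to be non-increasing. The sketch below applies one simple repair, a running minimum, to made-up toy values. It only illustrates the idea and is not claimed to be the algorithm behind mono_resp_method = "ROC".

```r
# Running-minimum monotonization of a specificity-vs-sensitivity curve (sketch).
# The numbers are toy values invented for illustration, not caROC output.
sens_grid <- seq(0.1, 0.9, by = 0.1)
spec_raw  <- c(0.98, 0.95, 0.96, 0.90, 0.84, 0.85, 0.70, 0.62, 0.48)

# Specificity should not increase as the controlled sensitivity increases;
# cummin() replaces each value by the smallest value seen so far.
spec_mono <- cummin(spec_raw)

data.frame(sensitivity = sens_grid, raw = spec_raw, monotone = spec_mono)
```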
Plot the ROC curves:
Construct a confidence band for the ROC curve:
curveROC_CB <- caROC_CB(diseaseData,controlData,
userFormula,
mono_resp_method = "ROC",
CB_alpha = 0.95,
nbin = 100,verbose = FALSE)
Plot the confidence band:
oldpar <- par(no.readonly = TRUE)
par(mar = c(3, 3, 2, 0.3), mgp = c(1.2, 0.3, 0))
plot_caROC_CB(curveROC_CB, add = FALSE, lty = 2, col = "blue")
Or plot the ROC curve and confidence band on the same plot:
The ROC curve for a given subpopulation can be easily calculated:
target_covariates = c(1, 0.7, 0.9)
myROC <- sscaROC(diseaseData,
controlData,
userFormula,
target_covariates,
global_ROC_controlled_by = "sensitivity",
mono_resp_method = "none")
oldpar <- par(no.readonly = TRUE)
par(mar = c(3, 3, 2, 0.3), mgp = c(1.2, 0.3, 0))
plot_sscaROC(myROC, lwd = 1.6)
A confidence band can also be computed, but this may take roughly 10-20 minutes for a dataset with 2000 samples.
myROCband <- sscaROC_CB(diseaseData,
controlData,
userFormula,
mono_resp_method = "none",
target_covariates,
global_ROC_controlled_by = "sensitivity",
CB_alpha = 0.95,
logit_CB = FALSE,
nbootstrap = 100,
nbin = 100,
verbose = FALSE)
oldpar <- par(no.readonly = TRUE)
par(mar = c(3, 3, 2, 0.3), mgp = c(1.2, 0.3, 0))
plot_sscaROC_CB(myROCband, col = "purple", lty = 2)
par(oldpar)
In clinical settings, it is useful to know the specific biomarker thresholds at controlled sensitivity or specificity levels for given covariate values.
### these are the given covariates of interest
new_covariates <- data.frame(M = 1,
Z1 = 0.7,
Z2 = 0.9)
### controlling sensitivity levels
caThreshold(userFormula, new_covariates,
diseaseData = diseaseData,
controlData = NULL,
control_sensitivity = c(0.7,0.8,0.9),
control_specificity = NULL)
#> control_sens=0.7 control_sens=0.8 control_sens=0.9
#> 1 1.774055 1.430411 1.032393
### controlling specificity levels
caThreshold(userFormula,new_covariates,
diseaseData = NULL,
controlData = controlData,
control_sensitivity = NULL,
control_specificity = c(0.7,0.8,0.9))
#> control_spec=0.7 control_spec=0.8 control_spec=0.9
#> 1 1.617172 1.925758 2.315431
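The idea of a covariate-specific threshold can be illustrated without the package. The sketch below uses a simple stand-in: a linear location model fitted to the controls, plus an empirical quantile of its residuals. This is an assumption made for illustration and is not claimed to be how caThreshold works. The data are simulated inside the block, with coefficients chosen to resemble the control group of the tutorial.

```r
# Covariate-specific thresholds at controlled specificity (illustrative stand-in:
# linear model for the mean plus empirical residual quantiles; not caROC's method).
set.seed(1)
n0 <- 2000
controls <- data.frame(Z1 = rbinom(n0, size = 1, prob = 0.7),
                       Z2 = rnorm(n0, mean = 0.8, sd = 1))
controls$M <- 0.1 + 0.1 * controls$Z1 + controls$Z2 + rnorm(n0)

# Model the marker mean in controls as a function of the covariates
fit <- lm(M ~ Z1 + Z2, data = controls)

new_cov     <- data.frame(Z1 = 0.7, Z2 = 0.9)
spec_levels <- c(0.7, 0.8, 0.9)

# Threshold = predicted mean at the covariates + residual quantile at each level,
# so that about `spec` of comparable controls fall at or below the threshold
thresholds <- as.numeric(predict(fit, newdata = new_cov)) +
  quantile(resid(fit), probs = spec_levels, names = FALSE)
names(thresholds) <- paste0("control_spec=", spec_levels)
thresholds
```

A higher controlled specificity pushes the threshold upward, the same pattern as in the caThreshold output above.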