Type: | Package |
Title: | The GiViTI Calibration Test and Belt |
Version: | 1.3 |
Date: | 2017-01-19 |
Description: | Functions to assess the calibration of logistic regression models with the GiViTI (Gruppo Italiano per la Valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units - see http://www.giviti.marionegri.it/) approach. The approach consists of a graphical tool, namely the GiViTI calibration belt, and the associated statistical test. These tools can be used both to evaluate the internal calibration (i.e. the goodness of fit) and to assess the validity of an externally developed model. |
License: | GPL-3 |
LazyData: | TRUE |
Imports: | alabama, rootSolve, grDevices, graphics, stats |
Suggests: | testthat, knitr, rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2017-01-24 02:38:19 UTC; Giovanni |
Author: | Giovanni Nattino [cre, aut], Stefano Finazzi [aut], Guido Bertolini [aut], Carlotta Rossi [aut], Greta Carrara [aut] |
Maintainer: | Giovanni Nattino <giovanni.nattino@marionegri.it> |
Repository: | CRAN |
Date/Publication: | 2017-01-24 08:12:25 |
givitiR: assessing the calibration of binary outcome models with the GiViTI calibration belt.
Description
The package 'givitiR' provides the functions to plot the GiViTI calibration belt and to compute the associated statistical test.
Details
The approach takes its name from GiViTI (Gruppo Italiano per la Valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units), an international network of intensive care units (ICUs) established in Italy in 1992. The group counts more than 400 ICUs from 7 countries, with about half of the participating centers continuously collecting data on admitted patients through the PROSAFE project (PROmoting patient SAFEty and quality improvement in critical care). For further information, see the package vignette and the references therein.
The GiViTI calibration belt has been developed within the methodological research promoted by the GiViTI network, with two purposes: a) enhancing the quality of the logistic regression models built in the group's projects and b) providing the participating ICUs with detailed feedback about their quality of care. A description of the approach and examples of applications are reported in the package vignette.
The main functions of the package are listed below.
Fitting the calibration belt
givitiCalibrationBelt
implements the computations necessary
to plot the calibration belt.
Plotting the calibration belt
plot.givitiCalibrationBelt
plots the calibration belt.
Computing the calibration test
givitiCalibrationTest
performs the calibration test associated with the
calibration belt.
Calibration Belt Significant Deviations
Description
calibrationBeltIntersections
returns the
intervals where the calibration belt significantly deviates
from the bisector.
Usage
calibrationBeltIntersections(cbBound, seqP, minMax)
Arguments
cbBound |
A |
seqP |
The vector of the probabilities where the points of the calibration belt have been evaluated. |
minMax |
A list with two elements, named |
Value
A list with two components, overBisector
and underBisector
.
Each component is a list containing all the intervals where the calibration
belt is significantly over/under the bisector.
See Also
givitiCalibrationBelt
and plot.givitiCalibrationBelt
to compute and plot the calibration belt, and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
# Simulate predictions and outcomes from a model that is miscalibrated by construction
e <- runif(1000)
logite <- logit(e)
eMod <- logistic(logite + logite^2)
o <- rbinom(1000, size = 1, prob = eMod)
data <- data.frame(e = e, o = o, logite = logite)
# Grid of probabilities (and their logits) where the belt is evaluated
seqP <- seq(from = .01, to = .99, by = .01)
seqG <- logit(seqP)
minMax <- list(min = min(e), max = max(e))
# Forward selection, then boundaries of the 90% calibration belt
fwLR <- polynomialLogRegrFw(data, .95, 4, 1)
cbBound <- calibrationBeltPoints(data, seqG, fwLR$m, fwLR$fit, .95, .90, "external")
calibrationBeltIntersections(cbBound, seqP, minMax)
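The idea of "intervals where the belt deviates from the bisector" can be illustrated with a small base-R sketch. The helper `deviation_intervals_sketch` below is hypothetical and is not the package's implementation: it only scans the evaluation grid, so interval endpoints are resolved to the grid spacing, whereas the package may locate the crossings differently. The belt is over the bisector where its lower boundary exceeds the predicted probability, and under it where its upper boundary falls below.

```r
# Hypothetical helper (not part of givitiR): grid-based search for the
# intervals where a belt lies entirely above or below the bisector.
deviation_intervals_sketch <- function(seqP, lower, upper) {
  runs_to_intervals <- function(flag) {
    r <- rle(flag)                      # consecutive runs of TRUE/FALSE
    ends <- cumsum(r$lengths)
    starts <- ends - r$lengths + 1
    keep <- r$values                    # keep only the TRUE runs
    Map(function(s, e) c(seqP[s], seqP[e]), starts[keep], ends[keep])
  }
  list(
    overBisector  = runs_to_intervals(lower > seqP),
    underBisector = runs_to_intervals(upper < seqP)
  )
}

# Toy belt: contains the bisector up to 0.6, lies above it from 0.7 to 0.9
seqP  <- seq(0.1, 0.9, by = 0.1)
lower <- seqP - 0.05
upper <- seqP + 0.05
lower[7:9] <- seqP[7:9] + 0.02
upper[7:9] <- seqP[7:9] + 0.10
res <- deviation_intervals_sketch(seqP, lower, upper)
str(res)
```

In this toy input the only deviation is a single interval above the bisector, and no interval lies below it.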
Calibration Belt Confidence Region
Description
calibrationBeltPoints
computes the points defining the boundary
of the confidence region.
Usage
calibrationBeltPoints(data, seqG, m, fit, thres, cLevel, devel)
Arguments
data |
A |
seqG |
A vector containing the logit of the probabilities where the points of the calibration belt will be evaluated. |
m |
A scalar integer representing the degree of the polynomial at the end of the forward selection. |
fit |
An object of class |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. |
cLevel |
A numeric scalar between 0 and 1 representing the confidence level that will be used for the confidence region. |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
Value
A data.frame
object with two columns, "U" and "L", containing
the points of the upper and lower boundary of the cLevel
*100%-level calibration belt evaluated
at values seqG
.
See Also
givitiCalibrationBelt
and plot.givitiCalibrationBelt
to compute and plot the calibration belt, and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
e <- runif(100)
logite <- logit(e)
o <- rbinom(100, size = 1, prob = e)
data <- data.frame(e = e, o = o, logite = logite)
seqG <- logit(seq(from = .01, to =.99, by = .01))
fwLR <- polynomialLogRegrFw(data, .95, 4, 1)
calibrationBeltPoints(data, seqG, fwLR$m, fwLR$fit, .95, .90, "external")
Calibration Belt
Description
givitiCalibrationBelt
implements the computations necessary
to plot the calibration belt.
Usage
givitiCalibrationBelt(o, e, devel, subset = NULL, confLevels = c(0.8, 0.95),
thres = 0.95, maxDeg = 4, nPoints = 200)
Arguments
o |
A numeric vector representing the binary outcomes.
The elements must assume only the values 0 or 1. The predictions
in |
e |
A numeric vector containing the predictions of the
model under evaluation. The elements must be numeric and between 0 and 1.
The length of the vector must be equal to the length of the vector |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
subset |
An optional boolean vector specifying the subset of observations to be considered. |
confLevels |
A numeric vector containing the confidence levels of the calibration belt. The default values are set to .80 and .95. |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. The default value is 0.95. |
maxDeg |
The maximum degree considered in the forward selection. The default value is 4. |
nPoints |
A numeric scalar indicating the number of points to be considered to plot the calibration belt. The default value is 200. |
Details
The calibration belt and the associated test can be used to evaluate the calibration of a model both in external samples and in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model is fitted with logistic regression.
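The distinction between the two settings can be sketched as follows. The simulated data, the variable names and the model are illustrative assumptions, not taken from the package; the calls to `givitiCalibrationBelt` run only if 'givitiR' is installed. The development-set predictions come from a logistic regression fit, as the internal case requires.

```r
set.seed(1)
n <- 500

# Development sample: a logistic regression model fitted on these data
x <- rnorm(n)
o <- rbinom(n, size = 1, prob = plogis(-0.5 + x))
dev_fit <- glm(o ~ x, family = binomial)
e_internal <- fitted(dev_fit)            # predictions on the development set

# Independent sample: the same model applied to new observations
x_new <- rnorm(n)
o_new <- rbinom(n, size = 1, prob = plogis(-0.5 + x_new))
e_external <- predict(dev_fit, newdata = data.frame(x = x_new),
                      type = "response")

if (requireNamespace("givitiR", quietly = TRUE)) {
  # Same dataset used for fitting and evaluation
  cb_int <- givitiR::givitiCalibrationBelt(o, e_internal, devel = "internal")
  # Model evaluated on data not used for fitting
  cb_ext <- givitiR::givitiCalibrationBelt(o_new, e_external, devel = "external")
}
```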
Value
An object of class givitiCalibrationBelt
.
After computing the calibration belt with the present function,
the plot
method can be used to plot
the calibration belt. The object returned is a list that contains the
following components:
- n
The size of the sample evaluated in the analysis, after discarding missing values from the vectors o and e.
- resultCheck
Result of the check on the data. If the data are compatible with the construction of the calibration belt, the value is the boolean TRUE. Otherwise, the element contains a character string describing the problem found.
- m
The degree of the polynomial at the end of the forward selection.
- statistic
The value of the test's statistic.
- p.value
The p-value of the test.
- seqP
The vector of the probabilities where the points of the calibration belt have been evaluated.
- minMax
A list with two elements named min and max representing the minimum and maximum probabilities in the model under evaluation.
- confLevels
The vector containing the confidence levels of the calibration belt.
- intersByConfLevel
A list whose elements report the intervals where the calibration belt is significantly over/under the bisector for each confidence level in confLevels.
See Also
plot.givitiCalibrationBelt
to plot the calibration belt and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
#Random by-construction well calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)
#Random by-construction poorly calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)
Table of the Calibration Belt Significant Deviations
Description
givitiCalibrationBeltTable
prints on the graphical area of the calibration
belt plot the table that summarizes the significant deviations from the
line of perfect calibration (i.e. the bisector of the first quadrant).
Usage
givitiCalibrationBeltTable(cb, tableStrings, grayLevels, xlim, ylim)
Arguments
cb |
A |
tableStrings |
Optional. A list with four character elements named
|
grayLevels |
A vector containing the code of the gray levels used in the plot of the calibration belt. |
xlim , ylim |
Numeric vectors of length 2, giving the
x and y coordinates ranges. Default values are |
Value
The function prints the table on the graphical area.
Calibration Test
Description
givitiCalibrationTest
performs the calibration test associated with the
calibration belt.
Usage
givitiCalibrationTest(o, e, devel, subset = NULL, thres = 0.95,
maxDeg = 4)
Arguments
o |
A numeric vector representing the binary outcomes.
The elements must assume only the values 0 or 1. The predictions
in |
e |
A numeric vector containing the probabilities of the
model under evaluation. The elements must be numeric and between 0 and 1.
The length of the vector must be equal to the length of the vector |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
subset |
An optional boolean vector specifying the subset of observations to be considered. |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. The default value is 0.95. |
maxDeg |
The maximum degree considered in the forward selection. The default value is 4. |
Details
The calibration belt and the associated test can be used to evaluate the calibration of a model both in external samples and in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model is fitted with logistic regression.
Value
A list of class htest
containing the following components:
- statistic
The value of the test's statistic.
- p.value
The p-value of the test.
- null.value
The vector of coefficients hypothesized under the null hypothesis, that is, the parameters corresponding to the bisector.
- alternative
A character string describing the alternative hypothesis.
- method
A character string indicating what type of calibration test (internal or external) was performed.
- estimate
The estimate of the coefficients of the polynomial logistic regression.
- data.name
A character string giving the name(s) of the data.
See Also
givitiCalibrationBelt
and plot.givitiCalibrationBelt
to compute and plot the calibration belt.
Examples
#Random by-construction well calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
givitiCalibrationTest(o, e, "external")
#Random by-construction poorly calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
givitiCalibrationTest(o, e, "external")
Computation of the Calibration Test
Description
givitiCalibrationTestComp
implements the computations necessary to
perform the calibration test associated with the calibration belt.
Usage
givitiCalibrationTestComp(o, e, devel, thres, maxDeg)
Arguments
o |
A numeric vector representing the binary outcomes.
The elements must assume only the values 0 or 1. The predictions
in |
e |
A numeric vector containing the probabilities of the
model under evaluation. The elements must be numeric and between 0 and 1.
The length of the vector must be equal to the length of the vector |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. |
maxDeg |
The maximum degree considered in the forward selection. |
Details
The calibration belt and the associated test can be used to evaluate the calibration of a model both in external samples and in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model is fitted with logistic regression.
Value
A list containing the following components:
- data
A data.frame object with the numeric variables "o", "e" provided in the input and the variable "logite", the logit of the probabilities.
- nrowOrigData
The size of the original sample, i.e. the length of the vectors e and o.
- calibrationStat
The value of the test's statistic.
- calibrationP
The p-value of the test.
- m
The degree of the polynomial at the end of the forward selection.
- fit
An object of class glm containing the output of the fit of the logistic regression model at the end of the iterative forward selection.
See Also
givitiCalibrationBelt
and plot.givitiCalibrationBelt
to compute and plot the calibration belt, and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
givitiCalibrationTestComp(o, e, "external", .95, 4)
Check of the arguments' values
Description
Checks the consistency of the values passed to the functions
givitiCalibrationTest
and givitiCalibrationBelt
.
Usage
givitiCheckArgs(o, e, devel, thres, maxDeg)
Arguments
o |
A numeric vector representing the binary outcomes.
The elements must assume only the values 0 or 1. The predictions
in |
e |
A numeric vector containing the probabilities of the
model under evaluation. The elements must be numeric and between 0 and 1.
The length of the vector must be equal to the length of the vector |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. |
maxDeg |
The maximum degree considered in the forward selection. |
Value
The function produces an error if the values provided through the arguments do not meet the constraints reported.
Check of data
Description
The function verifies that the data are compatible with the construction of the calibration belt. In particular, the function checks that the predictions provided do not completely separate the outcomes and that at least two events and two non-events are present in the data.
Usage
givitiCheckData(o, e)
Arguments
o |
A numeric vector representing the binary outcomes.
The elements must assume only the values 0 or 1. The predictions
in |
e |
A numeric vector containing the probabilities of the
model under evaluation. The elements must be numeric and between 0 and 1.
The length of the vector must be equal to the length of the vector |
Value
The output is TRUE
if the data do not show any of the
reported problems. Otherwise, the function returns a string describing the
problem found.
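The two conditions described above can be illustrated with a minimal base-R sketch. The helper `check_data_sketch` and its messages are hypothetical: they mirror the documented behaviour (TRUE, or a string describing the problem) but are not the package's code or wording. Complete separation is taken here to mean that every prediction for the events lies strictly above, or strictly below, every prediction for the non-events.

```r
# Hypothetical helper (not givitiCheckData itself): returns TRUE when the
# data pass both checks, otherwise a string describing the problem.
check_data_sketch <- function(o, e) {
  if (sum(o == 1) < 2 || sum(o == 0) < 2) {
    return("fewer than two events or fewer than two non-events")
  }
  e_events <- e[o == 1]
  e_non_events <- e[o == 0]
  # Complete separation: the two groups of predictions do not overlap
  separated <- min(e_events) > max(e_non_events) ||
    max(e_events) < min(e_non_events)
  if (separated) {
    return("the predictions completely separate the outcomes")
  }
  TRUE
}

ok        <- check_data_sketch(o = c(0, 1, 0, 1, 1), e = c(0.2, 0.4, 0.6, 0.7, 0.9))
separated <- check_data_sketch(o = c(0, 0, 1, 1), e = c(0.1, 0.2, 0.8, 0.9))
too_few   <- check_data_sketch(o = c(0, 0, 0, 1), e = c(0.1, 0.5, 0.3, 0.4))
```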
CDF of the Calibration Statistic Under the Null Hypothesis
Description
givitiStatCdf
returns the cumulative distribution function of the
calibration statistic under the null hypothesis.
Usage
givitiStatCdf(t, m, devel, thres)
Arguments
t |
The argument of the CDF. Must be a scalar value. |
m |
The scalar integer representing the degree of the polynomial at the end of the forward selection. |
devel |
A character string specifying if the model has been fit on
the same dataset under evaluation ( |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. |
Value
A number representing the value of the CDF evaluated at t.
See Also
givitiCalibrationBelt
and plot.givitiCalibrationBelt
to compute and plot the calibration belt, and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
givitiStatCdf(3, 1, "external", .95)
givitiStatCdf(3, 2, "internal", .95)
Information of SAPS II score and outcome of 1,000 ICU patients.
Description
A dataset containing clinical information on 1,000 patients admitted to Italian Intensive Care Units joining the GiViTI network (Gruppo Italiano per la valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units). The data were collected within the ProSAFE project, an Italian observational study based on a continuous collection of clinical data in more than 200 Italian ICUs. The purpose of the project is a continuous surveillance of the quality of care provided in the participating centres. The actual values of the variables have been modified to protect subject confidentiality.
Usage
icuData
Format
A data frame with 1000 rows and 33 variables. The dataset contains, for each predictor of the SAPSII score, both the clinical information and the weight of that variable in the score (the variable with the suffix '_NUM').
- outcome
hospital outcome, numeric binary variable with values 1 (deceased) and 0 (alive).
- probSaps
probability estimated by the SAPSII prognostic model.
- sapsScore
SAPSII score.
- age,age_NUM
age, factor variable with levels (in years): '<40', '40-59', '60-69', '70-74', '75-80', '>=80'.
- adm,adm_NUM
type of admission, factor variable with 3 levels: 'unschSurg' (unscheduled surgery), 'med' (medical), 'schSurg' (scheduled surgery).
- chronic,chronic_NUM
chronic diseases, factor variable with 4 levels: 'noChronDis' (no chronic disease), 'metCarc' (metastatic carcinoma), 'hemMalig' (hematologic malignancy), 'aids' (AIDS).
- gcs,gcs_NUM
Glasgow Coma Scale, factor variable with 5 levels: '3-5', '6-8', '9-10', '11-13', '14-15'.
- BP,BP_NUM
systolic blood pressure, factor variable with 4 levels (in mmHg): '<70', '70-99', '100-199', '>=200'.
- HR,HR_NUM
heart rate, factor variable with 5 levels: '<40', '40-69', '70-119', '120-159', '>=160'.
- temp,temp_NUM
temperature, factor variable with 2 levels (in degrees Celsius): '<39', '>=39'.
- urine,urine_NUM
urine output, factor variable with 3 levels (in L/24h): '<0.5', '0.5-0.99', '>=1'.
- urea,urea_NUM
serum urea, factor variable with 3 levels (in g/L): '<0.60', '0.60-1.79', '>=1.80'.
- WBC,WBC_NUM
white blood cell count, factor variable with 3 levels (in 1/mm3): '<1', '1-19', '>=20'.
- potassium,potassium_NUM
potassium, factor variable with 3 levels (in mEq/L): '<3', '3-4.9', '>=5'.
- sodium,sodium_NUM
sodium, factor variable with 3 levels (in mEq/L): '<125', '125-144', '>=145'.
- HCO3,HCO3_NUM
HCO3, factor variable with 3 levels (in mEq/L): '<15', '15-19', '>=20'.
- bili,bili_NUM
bilirubin, factor variable with 3 levels (in mg/dL): '<4', '4-5.9', '>=6'.
- paFiIfVent,paFiIfVent_NUM
mechanical ventilation and CPAP PaO2/FIO2, factor variable with 4 levels (PaO2/FIO2 in mmHg): 'noVent' (not ventilated), 'vent_<100' (ventilated and PaO2/FIO2 <100), 'vent_100-199' (ventilated and PaO2/FIO2 in 100-199), 'vent_>=200' (ventilated and PaO2/FIO2 >= 200).
Details
The data contain the information needed to apply the SAPSII model, a prognostic model developed to predict hospital mortality (Le Gall et al., 1993). Both the computed SAPSII score and the associated probability of death are variables of the dataset. The score is an integer ranging from 0 to 163 that describes the severity of the patient's condition (the higher the score, the more severe the condition). The probability is computed from the score through the formula reported in the original paper. The dataset also contains the hospital survival of the patients.
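As an illustration of the score-to-probability conversion, the sketch below uses the coefficients commonly quoted from Le Gall et al. (1993): logit = -7.7631 + 0.0737 * score + 0.9971 * ln(score + 1). These coefficients are not given in this documentation and are stated here as an assumption to be checked against the original paper; the function name `saps2_probability_sketch` is hypothetical.

```r
# Hypothetical helper: SAPS II score -> predicted probability of hospital death.
# Coefficients are assumed from Le Gall et al. (1993); verify before use.
saps2_probability_sketch <- function(score) {
  logit <- -7.7631 + 0.0737 * score + 0.9971 * log(score + 1)
  plogis(logit)                          # logistic transformation
}

scores <- c(0, 20, 40, 60, 80)
probs <- saps2_probability_sketch(scores)
round(probs, 3)
```

If 'givitiR' is installed, the result of applying this function to icuData$sapsScore could be compared with icuData$probSaps.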
Source
http://www.giviti.marionegri.it/Default.asp (in Italian only)
References
Le Gall, Jean-Roger, Stanley Lemeshow, and Fabienne Saulnier. "A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study." Jama 270, no. 24 (1993): 2957-2963.
The GiViTI Network, Prosafe Project - 2014 report. Sestante Edizioni: Bergamo, 2015. http://www.giviti.marionegri.it/Download/ReportPROSAFE_2014_EN_Polivalenti_ITALIA.pdf.
Logit and logistic functions
Description
logit
and logistic
implement the logit and logistic transformations, respectively.
Usage
logit(p)
logistic(x)
Arguments
p |
A numeric vector whose components are numbers between 0 and 1. |
x |
A numeric vector. |
Value
The functions apply the logit and logistic transformation to each element of the vector passed as argument. In particular, logit(p)=ln(p/(1-p)) and logistic(x)=exp(x)/(1+exp(x)).
Examples
logit(0.1)
logit(0.5)
logistic(0)
logistic(logit(0.25))
logit(logistic(2))
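The two formulas in the Value section can be written directly in base R, which also makes it easy to confirm that they are inverses of each other. The names `logit_sketch` and `logistic_sketch` are hypothetical stand-ins that do not require the package; `qlogis` and `plogis` from 'stats' compute the same transformations.

```r
# Stand-alone versions of the two formulas (not the package's functions)
logit_sketch <- function(p) log(p / (1 - p))
logistic_sketch <- function(x) exp(x) / (1 + exp(x))

p <- c(0.1, 0.25, 0.5, 0.9)
all.equal(logit_sketch(p), qlogis(p))            # same as stats::qlogis
all.equal(logistic_sketch(logit_sketch(p)), p)   # logistic inverts logit
```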
Calibration Belt Plot
Description
The plot
method for calibration belt objects.
Usage
## S3 method for class 'givitiCalibrationBelt'
plot(x, xlim = c(0, 1), ylim = c(0, 1),
colBis = "red", xlab = "e", ylab = "o",
main = "GiViTI Calibration Belt", polynomialString = T,
pvalueString = T, nString = T, table = T, tableStrings = NULL,
unableToFitString = NULL, ...)
Arguments
x |
A |
xlim , ylim |
Numeric vectors of length 2, giving the
x and y coordinates ranges. Default values are |
colBis |
The color to be used for the bisector. The default value is red. |
xlab , ylab |
Titles for the x and y axes. Default values are "e" and "o", respectively. |
main |
The main title of the plot. The default value is "GiViTI Calibration Belt". |
polynomialString |
If the value is FALSE, the degree of the polynomial is not printed on the graphical area. If the value is TRUE, the degree m is reported. If a string is passed to this argument, the string is reported instead of the text "Polynomial degree". The default value is TRUE. |
pvalueString |
If the value is FALSE, the p-value of the test is not printed on the graphical area. If the value is TRUE, the p-value is reported. If a string is passed to this argument, the string is reported instead of the text "p-value". The default value is TRUE. |
nString |
If the value is FALSE, the sample size is not printed on the graphical area. If the value is TRUE, the sample size is reported. If a string is passed to this argument, the string is reported instead of the text "n". The default value is TRUE. |
table |
A boolean value indicating whether the table reporting the intersections of the calibration belt with the bisector should be printed on the plot. |
tableStrings |
Optional. A list with four character elements named
|
unableToFitString |
Optional. If a string is passed to this argument, this string is reported in the plot area when the dataset is not compatible with the fit of the calibration belt (e.g. data separation or no positive events). By default, in such cases the text "Unable to fit the Calibration Belt" is reported. |
... |
Other graphical parameters passed to the generic |
Value
The function generates the calibration belt plot. In addition, a list containing the following components is returned:
- p.value
The p-value of the test.
- m
The degree of the polynomial at the end of the forward selection.
See Also
givitiCalibrationBelt
to compute the calibration belt and
givitiCalibrationTest
to perform the
associated calibration test.
Examples
#Random by-construction well calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)
#Random by-construction poorly calibrated model
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)
Forward Selection in Polynomial Logistic Regression
Description
polynomialLogRegrFw
implements a forward selection in a
polynomial logistic regression model.
Usage
polynomialLogRegrFw(data, thres, maxDeg, startDeg)
Arguments
data |
A |
thres |
A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. |
maxDeg |
The maximum degree considered in the forward selection. |
startDeg |
The starting degree in the forward selection. |
Value
A list containing the following components:
- fit
An object of class glm containing the output of the fit of the logistic regression model at the end of the iterative forward selection.
- m
The degree of the polynomial at the end of the forward selection.
Examples
e <- runif(100)
logite <- logit(e)
o <- rbinom(100, size = 1, prob = e)
data <- data.frame(e = e, o = o, logite = logite)
polynomialLogRegrFw(data, .95, 4, 1)
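A forward selection of this kind can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper `forward_poly_sketch` is hypothetical, and it assumes that each additional power of the logit is retained when a likelihood-ratio test with one degree of freedom is significant at level 1 - thres.

```r
# Hypothetical sketch of forward selection on the degree of a polynomial
# logistic regression of the outcome o on logit(e).
forward_poly_sketch <- function(data, thres = 0.95, maxDeg = 4, startDeg = 1) {
  fit_degree <- function(m) {
    glm(o ~ poly(logite, degree = m, raw = TRUE),
        family = binomial, data = data)
  }
  m <- startDeg
  fit <- fit_degree(m)
  while (m < maxDeg) {
    fit_next <- fit_degree(m + 1)
    # Likelihood-ratio statistic for adding the term of degree m + 1
    lr_stat <- fit$deviance - fit_next$deviance
    if (lr_stat <= qchisq(thres, df = 1)) break   # not significant: stop
    m <- m + 1
    fit <- fit_next
  }
  list(fit = fit, m = m)
}

set.seed(42)
e <- runif(200)
o <- rbinom(200, size = 1, prob = e)
data <- data.frame(e = e, o = o, logite = qlogis(e))
res <- forward_poly_sketch(data, thres = 0.95, maxDeg = 4, startDeg = 1)
res$m
```

The returned list mirrors the documented structure, with the final glm fit in fit and the selected degree in m.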