[R] Dx accuracy measures from raw data

Sarah Goslee sarah.goslee at gmail.com
Wed Jul 23 17:05:31 CEST 2014


I'm pretty sure you posted this exact question already. Usually if you
don't get any replies, it means your question is badly formed. Posting
the same thing again won't help. Posting a revised version might.

Your example data is not really helpful to someone trying to suggest R
code, since it's all Yes or 1 values. Using dput() is also a LOT more
useful than trying to copy and paste (especially since you posted HTML
to the list, which tends to get mangled).
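
As a rough illustration of the dput() suggestion, here is a minimal
sketch. The data frame below is a hypothetical toy example with invented
values (chosen so that both outcomes occur); it is not the poster's data,
and the column names simply follow the layout described in the question.

```r
# Hypothetical toy data; values are invented for illustration only.
dat <- data.frame(
  g.s   = factor(c("Yes", "Yes", "No", "No", "Yes", "No")),
  t1    = c(1, 1, 0, 1, 0, 0),
  t2    = c(1, 0, 0, 0, 1, 0),
  index = 1:6
)

# dput() prints R code that rebuilds the object, including column types
# and factor levels, so it survives plain-text email intact.
dput(dat)

# For a large data frame, a small representative subset is usually enough.
dput(head(dat, 4))
```

A reader can paste the printed structure(...) text straight into an R
session to get the same object back, which copied-and-pasted console
output does not allow.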

Have you looked into any of the many R packages that calculate the
statistics you're interested in, to learn how they expect their data
to be formatted? That would be much more useful than you trying to
reinvent the wheel.


is a good place to search for R-related information, including
packages and functions to do particular things. A search there lists many
options for doing what you want. Your first step should probably be to
investigate them.

Once you've done that, a clear R question posted to the list is far
more likely to receive useful answers. Here's another link that might
be of use to you:


Also, please keep in mind that this list is very general, and most of
us do not share your subject domain knowledge. The more explicit you
can be about what you want and how, the more useful and abundant the
replies are likely to be.
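
To make the kind of explicit, reproducible question meant here more
concrete, the following is one possible base-R sketch of the calculation
described in the quoted message below. It is only an illustration under
stated assumptions: the data are invented, the helper names prop_ci() and
dx_accuracy() are hypothetical, the gold standard is assumed to be coded
"Yes"/"No" and the tests 1/0, and the intervals are exact
(Clopper-Pearson) ones from binom.test(), which may not be the interval
type wanted. The existing packages may well do this better.

```r
# Hypothetical data: gold standard plus two test thresholds (values invented).
dat <- data.frame(
  g.s = c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"),
  t1  = c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0),
  t2  = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

# Proportion x/n with an exact 95% CI from binom.test().
# Note: binom.test() stops with an error if n is 0.
prop_ci <- function(x, n) {
  bt <- binom.test(x, n)
  c(est = x / n, lower = bt$conf.int[1], upper = bt$conf.int[2])
}

# Sensitivity, specificity, PPV and NPV for one test column
# against the gold standard (assumed coding: "Yes"/"No" and 1/0).
dx_accuracy <- function(test, gold) {
  pos <- gold == "Yes"
  tp <- sum(test == 1 & pos)
  fn <- sum(test == 0 & pos)
  fp <- sum(test == 1 & !pos)
  tn <- sum(test == 0 & !pos)
  rbind(
    sensitivity = prop_ci(tp, tp + fn),
    specificity = prop_ci(tn, tn + fp),
    PPV         = prop_ci(tp, tp + fp),
    NPV         = prop_ci(tn, tn + fn)
  )
}

# Apply the same calculation to every test column in one call.
test_cols <- c("t1", "t2")
results <- lapply(dat[test_cols], dx_accuracy, gold = dat$g.s)
results$t1
```

Each element of results is a 4 x 3 matrix (estimate, lower, upper) for
one threshold, so no separate 2 x 2 table has to be built by hand.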


On Wed, Jul 23, 2014 at 10:47 AM, Anoop Shah <anoopsshah at gmail.com> wrote:
> Hello R users!
> I am a medic and have been working with R for about 6 months now.
> I was hoping to pick someone’s brain about a diagnostic accuracy study that has now been completed.
> I am trying to derive the sensitivity, specificity, NPV and PPV with the corresponding 95% CI from the raw data.
> My data is in a data frame as below
> g.s     t1      t2      t3      t3      t4      t5      index
> Yes     1       1       1       1       1       1       1
> Yes     1       1       1       1       1       1       2
> Yes     1       1       1       1       1       1       3
> Yes     1       1       1       1       1       1       4
> Yes     1       1       1       1       1       1       5
> Each row represents a patient with a unique id (variable: index).
> g.s is a binary variable and represents the results from the gold standard (yes / no).
> t1 to t5 are the tests at different thresholds being tested.
> t1 to t5 are all binary variables with 1 as yes and 0 as no.
> Now I could create separate 2 x 2 tables for each threshold (t1 to t5) against the gold standard and subsequently derive sens, spec, NPV and PPV plus their 95% CI for each threshold (t1 to t5).
> I was however wondering if there was a more efficient way to get these results from the raw data in R.
> Hope I have explained myself clearly and thanks a lot in advance!!
> Cheers
> Anoop
> Dr Anoop Shah
> Cardiology Research fellow
> Centre of Cardiovascular sciences
> Chancellors Building
> Room SU 305
> University Of Edinburgh
> Little France
> Edinburgh
> EH16 4SB
> Tel: +447766544156

Sarah Goslee
