[R] Locate Patients who have multiple high blood pressure readings

William Dunlap wdunlap at tibco.com
Thu Jan 31 19:24:41 CET 2013


> dd <-  # from dput() 
structure(list(ColA = c(92L, 94L, 174L, 176L, 180L, 268L, 268L
), PT_ID = c(1900L, 1900L, 2900L, 2900L, 3900L, 3900L, 3900L),
    Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90), OBS_TYPE = structure(c(1L,
    1L, 2L, 2L, 2L, 2L, 1L), .Label = c("DBP", "SBP"), class = "factor")), .Names = c("ColA",
"PT_ID", "Blood_Pressure", "OBS_TYPE"), class = "data.frame", row.names = c(NA,
-7L))
> library(plyr)
> ddply(dd, .(PT_ID), summarize, Include=sum(OBS_TYPE=="DBP" & Blood_Pressure>=90)>=2 || sum(OBS_TYPE=="SBP" & Blood_Pressure>=140)>=2)
  PT_ID Include
1  1900    TRUE
2  2900   FALSE
3  3900   FALSE

sum(logicalVector) counts the TRUE values in logicalVector, so each
sum(...) >= 2 test above checks for two or more high readings of that type.
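
As a cross-check, the same rule can be sketched in base R without plyr. This is only an illustration and not part of the ddply answer above: the helper name has_two_high and the split()/sapply() approach are assumptions made for this example, and dd is rebuilt so the block runs on its own.

```r
# Rebuild the example data (same values as the dput() output above).
dd <- data.frame(
  ColA = c(92L, 94L, 174L, 176L, 180L, 268L, 268L),
  PT_ID = c(1900L, 1900L, 2900L, 2900L, 3900L, 3900L, 3900L),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE = factor(c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"))
)

# sum() on a logical vector counts its TRUE values.
n_true <- sum(c(TRUE, FALSE, TRUE))

# Hypothetical helper: TRUE if one patient's rows contain at least two
# high DBP readings or at least two high SBP readings.
has_two_high <- function(d) {
  sum(d$OBS_TYPE == "DBP" & d$Blood_Pressure >= 90) >= 2 ||
    sum(d$OBS_TYPE == "SBP" & d$Blood_Pressure >= 140) >= 2
}

# Apply the helper to each patient's rows; the result is a logical
# vector named by PT_ID.
include <- sapply(split(dd, dd$PT_ID), has_two_high)
print(include)
```

The two counts are kept separate on purpose: a patient with one high SBP and one high DBP reading (PT_ID 3900 here) is not flagged, which matches the rule as stated in the question.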

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Bert Gunter
> Sent: Thursday, January 31, 2013 9:52 AM
> To: Weijia Wang
> Cc: r-help at r-project.org
> Subject: Re: [R] Locate Patients who have multiple high blood pressure readings
> 
> Well, since no one has responded....
> 
> 
> Please use ?dput to provide data in your posts.
> 
> There are likely zillions of ways to go about this. Following is one
> way based on ?duplicated that I think works, but I make no claims for
> either elegance or efficiency. Others may do lots better. But maybe it
> suffices.
> 
> 
> ## Untested
> ## I assume the data is provided in a data frame named dd.
> 
> ## All PT_ID's with >=1 high readings in SBP or in DBP
> > hiS <- with(dd,PT_ID[OBS_TYPE == "SBP" & Blood_Pressure >= 140])
> > hiD <- with(dd,PT_ID[OBS_TYPE == "DBP" & Blood_Pressure >= 90])
> 
> ## id's that appear more than once in either
> > union(unique(hiS[duplicated(hiS)]), unique(hiD[duplicated(hiD)]))
> 
> ## you can subset your data frame to match just these, e.g. via
> ## %in%, if you like.
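
One way the %in% step could look, as a rough sketch: keep_ids and dd_high are illustrative names that do not appear in the thread, and dd is rebuilt from the posted values so the block runs on its own.

```r
# Rebuild the example data from the values posted in the thread.
dd <- data.frame(
  ColA = c(92L, 94L, 174L, 176L, 180L, 268L, 268L),
  PT_ID = c(1900L, 1900L, 2900L, 2900L, 3900L, 3900L, 3900L),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE = factor(c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"))
)

# PT_IDs with at least one high reading, one entry per high reading.
hiS <- with(dd, PT_ID[OBS_TYPE == "SBP" & Blood_Pressure >= 140])
hiD <- with(dd, PT_ID[OBS_TYPE == "DBP" & Blood_Pressure >= 90])

# An ID that is duplicated within hiS (or within hiD) has two or more
# high readings of that type.
keep_ids <- union(unique(hiS[duplicated(hiS)]), unique(hiD[duplicated(hiD)]))

# Keep every row belonging to a flagged patient.
dd_high <- dd[dd$PT_ID %in% keep_ids, ]
print(dd_high)
```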
> 
> 
> Cheers,
> Bert
> 
> 
> 
> 
> On Thu, Jan 31, 2013 at 7:51 AM, Weijia Wang <wwang.nyu at gmail.com> wrote:
> > On Thu, Jan 31, 2013 at 10:29 AM, Weijia Wang <wwang.nyu at gmail.com> wrote:
> >
> >> Hi,
> >>
> >>
> >>
> >> I have a new question about subsetting in R.
> >>
> >>
> >>
> >> Say we have this data frame:
> >>
> >>
> >>
> >>     PT_ID Blood_Pressure OBS_TYPE
> >> 92   1900           90.0      DBP
> >> 94   1900           90.0      DBP
> >> 174  2900          140.0      SBP
> >> 176  2900          130.0      SBP
> >> 180  3900          120.0      SBP
> >> 268  3900          150.0      SBP
> >> 268  3900           90.0      DBP
> >>
> >>
> >>
> >> I need to obtain those with 2+ DBP>=90 or 2+ SBP>=140.
> >>
> >>
> >>
> >> PT_ID=1900, he has 2 DBP>=90, so he will be included.
> >>
> >> PT_ID=2900, he has 1 SBP>=140, so he will NOT be included.
> >>
> >> PT_ID=3900, he has 1 SBP>=140 and 1 DBP>=90, so he will still NOT be
> >> included.
> >>
> >>
> >>
> >> So, the condition requires TWO OR MORE values higher than the threshold.
> >> It could be either SBP or DBP or both of them.
> >>
> >>
> >>
> >> I have tried ddply, but I don’t know how to add the condition 2+ inside
> >> ddply.
> >>
> >>
> >>
> >> Any help is appreciated!!
> >>
> >>
> >>
> >> Weijia
> >>
> >>
> >>
> >
> >         [[alternative HTML version deleted]]
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> --
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-
> biostatistics/pdb-ncb-home.htm
> 

