[R] Use of apply rather than a loop

William Dunlap wdunlap at tibco.com
Sun Dec 6 21:53:58 CET 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of William Dunlap
> Sent: Friday, December 04, 2009 3:29 PM
> To: Dennis Fisher; r-help at stat.math.ethz.ch
> Subject: Re: [R] Use of apply rather than a loop
> 
> You could try using merge:
> 
>    > d <- data.frame(Subject=rep(11:13,each=3), Time=101:109,
>    +                 Marker=c(0,1,0, 0,0,0, 0,0,1))
>    > d
>      Subject Time Marker
>    1      11  101      0
>    2      11  102      1
>    3      11  103      0
>    4      12  104      0
>    5      12  105      0
>    6      12  106      0
>    7      13  107      0
>    8      13  108      0
>    9      13  109      1
>    > d$Time - merge(d,d[d$Marker==1,],by="Subject",all.x=TRUE)$Time.y
>    [1] -1  0  1 NA NA NA -2 -1  0

merge() can rearrange the rows of the data.frame, so I should
have differenced two columns of the merged data.frame, not one
column from the original and one from the merged.  E.g.,
   > d <- d[sample(nrow(d)),] # scramble rows
   > dm <- merge(d,d[d$Marker==1,],by="Subject",all.x=TRUE,
   +             suffixes=c("", "Marked"))
   > dm$TimeMinusMarkedTime <- dm$Time - NAtoZero(dm$TimeMarked)
   > dm <- dm[ with(dm, order(Subject, Time)), ] # put into nice order
   > dm$TimeMinusMarkedTime
   [1]  -1   0   1 104 105 106  -2  -1   0
   > dm
     Subject Time Marker TimeMarked MarkerMarked TimeMinusMarkedTime
   3      11  101      0        102            1                  -1
   1      11  102      1        102            1                   0
   2      11  103      0        102            1                   1
   5      12  104      0         NA           NA                 104
   4      12  105      0         NA           NA                 105
   6      12  106      0         NA           NA                 106
   7      13  107      0        109            1                  -2
   9      13  108      0        109            1                  -1
   8      13  109      1        109            1                   0

When there are lots of unique values in Subject, this
is considerably faster than any sort of looping approach.
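
For anyone who wants to paste this in one go, here is a rough, self-contained sketch of the steps above. It is a small variation, not exactly what I ran: the lookup table `marked` and the `set.seed()` call are my own additions for illustration, and renaming the column before merging sidesteps the `suffixes` argument.

```r
# Sketch only: 'marked' and set.seed() are illustrative additions.
NAtoZero <- function(x) { x[is.na(x)] <- 0; x }

d <- data.frame(Subject = rep(11:13, each = 3),
                Time    = 101:109,
                Marker  = c(0,1,0, 0,0,0, 0,0,1))
set.seed(1)
d <- d[sample(nrow(d)), ]                 # scramble rows

# One row per marked Subject, holding its reference time.
marked <- d[d$Marker == 1, c("Subject", "Time")]
names(marked)[2] <- "TimeMarked"

# Left join, then difference two columns of the *merged* data.frame.
dm <- merge(d, marked, by = "Subject", all.x = TRUE)
dm$TimeMinusMarkedTime <- dm$Time - NAtoZero(dm$TimeMarked)
dm <- dm[order(dm$Subject, dm$Time), ]    # put into nice order
print(dm$TimeMinusMarkedTime)
```

Subjects with no marked row get NA for TimeMarked, which NAtoZero() turns into 0, so their Time is left unchanged.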

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> 
> If you want the reference times for Subjects without a
> marked instance to be 0, replace the NA's in Time.y by 0:
>    > NAtoZero<-function(x){ x[is.na(x)]<-0 ; x }
>    > d$Time -
>    +    NAtoZero(merge(d,d[d$Marker==1,],by="Subject",all.x=TRUE)$Time.y)
>    [1]  -1   0   1 104 105 106  -2  -1   0
> 
> If there is more than one marked instance for
> a given subject this method will fail.
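
One possible way around that limitation, as a sketch only: reduce the marked rows to one per Subject before merging. Which marked time to use is not settled anywhere in this thread; the code below assumes the earliest one is wanted. The data frame `d2` and the names `marked` and `ref` are made up for illustration.

```r
# Sketch only: assumes the EARLIEST marked Time per Subject is the reference.
d2 <- data.frame(Subject = rep(11:12, each = 4),
                 Time    = 101:108,
                 Marker  = c(0,1,0,1, 0,0,0,0))   # Subject 11 is marked twice

marked <- d2[d2$Marker == 1, c("Subject", "Time")]
marked <- marked[order(marked$Subject, marked$Time), ]
marked <- marked[!duplicated(marked$Subject), ]   # keep first marked row per Subject
names(marked)[2] <- "TimeMarked"

dm2 <- merge(d2, marked, by = "Subject", all.x = TRUE)
stopifnot(nrow(dm2) == nrow(d2))                  # the merge must not duplicate rows

ref <- dm2$TimeMarked
ref[is.na(ref)] <- 0                              # unmarked Subjects keep their Time
dm2$TimeMinusMarkedTime <- dm2$Time - ref
dm2 <- dm2[order(dm2$Subject, dm2$Time), ]
print(dm2$TimeMinusMarkedTime)
```

The stopifnot() line is a cheap guard: if the lookup table ever has more than one row for a Subject, the merge will produce extra rows and the check will stop the script.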
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org 
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Dennis Fisher
> > Sent: Friday, December 04, 2009 2:47 PM
> > To: r-help at stat.math.ethz.ch
> > Subject: [R] Use of apply rather than a loop
> > 
> > Colleagues,
> > 
> > R 2.9.0 on all platforms
> > 
> > I have a dataset that contains three columns of interest: IDs,
> > serial elapsed times, and a marker.  Representative data:
> > Subject		Time		Marker
> > 1			100.5		0
> > 1			101			0
> > 1			102			1
> > 1			103			0
> > 1			105			0
> > 
> > For each subject, I would like to find the time associated with
> > Marker == 1, then replace Time with Time - (Time[Marker == 1]).
> > The result for this subject would be:
> > Subject		Time		Marker
> > 1			-1.5			0
> > 1			-1			0
> > 1			0			1
> > 1			1			0
> > 1			3			0
> > 
> > One proviso: some subjects do not have Marker == 1; for these  
> > subjects, I would like Time to remain unchanged.
> > 
> > At present, I am looping over each subject.  The number of
> > subjects is large so this process is quite slow.  I assume that
> > one of the apply functions could speed this markedly but I am not
> > facile with them.  Any help would be appreciated.
> > 
> > Dennis
> > 
> > Dennis Fisher MD
> > P < (The "P Less Than" Company)
> > Phone: 1-866-PLessThan (1-866-753-7784)
> > Fax: 1-866-PLessThan (1-866-753-7784)
> > www.PLessThan.com
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 