[R] Split rows depending on time frame

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon Oct 11 11:35:19 CEST 2010


Dear Bert,

Use the plyr package to do the magic

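library(plyr)
dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462,
	COL3 = c(40482, 40478))

# For each COL1 group (one row per group here), build the COL4 values
tmp <- ddply(dataset, "COL1", function(x){
	# number of days in the period, both end points included
	delta <- with(x, 1 + COL3 - COL2)
	# one row with value 1 per complete week
	rows <- rep(1, delta %/% 7)
	# an incomplete last week gets its fraction of 7 days
	if(delta %% 7 > 0){
		rows <- c(rows, (delta %% 7) / 7)
	}
	data.frame(COL4 = rows)
})
# Attach COL4 to the original columns (COL2 and COL3 are repeated unchanged)
merge(dataset, tmp)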
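
The merge() call above returns the original COL2 and COL3 on every row. If
the weekly start and end values are needed as well, as in the requested
output, one possible sketch in base R follows. It is only an illustration:
split_weeks is a hypothetical helper name, and it assumes COL2 <= COL3 and
that both are whole day numbers.

```r
# Hypothetical helper (not part of the code above): split one period into
# consecutive 7-day pieces and report each piece's share of a full week.
split_weeks <- function(start, end) {
  from <- seq(start, end, by = 7)  # first day of each piece
  to <- pmin(from + 6, end)        # last day, capped at the period end
  data.frame(COL2 = from, COL3 = to, COL4 = (to - from + 1) / 7)
}

dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462,
                      COL3 = c(40482, 40478))

# Apply the helper to each row and stack the resulting pieces.
pieces <- lapply(seq_len(nrow(dataset)), function(i) {
  data.frame(COL1 = dataset$COL1[i],
             split_weeks(dataset$COL2[i], dataset$COL3[i]))
})
result <- do.call(rbind, pieces)
result
```

For the example data this gives three rows per group, with the last row for
B running from 40476 to 40478 and COL4 equal to 3/7.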

HTH,

Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Jacobs
> Sent: Monday 11 October 2010 11:26
> To: r-help at r-project.org
> Subject: [R] Split rows depending on time frame
> 
> Hi,
> 
>  
> 
> I have the following data frame, where col2 is a startdate 
> and col3 an enddate
> 
>  
> 
> COL1      COL2      COL3
> A         40462     40482
> B         40462     40478
> 
>  
> 
> I would like to split the above time frame of 3 weeks into 
> weeks, like this:
> 
> COL1      COL2      COL3      COL4
> A         40462     40468     1
> A         40469     40475     1
> A         40476     40482     1
> B         40462     40468     1
> B         40469     40475     1
> B         40476     40478     0.428
> 
>  
> 
> COL4 indicates whether the time frame between COL2 and COL3 
> is a full 7 days (value 1) or shorter (the fraction of a week). 
> 
> In the example above, the last split for B contains only 3 
> days, so the value in COL4 is 3/7.
> 
>  
> 
> I can't figure out how to do the above. Can someone 
> help me out? 
> 
>  
> 
> Thx in advance,
> 
> Bert
> 


