[R] Summarizing data containing data/time information (as factor)

David Winsemius dwinsemius at comcast.net
Thu Sep 6 04:48:59 CEST 2012


On Sep 5, 2012, at 4:57 PM, HJ YAN wrote:

> Dear R user
> 
> I want to create a table (as below) to summarize the attached data
> (Test.csv, which can be read into R by using 'read.csv("Test.csv", header=FALSE)'
> ),

Unfortunately you did not read the information about posting attachments carefully enough and your csv file was scrubbed by the mailserver. I will attempt to recreate it:
> dat <- read.table(text="             28/04     29/04    30/04    01/05   02/05
+ 532703     0              1         1           1        0
+ 532704     1              1         1           1        1
+ 532705     0              0         1           1        0", header=TRUE)
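> # Reshape to long form: 'values' holds the 0/1 flags, 'ind' the column names
> sdat <- stack(dat)
> sdat$id <- rownames(dat)   # the three ids are recycled across the stacked columns
> # Keep only the rows where data are present, with columns ordered id, ind, values
> sdat2 <- sdat[sdat$values>0, c(3,2,1)]
> # read.table() changed "28/04" to "X28.04"; restore the dd/mm labels
> sdat2$ind <- factor( sub("\\.", "/", sdat2$ind))
> sdat2$ind <- factor( sub("X", "", sdat2$ind))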

At this point I think I have something similar to your original and will show how xtabs() can be used:

> xtabs(values ~id+ind, data=sdat2)
        ind
id       01/05 02/05 28/04 29/04 30/04
  532703     1     0     0     1     1
  532704     1     1     1     1     1
  532705     1     0     0     0     1

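The dates are not sorted chronologically because the factor levels sort as character strings, but if you used Date formatted values, they would be. (With no year in the format, as.Date() fills in the current one.)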

> sdat2$dt <- as.Date( sdat2$ind, format="%d/%m")
> xtabs(values ~id+dt, data=sdat2)
        dt
id       2012-04-28 2012-04-29 2012-04-30 2012-05-01 2012-05-02
  532703          0          1          1          1          0
  532704          1          1          1          1          1
  532705          0          0          1          1          0
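
If your file really has one row per reading, with the names in the first column and the date/time in the second, the same table might be built without stacking anything. This is only a sketch: the column names V1 and V2 are what read.csv(..., header=FALSE) would assign, and the timestamps and their "dd/mm/yyyy hh:mm" format are my assumptions, since I never saw the file.

```r
# Hypothetical long-format data: one row per reading (values invented for illustration)
raw <- data.frame(
  V1 = c(532703, 532703, 532703, 532704, 532704, 532705),
  V2 = factor(c("29/04/2012 10:00", "29/04/2012 10:30", "30/04/2012 09:00",
                "28/04/2012 08:00", "02/05/2012 12:00", "30/04/2012 23:30"))
)

# Convert the factor to character, then parse only the day part;
# trailing characters not covered by the format are ignored
raw$day <- as.Date(as.character(raw$V2), format = "%d/%m/%Y")

# Count readings per id and day; Date values give chronologically ordered columns
counts <- table(id = raw$V1, day = raw$day)

# Collapse counts to a 0/1 indicator of "any data that day"
present <- (counts > 0) * 1

# Optional: relabel the columns as dd/mm while keeping the chronological order
colnames(present) <- format(as.Date(colnames(present)), "%d/%m")
present
```

The comparison with 0 is what makes several readings on the same day count once. Days with no reading for any id will not appear as columns at all.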


> to indicate the day that there are any data available, e.g. value=1 if
> there are any data available for that day, otherwise value=0.
> 
> 
>              28/04     29/04    30/04    01/05   02/05
> 532703     0              1         1           1        0
> 532704     1              1         1           1        1
> 532705     0              0         1           1        0
> 
> Only Column A (Names: automatically stored as integer if being read into R)
> and Column B (date/time: automatically stored as factor if being read into
> R) are useful for this task.
> 
> Could anyone kindly provide me some hints/ideas about how to write some
> code to do this job please?
> 
> 
> Many thanks in advance!
> 
> Best wishes
> HJ
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA



