[R] arrange data

Rui Barradas ruipbarradas at sapo.pt
Sat Oct 6 13:22:14 CEST 2012


Hello,

Using Arun's data example: instead of creating a factor, convert the
years to 4 digits. Numeric 4-digit years sort chronologically, so 2000
comes after 1999.

set.seed(1)
dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),
             Bahun=rep(rep(1:12,times=3),each=3),
             x=sample(1:500,108,replace=TRUE))

dat2 <- dat1  # operate on a copy
dat2$Tahun <- with(dat2, ifelse(Tahun < 71, 2000 + Tahun, 1900 + Tahun))

agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
head(agg_dt1)
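
A small sketch of the same idea, in case it is useful for checking the result. The helper name to_year4 and its pivot argument are illustrative assumptions, not something from this thread; the default pivot of 71 simply matches the 1971 to 2000 range in the original question. The formula interface of aggregate() is used here only to get named columns, and the explicit order() call is there to make the sort order visible rather than relying on how aggregate() arranges groups.

```r
# Hypothetical helper (not from the thread): expand 2-digit years.
# Values below `pivot` are treated as 20xx, the rest as 19xx.
to_year4 <- function(yy, pivot = 71) {
  ifelse(yy < pivot, 2000 + yy, 1900 + yy)
}

set.seed(1)
dat <- data.frame(Tahun = rep(c(98, 99, 0), each = 36),
                  Bahun = rep(rep(1:12, times = 3), each = 3),
                  x     = sample(1:500, 108, replace = TRUE))
dat$Tahun <- to_year4(dat$Tahun)

# Monthly totals, then an explicit sort by year and month.
agg <- aggregate(x ~ Tahun + Bahun, data = dat, FUN = sum)
agg <- agg[order(agg$Tahun, agg$Bahun), ]
head(agg)
```

With 2-digit years the value 0 sorts ahead of 71; after the conversion, 2000 sorts after 1999, which is the order the original poster expected.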

Hope this helps,

Rui Barradas
On 06-10-2012 03:38, arun wrote:
> Hi,
>
> I hope this helps you.
>   I created a small dataset: 3 replications per month for 1998:2000.
>
> set.seed(1)
> dat1<-data.frame(Tahun=rep(c(98:99,00),each=36),Bahun=rep(rep(1:12,times=3),each=3), x=sample(1:500,108,replace=TRUE))
> dat2<-within(dat1,{Tahun<-factor(Tahun,levels=c(98,99,0))})
>
>
> agg_dt1<-aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
>   head(agg_dt1)
> #  Tahun Bahun    x
> #1    98     1 1252
> #2    99     1  680
> #3     0     1  687
> #4    98     2  761
> #5    99     2  860
> #6     0     2  786
> I guess this is what you wanted.
>
>
> In addition, you can use ddply(), which orders the groups differently but gives the same sums.
> library(plyr)
>   dd_dt1<-ddply(dat2,.(Tahun,Bahun),summarize, sum(x))
>   head(dd_dt1)
> #  Tahun Bahun  ..1
> #1    98     1 1252
> #2    98     2  761
> #3    98     3  440
> #4    98     4  597
> #5    98     5  987
> #6    98     6  692
>   tail(dd_dt1)
> #   Tahun Bahun  ..1
> #31     0     7  685
> #32     0     8  504
> #33     0     9  633
> #34     0    10  553
> #35     0    11  914
> #36     0    12 1039
>
> A.K.
>
>
>
>
>
>
> ----- Original Message -----
> From: Roslina Zakaria <zroslina at yahoo.com>
> To: "r-help at r-project.org" <r-help at r-project.org>
> Cc:
> Sent: Friday, October 5, 2012 8:09 PM
> Subject: [R] arrange data
>
> Dear r-users,
>
> I have daily rainfall data from 1971 to 2000. I use aggregate() to form monthly rainfall data.  What I don't understand is why the data for the year 2000 end up at the top, instead of the year 1971.  Here is some code and output:
>
>
> agg_dt1     <- aggregate(x=dt1[,4],by=dt1[,c(1,2)],FUN=sum)
>
>> head(agg_dt1,20); tail(agg_dt1,20)
>     Tahun Bulan     x
> 1      0     1 398.6
> 2     71     1 934.9
> 3     72     1 107.2
> 4     73     1 236.4
> 5     74     1  10.5
> 6     75     1 744.6
> 7     76     1   9.2
> 8     77     1 108.7
> 9     78     1 251.5
> 10    79     1 197.3
> 11    80     1 144.1
> 12    81     1 104.5
> 13    82     1  17.7
> 14    83     1 151.8
> ...
>
> Thank you so much for your help.
>
> Roslina
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



