[R] Yearly statistics
Gabor Grothendieck
ggrothendieck at gmail.com
Mon May 28 14:34:40 CEST 2007
Here are a couple of solutions:
1. Using the zoo package
First add Date to the header so that there are as many
column headers as columns, then read the data in using
read.zoo. Then aggregate over years using mean; because the
mean of a logical vector is the proportion of TRUE values,
this gives the yearly proportion directly. For more on zoo
try library(zoo); vignette("zoo") and for more on dates see
the R News 4/1 help desk article.
# added Date to the header
Lines <- "Date open high low close hc lc
2004-12-29 4135 4135 4106 4116 8 -21
2004-12-30 4120 4131 4115 4119 15 -1
2004-12-31 4123 4124 4114 4117 5 -5
2005-01-04 4106 4137 4103 4137 20 -14
2005-01-06 4085 4110 4085 4096 10 -15
2005-01-10 4133 4148 4122 4139 15 -11
2005-01-11 4142 4158 4127 4130 19 -12
2005-01-12 4113 4138 4112 4127 18 8
"
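library(zoo)
# to read from a file instead of the Lines string:
# z <- read.zoo("myfile.dat", header = TRUE)
z <- read.zoo(textConnection(Lines), header = TRUE)
# logical series of days meeting the criterion, averaged within each year
aggregate(z[,"hc"] > 0 & z[,"lc"] < 0, function(x) format(x, "%Y"), mean)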
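To see why taking the mean gives the proportion, here is a minimal base-R check that does not need zoo. The hc and lc values are copied from the sample data; the variable names hit and yr are only illustrative.

```r
# hc and lc columns copied from the sample data
hc <- c(8, 15, 5, 20, 10, 15, 19, 18)
lc <- c(-21, -1, -5, -14, -15, -11, -12, 8)
# year of each row, typed in by hand for this illustration
yr <- c(2004, 2004, 2004, 2005, 2005, 2005, 2005, 2005)

# TRUE on days where hc is positive and lc is negative
hit <- hc > 0 & lc < 0

# TRUE counts as 1 and FALSE as 0, so the mean is the share of TRUE values
prop_2004 <- mean(hit[yr == 2004])
prop_2005 <- mean(hit[yr == 2005])
```

Here prop_2005 is the same as sum(hit[yr == 2005]) / sum(yr == 2005).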
2. Using data frames and tapply
Read the data in as a data frame, compute the year of each
row, and use tapply to take the mean by year:
# Lines is from above
# dat <- read.table("myfile.dat", header = TRUE)
dat <- read.table(textConnection(Lines), header = TRUE)
year <- as.numeric(format(as.Date(dat$Date), "%Y"))
tapply(dat$hc > 0 & dat$lc < 0, year, mean)
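On the question of datasets A and B where B is a subset of A: one possible approach, sketched here under assumptions, is to flag which dates of A also occur in B and take the mean of that flag by year. The names dates_A and dates_B are hypothetical; with zoo objects they could presumably be obtained as time(A) and time(B).

```r
# Hypothetical example: dates_A holds every observation date,
# dates_B holds only the dates that met the criterion (a subset of dates_A)
dates_A <- as.Date(c("2004-12-29", "2004-12-30", "2004-12-31",
                     "2005-01-04", "2005-01-06", "2005-01-10",
                     "2005-01-11", "2005-01-12"))
dates_B <- dates_A[1:7]

# TRUE for each date of A that also appears in B
in_B <- dates_A %in% dates_B

# year of each date in A, as a character string such as "2005"
year_A <- format(dates_A, "%Y")

# share of A's dates per year that are matched in B
prop_by_year <- tapply(in_B, year_A, mean)
```

This assumes the dates in each dataset are unique; with duplicated dates the counts would need more care.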
On 5/27/07, Alfonso Sammassimo <cincinattikid at bigpond.com> wrote:
> Dear R-experts,
>
> Sorry if I've overlooked a simple solution here. I have calculated the
> proportion of observations which meet a criterion, applied to
> five years of data. How can I break down this proportion statistic for each
> year?
>
> For example (data in zoo format):
>
> open high low close hc lc
> 2004-12-29 4135 4135 4106 4116 8 -21
> 2004-12-30 4120 4131 4115 4119 15 -1
> 2004-12-31 4123 4124 4114 4117 5 -5
> 2005-01-04 4106 4137 4103 4137 20 -14
> 2005-01-06 4085 4110 4085 4096 10 -15
> 2005-01-10 4133 4148 4122 4139 15 -11
> 2005-01-11 4142 4158 4127 4130 19 -12
> 2005-01-12 4113 4138 4112 4127 18 8
>
> Statistic of interest is proportion of times that sign of "hc" is positive
> and sign of "lc" is negative on any given day. Looking to return something
> like:
>
> Yr Prop
> 2004 1.0
> 2005 0.8
>
> Along these lines, if I have datasets A and B, where B is a subset of A, can
> I use the number of matching dates to calculate the yearly proportions in
> question?
>
> Thanks,
> Alfonso Sammassimo
> Melbourne Australia