[R] Handling time-series-Data

Gabor Grothendieck ggrothendieck at gmail.com
Thu Sep 11 15:00:02 CEST 2008


On Thu, Sep 11, 2008 at 3:37 AM, Kunzler, Andreas <a.kunzler at bzaek.de> wrote:
> Dear List,
>
> I ran into some problems with time-series-Data.
>
> Imagine a data structure where observations (x) of test attendants (i) are made four times (q) a year (y). The data is ordered the following way:
> I       y       q       x
> 1       2006    1       1
> 1       2006    3       1
> 1       2006    4       1
> 1       2007    1       1
> 1       2007    2       1
> 1       2007    3       1
> 1       2007    4       1
> 2       2006    1       1
> 3       2007    1       1
> 3       2007    2       1
>
> I am looking for a way to count the attendants that have attended at least once a year. In this case 2 persons, because i=2 has no observation in 2007.
>


Don't you mean 1 person, not 2 persons? Note that
- attendant 1 appears in both years,
- attendant 2 appears only in 2006, and
- attendant 3 appears only in 2007,
so only attendant 1 appears in both years, i.e. 1 person.

Assuming DF is your data frame:

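# one row per distinct (attendant, year) pair
u <- unique(DF[1:2])
# count attendants whose number of distinct years equals the total number of years
with(u, sum(tapply(y, I, length) == length(unique(y))))  # 1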
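
For a self-contained check, the sample data from the question can be rebuilt along these lines. This is only a sketch: the column names I, y, q, x are taken from the posted table, while DF, years_per_I and n_all_years are names assumed here for illustration.

```r
# Rebuild the ten rows shown in the question (x is 1 throughout)
DF <- data.frame(
  I = c(1, 1, 1, 1, 1, 1, 1, 2, 3, 3),
  y = c(2006, 2006, 2006, 2007, 2007, 2007, 2007, 2006, 2007, 2007),
  q = c(1, 3, 4, 1, 2, 3, 4, 1, 1, 2),
  x = 1
)

# Distinct (attendant, year) pairs: the first two columns only
u <- unique(DF[1:2])

# Number of distinct years in which each attendant was observed
years_per_I <- tapply(u$y, u$I, length)

# Attendants observed in every year that occurs in the data
n_all_years <- sum(years_per_I == length(unique(u$y)))
n_all_years
```

Printing years_per_I instead of the final sum shows which attendants fall short of the full set of years.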


> I thought about creating a subset with the duplicated function, but I can't find a way to control for both (i) and (y).
>
> subset(data, !duplicated(i[y]))
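
A duplicated()-based variant also seems workable if both columns are passed to it together, so that a row counts as a duplicate only when attendant and year both repeat. The following is a sketch under assumptions: the data frame is called DF with columns I and y as in the posted table, and first_obs and n_complete are names made up for illustration.

```r
# Sample data from the question (only the columns needed here)
DF <- data.frame(
  I = c(1, 1, 1, 1, 1, 1, 1, 2, 3, 3),
  y = c(2006, 2006, 2006, 2007, 2007, 2007, 2007, 2006, 2007, 2007)
)

# Keep the first row of each (attendant, year) combination
first_obs <- subset(DF, !duplicated(DF[c("I", "y")]))

# Count attendants that appear in as many years as the data contain
n_complete <- sum(table(first_obs$I) == length(unique(DF$y)))
n_complete
```
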
>
> Thanx so much
>
> Andreas Kunzler
> ____________________________
> Bundeszahnärztekammer (BZÄK)
> Chausseestraße 13
> 10115 Berlin
>
> Tel.: 030 40005-113
> Fax:  030 40005-119
>
> E-Mail: a.kunzler at bzaek.de
>
