[R] Subset data in long format

Gabor Grothendieck ggrothendieck at gmail.com
Tue Jun 6 23:37:48 CEST 2006


Try this (it assumes the rows are already sorted by id, as in your example):

subset(long, seq(id) - match(id,id) < 6)
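
To see why the one-liner works, here is a minimal sketch that spells out each piece on the example data from the quoted message. The set.seed() call and the intermediate names (row_no, first_row, offset, within_id) are my own additions for illustration. The ave() variant at the end is an alternative I am suggesting, not part of the original reply; it should not depend on the rows being sorted by id.

```r
# Rebuild the example data from the quoted message
# (set.seed is an addition, only so the run is reproducible).
set.seed(1)
tmp  <- data.frame(id = 1:3, matrix(rnorm(30), ncol = 10))
long <- reshape(tmp, idvar = "id", varying = list(names(tmp)[2:11]),
                v.names = "item", timevar = "position", direction = "long")
long <- long[order(long$id), ]
long <- long[c(-2, -13), ]          # make the data unbalanced

# Running row number 1..n. seq(id) gives the same thing here, but
# seq_along() also behaves when the vector happens to have length 1.
row_no <- seq_along(long$id)

# match(id, id) returns, for every row, the index of the FIRST row
# that carries the same id.
first_row <- match(long$id, long$id)

# Zero-based position of each row inside its id block.
# This is only meaningful when all rows of an id are contiguous.
offset <- row_no - first_row

# Keep offsets 0..5, i.e. the first six rows of every id.
first6 <- long[offset < 6, ]

# Alternative sketch: count rows within each id with ave(),
# which does not require the data to be sorted by id.
within_id  <- ave(seq_along(long$id), long$id, FUN = seq_along)
first6_ave <- long[within_id <= 6, ]

table(first6$id)
```

Both versions are vectorised, so there is no explicit loop over the ids. If an id has fewer than six rows, all of its rows are kept.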

On 6/6/06, Doran, Harold <HDoran at air.org> wrote:
> I have data in a "long" format where each row is one observation on a
> student, so each student occupies multiple rows. I need to subset these
> data based on a condition which I am having difficulty defining.
>
> The dataset I am working with is large, but here is a simple data
> structure to illustrate the issue
>
> tmp <- data.frame(id = 1:3, matrix(rnorm(30), ncol=10) )
> long <- reshape(tmp, idvar='id', varying=list(names(tmp)[2:11]),
> v.names=('item'),timevar='position' , direction='long')
> long <- long[order(long$id) , ]
> long <- long[c(-2,-13),]
>
> What I need to do is subset these data so I have the first 6 rows for
> each unique ID. The problem is that the data are unbalanced in that each
> ID has a different number of observations (which is why I removed obs 2
> and 13).
>
> If the data were balanced, the subset would be trivial and I could just
> do
>
> long <- subset(long, position < 7)
>
> However, the data are not balanced. Consequently, if I were to do this
> for the unbalanced data I would not have the first 6 obs for the first
> ID; I would only have the first 5. Theoretically, what I want for
> id 1 (and for each unique id) is this
>
> ID1 <- subset(long, id==1)
> ID1[1:6,]
>
> However, the goal is to subset the entire dataframe at once such that
> the subset returns a new dataframe with the first 6 rows for each unique
> id. Is there a feasible method for doing this subset that anyone can
> suggest? My actual dataset has more than 24,000 unique ids, so I am
> hoping to avoid looping through this if possible.
>
> Thanks,
> Harold
>
>