[R] by group problem

Gabor Grothendieck ggrothendieck at gmail.com
Fri Aug 31 15:52:40 CEST 2007


See the examples labelled "head" in the Examples section near the bottom of:

http://sqldf.googlecode.com/svn/trunk/man/sqldf.Rd

These show how to do it using order as well as using SQL via sqldf.
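
For reference, here is a minimal base-R sketch of the order-based idea. It is not copied from the sqldf examples. The data frame `census` and the helper `top_n_by_state` are hypothetical names with made-up values, and the column names are assumed to be State, County and PercentOld as described in the question. The point is to sort whole rows rather than a single column, so County stays attached to its value:

```r
# Toy stand-in for the census data (values are made up for illustration).
census <- data.frame(
  State      = c("A", "A", "A", "B", "B", "B"),
  County     = c("a1", "a2", "a3", "b1", "b2", "b3"),
  PercentOld = c(10, 30, 20, 5, 15, 25)
)

# Hypothetical helper: return the n rows with the largest PercentOld per state.
top_n_by_state <- function(df, n = 5) {
  # Sort by state, then by PercentOld in decreasing order within each state.
  sorted <- df[order(df$State, -df$PercentOld), ]
  # Position of each row within its state (1 = highest PercentOld).
  rank_in_state <- ave(seq_len(nrow(sorted)), sorted$State, FUN = seq_along)
  # Keep whole rows, so County travels along with the value.
  sorted[rank_in_state <= n, ]
}

top2 <- top_n_by_state(census, n = 2)
print(top2)
```

Ties are not treated specially here: if several counties share a value, whichever ones sort first are kept. States with fewer than n counties simply return all of their rows.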

On 8/31/07, Cory Nissen <cnissen at akoyainc.com> wrote:
> I am working with census data.  My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US
>
> There are about 3100 rows, with each row corresponding to a county within a state.
>
> I want to return the top five "PercentOld" by state.  But I want the County and the Value.
>
> I tried this...
>
> topN <- function(column, n=5)
>  {
>    column <- sort(column, decreasing=T)
>    return(column[1:n])
>  }
> top5PerState <- tapply(data$percentOld, data$STATE, topN)
>
> But this only returns the value for "percentOld" per state, I also want the corresponding County.
>
> I think I'm close, but I just can't get it...
>
> Thanks
>
> cn