[R] aggregate and the $ operator

Fri Jan 22 22:38:02 CET 2016

Using column names where you used column numbers would work, since
example["Nuclei"] is still a one-column data frame that keeps its name:

example <- data.frame(
    check.names = FALSE,
    Nuclei = c(133L, 96L, 62L, 60L),
    `Positive Nuclei` = c(96L, 70L, 52L, 50L),
    Slide = factor(c("A1", "A1", "A2", "A2"), levels = c("A1", "A2")))
aggregate(example["Nuclei"], by=example["Slide"], sum)
#  Slide Nuclei
#1    A1    229
#2    A2    122
aggregate(example[1], by=example[3], sum)
#  Slide Nuclei
#1    A1    229
#2    A2    122
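
For what it's worth, the difference comes from what each indexing form
returns: example["Nuclei"] is a one-column data frame that carries its
name, while example$Nuclei is a bare vector with no name attached. Below
is a sketch of two other base R spellings that keep the names. The data
frame is rebuilt so the block runs on its own, and the result names
by_list and by_formula are only illustrative:

```r
example <- data.frame(
    check.names = FALSE,
    Nuclei = c(133L, 96L, 62L, 60L),
    `Positive Nuclei` = c(96L, 70L, 52L, 50L),
    Slide = factor(c("A1", "A1", "A2", "A2"), levels = c("A1", "A2")))

# Option 1: keep using `$`, but name the list elements yourself so
# aggregate() has something to label the output columns with.
by_list <- aggregate(list(Nuclei = example$Nuclei),
                     by = list(Slide = example$Slide), FUN = sum)

# Option 2: the formula interface, which looks the names up in `data`.
by_formula <- aggregate(Nuclei ~ Slide, data = example, FUN = sum)

print(by_list)
print(by_formula)
```

Both should give the same Slide/Nuclei table as the calls above.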

Many people find that the functions in the dplyr or plyr packages
are worth the effort to learn.
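
As a rough sketch of what that could look like with dplyr (assuming the
package is installed; the result name totals is just illustrative), the
per-slide sums might be written like this, with every column referred to
by name:

```r
library(dplyr)

example <- data.frame(
    check.names = FALSE,
    Nuclei = c(133L, 96L, 62L, 60L),
    `Positive Nuclei` = c(96L, 70L, 52L, 50L),
    Slide = factor(c("A1", "A1", "A2", "A2"), levels = c("A1", "A2")))

# Group the rows by slide, then sum each count column within a group.
# Backticks are needed for the column name that contains a space.
totals <- example %>%
    group_by(Slide) %>%
    summarise(Nuclei = sum(Nuclei),
              `Positive Nuclei` = sum(`Positive Nuclei`))

print(totals)
```

The result is a tibble with one row per slide rather than a plain data
frame; as.data.frame(totals) converts it back if that matters.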

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jan 22, 2016 at 11:20 AM, Ed Siefker <ebs15242 at gmail.com> wrote:

> Aggregate does the right thing with column names when passing it
> numerical coordinates.
> Given a dataframe like this:
>
>   Nuclei Positive Nuclei Slide
> 1    133              96    A1
> 2     96              70    A1
> 3     62              52    A2
> 4     60              50    A2
>
> I can call 'aggregate' like this:
>
> > aggregate(example[1], by=example[3], sum)
>   Slide Nuclei
> 1    A1    229
> 2    A2    122
>
> But that means I have to keep track of which column is which number.
> If I try it the easy way, it doesn't keep track of column names and it
> forces me to coerce the 'by' to a list.
>
> > aggregate(example$Nuclei, by=list(example$Slide), sum)
>   Group.1   x
> 1      A1 229
> 2      A2 122
>
> Is there a better way to do this?  Thanks
> -Ed