[R] group bunch of lines in a data.frame, an additional requirement

Emmanuel Levy emmanuel.levy at gmail.com
Thu Sep 14 22:35:33 CEST 2006


Thanks Gabor, that is much faster than using a loop!

I've got a last question:

Can you think of a fast way of keeping track of the number of
observations collapsed for each entry?

i.e. I'd like to end up with:

A 2.0 400 ID1 3 (3 obs in the first matrix)
B 0.7 35  ID2 2 (2 obs in the first matrix)
C 5.0 70  ID1 1 (1 obs in the first matrix)

Or is it necessary to use a temporary matrix that is merged later (as
exemplified by Mark in a previous email)?
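
To make the question concrete, here is a minimal sketch of the "temporary
table merged later" approach, not necessarily the fastest one. It assumes
the data frame is called DF with columns V1..V4, as in Gabor's reply; the
names means, counts, res and the count column n are made up for
illustration. The counts come from a second aggregate() call that uses
length() instead of mean().

```r
# Sample data mirroring the example in this thread.
# The column names V1..V4 are an assumption taken from Gabor's output.
DF <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C"),
  V2 = c(1.0, 3.0, 2.0, 0.5, 0.9, 5.0),
  V3 = c(200, 800, 200, 20, 50, 70),
  V4 = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID1"),
  stringsAsFactors = FALSE
)

# Group means, exactly as in Gabor's call.
means <- aggregate(DF[2:3], DF[c(1, 4)], mean)

# Group sizes: aggregate one column with length() over the same keys.
counts <- aggregate(DF[2], DF[c(1, 4)], length)
names(counts)[3] <- "n"   # hypothetical name for the count column

# Join on the grouping keys rather than relying on matching row order.
res <- merge(means, counts, by = c("V1", "V4"))
res <- res[order(res$V1), c("V1", "V2", "V3", "V4", "n")]
print(res)
```

With the six rows above, n should come out as 3, 2 and 1 for A, B and C.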

Thanks a lot for your help,

  Emmanuel

On 9/13/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> See below.
>
> On 9/13/06, Emmanuel Levy <emmanuel.levy at gmail.com> wrote:
> > Thanks for pointing me to "aggregate", that works fine!
> >
> > There is one complication though: I have mixed types (numerical and character),
> >
> > So the matrix is of the form:
> >
> > A 1.0 200 ID1
> > A 3.0 800 ID1
> > A 2.0 200 ID1
> > B 0.5 20   ID2
> > B 0.9 50   ID2
> > C 5.0 70   ID1
> >
> > One letter always has the same ID but one ID can be shared by many
> > letters (like ID1)
> >
> > I just want to keep track of the ID, and get a matrix like:
> >
> > A 2.0 400 ID1
> > B 0.7 35 ID2
> > C 5.0 70 ID1
> >
> > Any idea on how to do that without a loop?
>
> If V4 is a function of V1 then you can aggregate by it too and it will
> appear but have no effect on the classification:
>
> > aggregate(DF[2:3], DF[c(1,4)], mean)
>   V1  V4  V2  V3
> 1  A ID1 2.0 400
> 2  C ID1 5.0  70
> 3  B ID2 0.7  35
>
>
> >
> >  Many thanks,
> >
> >     Emmanuel
> >
> > On 9/12/06, Emmanuel Levy <emmanuel.levy at gmail.com> wrote:
> > > Hello,
> > >
> > > I'd like to group the lines of a matrix so that:
> > > A 1.0 200
> > > A 3.0 800
> > > A 2.0 200
> > > B 0.5 20
> > > B 0.9 50
> > > C 5.0 70
> > >
> > > Would give:
> > > A 2.0 400
> > > B 0.7 35
> > > C 5.0 70
> > >
> > > So all lines corresponding to a letter (level), become a single line
> > > where all the values of each column are averaged.
> > >
> > > I've done that with a loop but it doesn't seem right (it is very
> > > slow). I imagine there is a sort of "apply" shortcut but I can't
> > > figure it out.
> > >
> > > Please note that it is not exactly a matrix I'm using: the function
> > > "typeof" tells me it's a list, although I access it as if it were a
> > > matrix.
> > >
> > > Could someone help me with the right function to use, a help topic or
> > > a piece of code?
> > >
> > > Thanks,
> > >
> > >   Emmanuel
> > >
> >
>


