[R] Accessing the index of factor in by() function

Sarah Goslee sarah.goslee at gmail.com
Mon Aug 1 19:52:01 CEST 2011


Merik,

You did get an answer to the question, and it's even included in the material
below.

What doesn't work for you in Ista's suggestion?

id    <- c(1,1,1,1,1,2,2,2,3,3,3)
month <- c(1, 1, 2, 3, 6, 2, 3, 6, 1, 3, 5)
value <- c(10, 12, 11, 14, 16, 12, 10, 8, 14, 11, 15)
dat.tmp <- data.frame(id, month, value)

my.plot <- function(dat) {print(dat[, c("id", "value")])}
by(dat.tmp, id, my.plot)
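
As for getting at the id itself: each piece that by() hands to the function is a data frame that still contains the id column, so the level can be read from the piece. A minimal sketch, assuming that is what you are after (show.group is a made-up name for illustration, not something from this thread):

```r
id    <- c(1,1,1,1,1,2,2,2,3,3,3)
month <- c(1, 1, 2, 3, 6, 2, 3, 6, 1, 3, 5)
value <- c(10, 12, 11, 14, 16, 12, 10, 8, 14, 11, 15)
dat.tmp <- data.frame(id, month, value)

# show.group is a hypothetical helper; 'dat' is the subset of rows for one id
show.group <- function(dat) {
  this.id <- dat$id[1]   # every row in the subset shares the same id
  cat("id:", this.id, " months:", dat$month, "\n")
  this.id                # return the level so it can be checked afterwards
}

res <- by(dat.tmp, dat.tmp$id, show.group)
```

Passing dat.tmp$id rather than the bare id keeps the call working even if the standalone id vector is later removed or changed.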

But if for some reason you need to get the separate sections, not just
act on them, this might also work:


dat.split <- split(dat.tmp, dat.tmp$id)
lapply(dat.split, my.plot)
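
The list returned by split() is named after the levels of id, so the level can also be passed to the function alongside each piece. A rough sketch using Map(); summarise.one and its output columns are hypothetical names chosen for illustration:

```r
id    <- c(1,1,1,1,1,2,2,2,3,3,3)
month <- c(1, 1, 2, 3, 6, 2, 3, 6, 1, 3, 5)
value <- c(10, 12, 11, 14, 16, 12, 10, 8, 14, 11, 15)
dat.tmp <- data.frame(id, month, value)

dat.split <- split(dat.tmp, dat.tmp$id)

# summarise.one is a hypothetical helper: 'level' is the name of the list
# element (the id as a character string), 'dat' is the matching subset
summarise.one <- function(level, dat) {
  # a plot() call titled with 'level' could go here instead
  data.frame(id = level, n = nrow(dat), mean.value = mean(dat$value))
}

# Map() walks the names and the pieces in parallel
out <- do.call(rbind, Map(summarise.one, names(dat.split), dat.split))
print(out)
```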

Sarah

On Mon, Aug 1, 2011 at 1:34 PM, Merik Nanish <merik.nanish at gmail.com> wrote:
> Since I didn't get an answer to this question, I'm rephrasing my question in
> simpler terms:
>
> I have a dataframe and I want to split it based on the levels of one of its
> columns, and apply a function to each section of the data. Output of the
> function may be drawing a plot, returning a value, whatever. I want to do
> it efficiently though (for loops are very slow).
>
> How can I do that?
>
> M
>
> On Tue, Jul 26, 2011 at 10:12 AM, Ista Zahn <izahn at psych.rochester.edu> wrote:
>
>> Hi Merik,
>> Please keep the mailing list copied.
>>
>> On Tue, Jul 26, 2011 at 6:44 AM, Merik Nanish <merik.nanish at gmail.com>
>> wrote:
>> > You can convert my data into a dataframe simply by dat <- data.frame(id,
>> > month, value). That doesn't help though.
>>
>> Can you be more specific? What is the problem you are having?
>>
>> > And no, that's not what I'm looking for. What I intend to do is for by
>> > to loop through the data based on levels of "id" factor (1,2, and 3),
>> > and for each level, for my function to print out the values of "value"
>> > and "month" belonging to the section of data with that "id".
>>
>> OK, easy enough:
>>
>> dat.tmp <- data.frame(id, month, value)
>> my.plot <- function(dat) {print(dat[, c("id", "value")])}
>> by(dat.tmp, id, my.plot)
>>
>> > Right now, I achieve this with a for loop but I want to avoid looping
>> > in the data as much as possible.
>>
>> Why? What do you have against loops?
>>
>> Best,
>> Ista
>>
>> >
>> > On Tue, Jul 26, 2011 at 12:18 AM, Ista Zahn <izahn at psych.rochester.edu>
>> > wrote:
>> >>
>> >> Hi Merik,
>> >> by() works most easily with data.frames. Is this what you are after?
>> >>
>> >> my.plot <- function(dat) { print(dat$value);
>> >> print(dat$month[dat$id==dat$value]) }
>> >> by(dat.tmp, id, my.plot)
>> >>
>> >> Best,
>> >> Ista
>> >>
>> >> On Mon, Jul 25, 2011 at 9:19 PM, Merik Nanish <merik.nanish at gmail.com>
>> >> wrote:
>> >> > Hello,
>> >> >
>> >> > Here are three vectors to give context to my question below:
>> >> >
>> >> > id    <- c(1,1,1,1,1,2,2,2,3,3,3)
>> >> > month <- c(1, 1, 2, 3, 6, 2, 3, 6, 1, 3, 5)
>> >> > value <- c(10, 12, 11, 14, 16, 12, 10, 8, 14, 11, 15)
>> >> >
>> >> > and I want to plot "value" over "month" separately for each "id".
>> >> > Before I can do that, I need to section both month and value, based
>> >> > on ID. I create a my.plot function like this (at this point, it
>> >> > doesn't draw any plots, it is just an effort to help me understand
>> >> > what I'm doing):
>> >> >
>> >> > my.plot <- function(y) { print(y); print(month[id==y]) }
>> >> >
>> >> > Now, I tried:
>> >> >
>> >> > by(value, id, my.plot)
>> >> >
>> >> > But of course, it didn't do what I wanted. I realized that the
>> >> > parameter passed to my.plot is a "section of value" per ID, and not
>> >> > the ID value itself. Question is, how can I get the value of factor
>> >> > ID at each level of by()?
>> >> >
>> >> > Please advise,
>> >> >
>> >> > Merik
>> >> >



-- 
Sarah Goslee
http://www.functionaldiversity.org


