[R] help with for loop: new column giving count of observation for each SITEID

Bert Gunter gunter.berton at gene.com
Tue Oct 30 23:39:03 CET 2012


Eek!

Just a bit simpler would be (à la Dr. Dunlap), where d is the data frame:

d <- within(d,index <- ave(year,site, FUN = order))

(This assumes exactly one data collection for each year that appears, though.)
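
One further caveat, offered as a sketch rather than a correction: order() returns the sorting permutation, which matches the within-site position only when the rows are already sorted by year within each site, as they are in the posted data. If the rows might arrive unsorted, rank() appears to be the safer choice. The shuffled copy of d and the column name viaOrder below are made up purely for illustration:

```r
# The data frame from the thread, without the hand-made 'index' column
d <- data.frame(
  RchID = 1:9,
  site  = factor(rep(c("A", "B", "C"), each = 3)),
  year  = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L, 2002L, 2003L, 2004L)
)

# Hypothetical: the same rows, deliberately out of year order within each site
shuffled <- d[c(3, 1, 2, 6, 5, 4, 8, 9, 7), ]

# rank() gives each year's position within its site, whatever the row order
shuffled <- within(shuffled, index <- ave(year, site, FUN = rank))

# order() gives the sorting permutation instead, which can differ from the rank
shuffled <- within(shuffled, viaOrder <- ave(year, site, FUN = order))

# Put the rows back in RchID order to compare the two columns side by side
res <- shuffled[order(shuffled$RchID), ]
res
```

With repeated years inside a site, rank() returns averaged ranks by default; its ties.method argument (for example "first") changes that, so the one-collection-per-year assumption still matters.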

Cheers,
Bert

On Tue, Oct 30, 2012 at 12:56 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
>
> You can also use this:
>  res<-do.call(rbind,lapply(split(d,d$site),function(x) data.frame(x,newindex=1:nrow(x))))
>  rownames(res)<-1:nrow(res)
>  res
> #  RchID site year index newindex
> #1     1    A 2002     1        1
> #2     2    A 2004     2        2
> #3     3    A 2005     3        3
> #4     4    B 2003     1        1
> #5     5    B 2006     2        2
> #6     6    B 2008     3        3
> #7     7    C 2002     1        1
> #8     8    C 2003     2        2
> #9     9    C 2004     3        3
> A.K.
>
>
>
> ----- Original Message -----
> From: William Dunlap <wdunlap at tibco.com>
> To: "Meredith, Christy S -FS" <csmeredith at fs.fed.us>
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tuesday, October 30, 2012 3:43 PM
> Subject: Re: [R] help with for loop: new column giving count of observation for each SITEID
>
> Your data was, in R-readable format (from dput())
>   d <- data.frame(
>        RchID = 1:9,
>        site = factor(c("A", "A", "A", "B", "B", "B", "C",
>           "C", "C"), levels = c("A", "B", "C")),
>        year = c(2002L, 2004L, 2005L, 2003L, 2006L, 2008L,
>           2002L, 2003L, 2004L),
>        index = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L))
> and I am assuming that 'index' is the desired result.  You can use
> withinGroupIndex to make a new column identical to 'index'.  There
> are a variety of ways to add that column to an existing data.frame,
> one of which is within():
>   > within(d, newIndex <- withinGroupIndex(site))
>     RchID site year index newIndex
>   1     1    A 2002     1        1
>   2     2    A 2004     2        2
>   3     3    A 2005     3        3
>   4     4    B 2003     1        1
>   5     5    B 2006     2        2
>   6     6    B 2008     3        3
>   7     7    C 2002     1        1
>   8     8    C 2003     2        2
>   9     9    C 2004     3        3
> Or is 'index' not the desired result?
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: Meredith, Christy S -FS [mailto:csmeredith at fs.fed.us]
>> Sent: Tuesday, October 30, 2012 12:20 PM
>> To: William Dunlap
>> Subject: RE: [R] help with for loop: new column giving count of observation for each
>> SITEID
>>
>> Not quite,
>>  I need it like this, a new number for each ordered year in the sequence within each site,
>> regardless of what the years are,  and to retain the RchID column.
>>
>> RchID    site    year    index
>> 1    A    2002    1
>> 2    A    2004    2
>> 3    A    2005    3
>> 4    B    2003    1
>> 5    B    2006    2
>> 6    B    2008    3
>> 7    C    2002    1
>> 8    C    2003    2
>> 9    C    2004    3
>>
>>
>> Thanks so much for your help!
>>
>>
>> -----Original Message-----
>> From: William Dunlap [mailto:wdunlap at tibco.com]
>> Sent: Tuesday, October 30, 2012 1:07 PM
>> To: Meredith, Christy S -FS; r-help at R-project.org
>> Subject: RE: [R] help with for loop: new column giving count of observation for each
>> SITEID
>>
>> Is this what you want?
>>   > withinGroupIndex <- function(group, ...)
>>   +     ave(integer(length(group)), group, ..., FUN=seq_along)
>>   > site <- c("A","A","C","D","C","A","B")
>>   > data.frame(site, index=withinGroupIndex(site))
>>     site index
>>   1    A     1
>>   2    A     2
>>   3    C     1
>>   4    D     1
>>   5    C     2
>>   6    A     3
>>   7    B     1
>>
>> You can add more arguments if the groups depend on more than one value:
>>   > year <- rep(c(1985, 2012), c(4,3))
>>   > data.frame(site, year, index=withinGroupIndex(site, year))
>>     site year index
>>   1    A 1985     1
>>   2    A 1985     2
>>   3    C 1985     1
>>   4    D 1985     1
>>   5    C 2012     1
>>   6    A 2012     1
>>   7    B 2012     1
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>> > -----Original Message-----
>> > From: r-help-bounces at r-project.org
>> > [mailto:r-help-bounces at r-project.org] On Behalf Of Meredith, Christy S
>> > -FS
>> > Sent: Tuesday, October 30, 2012 11:17 AM
>> > To: r-help at R-project.org
>> > Subject: [R] help with for loop: new column giving count of
>> > observation for each SITEID
>> >
>> >
>> > Hello,
>> > I think this is easy, but I can't seem to find a good way to do this
>> > in the R help. I have a list of sites, with multiple years of data for
>> > each site id. I want to create a new column that gives a number
>> > describing whether it is the 1st year ("1" ) the data was collected
>> > for the site, the second year ("2"), etc. I have different years for
>> > each siteid, but I don't care which year it was collected, just the
>> > order that it is in for that siteid.  This is what I have so far, but
>> > it doesn't do the analysis separately for each SiteID.
>> >
>> > indexi<-indexg[order(indexg$SiteID,indexg$Yr),]
>> >
>> > obs=0
>> > indexi=na.omit(indexi)
>> > for(i in 1:length(indexi$SiteID)){
>> > obs=obs+1
>> > indexi$obs[i]=obs
>> > }
>> >
>> >
>> > Thanks for any help you can give.
>> >
>> > Christy Meredith
>> > USDA Forest Service
>> > Rocky Mountain Research Station
>> > PIBO Monitoring
>> > Data Analyst
>> > Voice: 435-755-3573
>> > Fax: 435-755-3563
>> >
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



