[R] simple subset question

arun smartpink111 at yahoo.com
Sun Dec 2 22:18:04 CET 2012


Hi,
From the ddply() output (saved below as fish1), you could get the whole row by matching the 2012 maximum against the original data:

 fish1 <- structure(list(Year = 2002:2012, maxTotal = c(1464311L, 1071051L, 
714837L, 2115018L, 850491L, 207537L, 321195L, 935599L, 194429L, 
157260L, 303259L)), .Names = c("Year", "maxTotal"), row.names = c(NA, 
-11L), class = "data.frame")


 fish[fish$Total %in% fish1$maxTotal[fish1$Year == 2012], ]  # fish (or winter) is your original dataset
#   IDWeek  Total   Fry  Smolt  FryEq Year
#21     47 303259 34008 269248 491733 2012
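
One caveat with the lookup above: matching on Total alone would also return a row from any other year that happens to have the same Total. Below is a minimal sketch of a stricter variant that joins on both Year and the maximum. The data frame `toy` and its values are made up for illustration (they are not your data), and aggregate() is used here only as a base-R stand-in for the ddply() summary.

```r
# Toy stand-in for the full dataset (hypothetical values, three years)
toy <- data.frame(
  IDWeek = c(27L, 28L, 29L, 27L, 28L, 29L, 27L, 28L, 29L),
  Total  = c(10L, 50L, 20L, 50L, 70L, 30L, 5L, 90L, 40L),
  Year   = rep(c(2012L, 2011L, 2010L), each = 3)
)

# Per-year maxima, the base-R counterpart of the ddply() summary
yearly <- aggregate(Total ~ Year, data = toy, FUN = max)

# Matching on Total alone also picks up the 2011 row whose Total is 50
loose <- toy[toy$Total %in% yearly$Total[yearly$Year == 2012], ]

# Joining on both Year and Total keeps only the 2012 maximum row
strict <- merge(toy, yearly[yearly$Year == 2012, ], by = c("Year", "Total"))

print(loose)
print(strict)
```

With your data the two versions should agree unless another year contains a week whose Total equals the 2012 maximum.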
A.K.





________________________________
From: Felipe Carrillo <mazatlanmexico at yahoo.com>
To: William Dunlap <wdunlap at tibco.com>; arun <smartpink111 at yahoo.com> 
Cc: R help <r-help at r-project.org> 
Sent: Sunday, December 2, 2012 2:34 PM
Subject: Re: [R] simple subset question



Using my whole dataset I get:
library(plyr)
ddply(winter,"Year",summarise,maxTotal=max(Total))

 fish <- structure(list(Year = 2002:2012, maxTotal = c(1464311L, 1071051L, 
714837L, 2115018L, 850491L, 207537L, 321195L, 935599L, 194429L, 
157260L, 303259L)), .Names = c("Year", "maxTotal"), row.names = c(NA, 
-11L), class = "data.frame")

I only want to extract the max Total for 2012 and want the whole row like this:
 IDWeek  Total   Fry  Smolt  FryEq Year
21     47 303259 34008 269248 491733 2012

My whole dataset is too big to post, so thanks for your help. I will try
to figure out why subset() returns zero rows.

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx



From: William Dunlap <wdunlap at tibco.com>
>To: Felipe Carrillo <mazatlanmexico at yahoo.com>; arun <smartpink111 at yahoo.com> 
>Cc: R help <r-help at r-project.org> 
>Sent: Sunday, December 2, 2012 11:00 AM
>Subject: RE: [R] simple subset question
>
>> I am still getting an error message
>> >with :
>> >  x <- subset(fish,Year==2012 & Total==max(Total));x
>> >I get:
>> >[1] IDWeek Total  Fry    Smolt  FryEq  Year
>> ><0 rows> (or 0-length row.names)
>
>The above is not an error message.  It says that there
>are no rows satisfying your criteria.  Note that Total==max(Total)
>returns a TRUE for each row in which the Total value
>equals the maximum Total value over all the years in
>the data.  Are you looking for the maximum value of Total
>in each year?
>
>> tmp <- transform(fish, YearlyMaxTotal = ave(Total, Year, FUN=max))
>> subset(tmp, Total==YearlyMaxTotal)
>  IDWeek  Total    Fry  Smolt  FryEq Year YearlyMaxTotal
>21    47 303259  34008 269248 491733 2012        303259
>39    39 157260 156909    351 157506 2011        157260
>> subset(tmp, Total==YearlyMaxTotal & Year==2012)
>  IDWeek  Total   Fry  Smolt  FryEq Year YearlyMaxTotal
>21    47 303259 34008 269248 491733 2012        303259
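
To see why the one-line subset() works on the two-year sample but returns zero rows on the full data, here is a small sketch with made-up numbers (the data frame `toy` is hypothetical). Inside subset(), max(Total) is evaluated over every row of the data frame, so as soon as the overall maximum falls in a year other than 2012 (in the full summary the largest value, 2115018, belongs to 2005), no 2012 row can satisfy both conditions.

```r
# Hypothetical two-year data where the overall maximum is NOT in 2012
toy <- data.frame(
  Total = c(10L, 50L, 20L, 30L, 70L, 40L),
  Year  = rep(c(2012L, 2011L), each = 3)
)

# max(Total) is 70 (a 2011 row), so no 2012 row matches: zero rows
global <- subset(toy, Year == 2012 & Total == max(Total))

# Restricting max() to the 2012 rows gives the intended single row
within_year <- subset(toy, Year == 2012 & Total == max(Total[Year == 2012]))

# ave() generalises this by computing the maximum within each Year
per_year <- toy[toy$Total == ave(toy$Total, toy$Year, FUN = max), ]

print(global)
print(within_year)
print(per_year)
```

The two-step approach from the original question (subset the year first, then take the maximum) is equivalent to the `within_year` line.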
>
>Bill Dunlap
>Spotfire, TIBCO Software
>wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Felipe Carrillo
>> Sent: Sunday, December 02, 2012 10:47 AM
>> To: arun
>> Cc: R help
>> Subject: Re: [R] simple subset question
>> 
>> Works with the small dataset (2 years) but I get the error message with the whole
>> dataset (12 years of data). I am going to have to check what's wrong with it...Thanks
>> 
>> 
>> From: arun <smartpink111 at yahoo.com>
>> >To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >Cc: R help <r-help at r-project.org>; R. Michael Weylandt <michael.weylandt at gmail.com>
>> >Sent: Sunday, December 2, 2012 10:29 AM
>> >Subject: Re: [R] simple subset question
>> >
>> >Hi,
>> >I am getting this:
>> >x<-subset(fish,Year==2012 & Total==max(Total))
>> > x
>> >#   IDWeek  Total   Fry  Smolt  FryEq Year
>> >#21     47 303259 34008 269248 491733 2012
>> >A.K.
>> >
>> >
>> >
>> >
>> >----- Original Message -----
>> >From: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >To: R. Michael Weylandt <michael.weylandt at gmail.com>
>> >Cc: "r-help at r-project.org" <r-help at r-project.org>
>> >Sent: Sunday, December 2, 2012 1:25 PM
>> >Subject: Re: [R] simple subset question
>> >
>> >Sorry, I was trying to subset from a bigger dataset called 'winter' and forgot to
>> >change the variable names when I asked the question. David W's suggestion works, but
>> >the strange part is that I am still getting an error message with:
>> >  x <- subset(fish,Year==2012 & Total==max(Total));x
>> >I get:
>> >[1] IDWeek Total  Fry    Smolt  FryEq  Year
>> ><0 rows> (or 0-length row.names)
>> >
>> >I will start a fresh session to see if that helps...Thank you all
>> >
>> >
>> >From: R. Michael Weylandt <michael.weylandt at gmail.com>
>> >>To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >>Cc: "r-help at r-project.org" <r-help at r-project.org>
>> >>Sent: Sunday, December 2, 2012 9:42 AM
>> >>Subject: Re: [R] simple subset question
>> >>
>> >>On Sun, Dec 2, 2012 at 5:21 PM, Felipe Carrillo
>> >><mazatlanmexico at yahoo.com> wrote:
>> >>>  Hi,
>> >>> Consider the small dataset below. I want to subset by two variables in
>> >>> one line but it won't work...it works, though, if I subset separately. I must
>> >>> be missing something obvious that I did not notice before while using subset().
>> >>>
>> >>> fish <- structure(list(IDWeek = c(27L, 28L, 29L, 30L, 31L, 32L, 33L,
>> >>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>> >>> 47L, 48L, 49L, 50L, 51L, 52L, 27L, 28L, 29L, 30L, 31L, 32L, 33L,
>> >>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>> >>> 47L, 48L, 49L, 50L, 51L, 52L), Total = c(0L, 0L, 326L, 1735L,
>> >>> 1807L, 2208L, 3883L, 8820L, 6060L, 19326L, 63158L, 100718L, 53015L,
>> >>> 91689L, 152629L, 122708L, 61293L, 15574L, 86538L, 75365L, 303259L,
>> >>> 19691L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>> >>> 13202L, 19726L, 30518L, 84949L, 157260L, 145691L, 85801L, 62044L,
>> >>> 44439L, 23272L, 22391L, 20159L, 14854L, 35379L, 31142L, 7736L,
>> >>> 13221L, 4894L), Fry = c(0L, 0L, 326L, 1735L, 1807L, 2208L, 3883L,
>> >>> 8759L, 6060L, 19326L, 63119L, 100524L, 52582L, 88170L, 145564L,
>> >>> 111416L, 38233L, 5248L, 17826L, 11038L, 34008L, 215L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L, 13055L, 19488L,
>> >>> 30518L, 84818L, 156909L, 144786L, 84207L, 57720L, 31049L, 6858L,
>> >>> 1616L, 719L, 364L, 49L, 0L, 0L, 0L, 0L), Smolt = c(0L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 62L, 0L, 0L, 38L, 195L, 433L, 3518L, 7067L, 11290L,
>> >>> 23058L, 10327L, 68712L, 64328L, 269248L, 19479L, 0L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 147L, 238L, 0L, 131L, 351L,
>> >>> 905L, 1592L, 4324L, 13391L, 16414L, 20774L, 19444L, 14491L, 35330L,
>> >>> 31142L, 7736L, 13221L, 4894L), FryEq = c(0L, 0L, 326L, 1735L,
>> >>> 1807L, 2208L, 3883L, 8864L, 6060L, 19326L, 63185L, 100854L, 53318L,
>> >>> 94151L, 157576L, 130610L, 77432L, 22805L, 134639L, 120393L, 491733L,
>> >>> 33327L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>> >>> 13306L, 19894L, 30518L, 85042L, 157506L, 146328L, 86914L, 65073L,
>> >>> 53812L, 34763L, 36931L, 33769L, 24998L, 60110L, 52938L, 13149L,
>> >>> 22476L, 8319L), Year = c(2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L)), .Names = c("IDWeek", "Total", "Fry", "Smolt", "FryEq",
>> >>> "Year"), row.names = c(NA, 52L), class = "data.frame")
>> >>> fish
>> >>> #  Subset to get the max Total for 2012
>> >>>  x <- subset(winter,Year==2012 & Total==max(Total));x  # How come one line doesn't work?
>> >>
>> >>Works fine for me if I change "winter" to fish here.
>> >>
>> >>subset(fish,Year==2012 & Total==max(Total))
>> >>  IDWeek  Total   Fry  Smolt  FryEq Year
>> >>21    47 303259 34008 269248 491733 2012
>> >>
>> >>>
>> >>>  # It works if I subset the year first and then get the Total max from it
>> >>>  xx <- subset(winter,Year==2012)
>> >>> xxx <- subset(xx,Total==max(Total));xxx
>> >>> xxx
>> >>>
>> >>> ______________________________________________
>> >>> R-help at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.