[R] data frame manipulation with zero rows

arnaud Gaboury arnaud.gaboury at gmail.com
Wed Jun 2 16:19:12 CEST 2010


I really do think that is a very good idea.
Thank you.

> -----Original Message-----
> From: h.wickham at gmail.com [mailto:h.wickham at gmail.com] On Behalf Of Hadley Wickham
> Sent: Wednesday, June 02, 2010 3:31 PM
> To: arnaud Gaboury
> Cc: Peter Ehlers; r-help at r-project.org; Prof Brian Ripley
> Subject: Re: [R] data frame manipulation with zero rows
> 
> Hi Arnaud,
> 
> I've added this case to the set of test cases in plyr and it will be
> fixed in the next version.
> 
> Hadley
> 
> On Tue, Jun 1, 2010 at 2:33 PM, arnaud Gaboury
> <arnaud.gaboury at gmail.com> wrote:
> > Maybe not the cleanest way, but I create a fake data frame with one row so
> > ddply() is happy!!
> >> if (nrow(futures)==0) futures<-data.frame(.......)
> >
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
> >> Sent: Tuesday, June 01, 2010 12:07 PM
> >> To: arnaud Gaboury
> >> Cc: 'Prof Brian Ripley'; r-help at r-project.org
> >> Subject: Re: [R] data frame manipulation with zero rows
> >>
> >> On 2010-06-01 1:53, arnaud Gaboury wrote:
> >> > Brian,
> >> >
> >> > If I understand correctly, I must use something other than ddply() in my
> >> > function if I want to avoid an error each time my df has zero rows?
> >> > Am I correct?
> >> >
> >>
> >> You could define a function to handle the zero-rows case:
> >>
> >> f <- function(x){
> >>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
> >>   else
> >>     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>                      POSITION=sum(QUANTITY))[,c(1,3,2)]
> >>   out
> >> }
> >> f(futures)
> >>
> >>   -Peter Ehlers
> >>
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> >> >> Sent: Tuesday, June 01, 2010 9:47 AM
> >> >> To: arnaud Gaboury
> >> >> Subject: Re: [R] data frame manipulation with zero rows
> >> >>
> >> >> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> >> >>
> >> >>> Dear group,
> >> >>>
> >> >>> Here is the kind of data.frame I obtain every day with my function:
> >> >>>
> >> >>> futures<-
> >> >>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> >> >>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> >> >>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> >> >>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> >> >>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> >> >>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >> >>>     QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> >> >>> c("373.2500",
> >> >>>     "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >> >>>     "90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
> >> >>>     )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> >> >>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >> >>>
> >> >>> I then need to apply the following line of code to the df:
> >> >>>
> >> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >> >>>     POSITION=sum(QUANTITY))[,c(1,3,2)]
> >> >>>
> >> >>> It works perfectly in most cases, BUT I have a new problem: it can
> >> >>> sometimes happen that my df "futures" is empty, with zero rows.
> >> >>>
> >> >>>
> >> >>> futures<-
> >> >>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
> >> >>> structure(numeric(0), class = "Date"),
> >> >>>     QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> >> >>> c("DESCRIPTION",
> >> >>> "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> >> >>> class = "data.frame")
> >> >>>
> >> >>> It is not the usual case, but it can happen. With this df, when I run
> >> >>> the line mentioned above, I get an error:
> >> >>>
> >> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >> >>>     POSITION=sum(QUANTITY))[,c(1,3,2)]
> >> >>> Error in tapply(1:nrow(data), splitv, list) :
> >> >>>   arguments must have same length
> >> >>>
> >> >>>
> >> >>> How can I avoid this when my df is empty?
> >> >>
> >> >> Ask the author of the (missing) function ddply() to correct the error
> >> >> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> >> >>
> >> >> It's helpful to give example code, but much more helpful if you test
> >> >> it: yours cannot work without the function ddply() -- this is what
> >> >> 'self-contained' means in the footer here.
> >> >>
> >> >>
> >> >>>
> >> >>> Any help is appreciated
> >> >>>
> >> >>> ______________________________________________
> >> >>> R-help at r-project.org mailing list
> >> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> >>> and provide commented, minimal, self-contained, reproducible code.
> >> >>
> >> >>
> >> >> --
> >> >> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> >> >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >> >> University of Oxford,             Tel:  +44 1865 272861 (self)
> >> >> 1 South Parks Road,                     +44 1865 272866 (PA)
> >> >> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >> >
> >
> >
> 
> 
> 
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/

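For anyone reading this thread later, here is a minimal base-R sketch of why 1:nrow(data) misbehaves on a zero-row data frame while seq_len(nrow(data)) does not, as Prof Ripley points out above. The cut-down `futures` data frame and the tapply() call are only illustrative assumptions; this is not the actual plyr source.

```r
# Zero-row data frame with a subset of the columns used in the thread
futures <- data.frame(
  DESCRIPTION = character(0),
  QUANTITY = numeric(0),
  SETTLEMENT = character(0),
  stringsAsFactors = FALSE
)

n <- nrow(futures)   # 0 rows

bad  <- 1:n          # 1:0 counts DOWN, giving c(1L, 0L): two elements
good <- seq_len(n)   # integer(0): no elements, as intended

# An index of length 2 no longer matches a grouping vector of length 0,
# so tapply() signals an error; we capture the condition instead of stopping.
res <- tryCatch(
  tapply(bad, list(futures$DESCRIPTION), list),
  error = function(e) e
)

print(length(bad))
print(length(good))
print(inherits(res, "error"))
```

The same applies to loops: for (i in 1:n) runs twice when n is 0, whereas for (i in seq_len(n)) does not run at all.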

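The guard suggested by Peter Ehlers above can also be sketched without plyr, so that it runs on its own. This is only an illustration under stated assumptions: the function name `position_summary` is hypothetical, base aggregate() stands in for ddply(), and the empty result is assumed to need the same three columns (DESCRIPTION, POSITION, SETTLEMENT) as the normal result.

```r
# Hypothetical helper: sum QUANTITY per DESCRIPTION/SETTLEMENT pair,
# returning an empty but correctly shaped data frame for zero-row input.
position_summary <- function(x) {
  if (nrow(x) == 0L) {
    # Zero-row case: skip the aggregation and return the expected columns
    return(data.frame(
      DESCRIPTION = character(0),
      POSITION = numeric(0),
      SETTLEMENT = character(0),
      stringsAsFactors = FALSE
    ))
  }
  # Base-R stand-in for the ddply(..., summarise, ...) call in the thread
  out <- aggregate(QUANTITY ~ DESCRIPTION + SETTLEMENT, data = x, FUN = sum)
  names(out)[names(out) == "QUANTITY"] <- "POSITION"
  out[, c("DESCRIPTION", "POSITION", "SETTLEMENT")]
}

# Small made-up sample in the spirit of the data posted in the thread
futures <- data.frame(
  DESCRIPTION = c("CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10"),
  QUANTITY = c(1, 1, 3),
  SETTLEMENT = c("373.2500", "373.2500", "90.7750"),
  stringsAsFactors = FALSE
)

full  <- position_summary(futures)        # two summary rows
empty <- position_summary(futures[0, ])   # zero rows, same columns
```

Returning an empty data frame with the final column names, rather than a fake one-row data frame, means downstream code that refers to POSITION should not need a special case.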
