[R] if else for cumulative sum error

Jefferson Ferreira-Ferreira jecogeo at gmail.com
Wed Dec 3 20:04:52 CET 2014


Nice, David!!

Worked like a charm!!
Thank you very much.



On Tue Dec 02 2014 at 19:22:48, David L Carlson <dcarlson at tamu.edu>
wrote:

> Let's try a different approach. You don't need a loop for this. First we
> need a reproducible example:
>
> > set.seed(42)
> > dadosmax <- data.frame(above=runif(150) + .5)
>
> Now compute your sums using cumsum() and diff() and then compute enchday
> using ifelse(). See the manual pages for each of these functions to
> understand how they work:
>
> > sums <- diff(c(0, cumsum(dadosmax$above)), 45)
> > dadosmax$enchday <- c(ifelse(sums >= 45, 1, 0), rep(NA, 44))
>
> > dadosmax$enchday
>   [1]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
>  [26]  1  1  1  1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
>  [51]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
>  [76]  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
> [101]  1  1  1  1  1  1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> [126] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>
> See the NA's? Those are what David Winsemius is talking about. For the
> 106th value, 106+44 is 150, but for the 107th value 107+44 is 151, which
> does not exist. Fortunately diff() understands that and stops at 106, so
> we have to append 44 NA's to bring the result up to the 150 rows of the
> data frame.
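As a small illustration of the cumsum()/diff() idea described above, here is
a sketch on a toy vector. The names x and k are invented for this example and
do not come from the original post; k plays the role of the 45-row window.

```r
# Toy data: six values, window of three (hypothetical stand-ins
# for dadosmax$above and the 45-row window).
x <- 1:6
k <- 3

# Prepend 0 so that a lagged difference of the running total
# gives the sum of each block of k consecutive values.
cs <- c(0, cumsum(x))
sums <- diff(cs, lag = k)

# Only length(x) - k + 1 complete windows exist, so pad with NA
# to get one entry per original element.
padded <- c(sums, rep(NA, k - 1))
print(padded)
```

The last k - 1 positions have no complete window, which is why they are
filled with NA rather than a partial sum.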
>
> You might find this plot informative as well:
>
> > plot(sums, type="l")
> > abline(h=45)
>
> Another way to get there is to use sapply() which will add the NA's for us:
>
> > sums <- sapply(1:150, function(x) sum(dadosmax$above[x:(x+44)]))
> > dadosmax$enchday <- ifelse(sums >= 45, 1, 0)
>
> But it won't be as fast if you have a large data set.
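One way to convince yourself that the two approaches give the same sums is
to compute both and compare them. This is only a sketch using the example
data from above; the names above, k, fast, slow and agree are introduced
here for illustration.

```r
set.seed(42)
above <- runif(150) + 0.5   # same toy data as in the example
k <- 45                     # window length

# Vectorised version: lagged difference of the running total,
# padded with NA for the incomplete windows at the end.
fast <- c(diff(c(0, cumsum(above)), lag = k), rep(NA, k - 1))

# sapply() version: indexing past the end of the vector returns NA,
# so sum() returns NA for the incomplete windows.
slow <- sapply(seq_along(above), function(i) sum(above[i:(i + k - 1)]))

# all.equal() allows for tiny floating-point differences.
agree <- isTRUE(all.equal(fast, slow))
print(agree)
```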
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of David
> Winsemius
> Sent: Tuesday, December 2, 2014 2:50 PM
> To: Jefferson Ferreira-Ferreira
> Cc: r-help at r-project.org
> Subject: Re: [R] if else for cumulative sum error
>
>
> On Dec 2, 2014, at 12:26 PM, Jefferson Ferreira-Ferreira wrote:
>
> > Thank you for the replies.
> >
> > David,
> >
> > I tried your modified form
> >
> > for (i in 1:seq_along(rownames(dadosmax))){
>
>
> No. It is either 1: .... or seq_along(...). In this case perhaps
> 1:(nrow(dadosmax)-44) would be safer.
>
> You do not seem to have understood that you cannot use an index of i+44
> when i is going to run over the entire set of rows of the dataframe. There
> is "no there there", to quote Gertrude Stein's slur against Oakland. In
> fact there is no there there at i+1 when you get to the end. You either
> need to only go to row
>
> >  dadosmax$enchday[i] <- if ( (sum(dadosmax$above[i:(i+44)])) >= 45) 1 else 0
> > }
> >
> > However, I'm receiving this warning:
> > Warning message:
> > In 1:seq_along(rownames(dadosmax)) :
> >  numerical expression has 2720 elements: only the first used
> >
> > I can't figure out why only the first row was calculated...
>
> You should of course read these, but the error is not from your
> if-statement but rather from your for-loop indexing.
>
> ?'if'
> ?ifelse
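A minimal sketch of the loop with the indexing repaired, assuming the toy
data frame from David Carlson's example rather than the real 2720-row data.
The names window and last_start are introduced here for illustration.

```r
set.seed(42)
dadosmax <- data.frame(above = runif(150) + 0.5)  # toy data, not the real set

window <- 45
# Last row that still has a full window of rows starting at it.
last_start <- nrow(dadosmax) - window + 1

# Pre-fill with NA so rows without a full window stay missing.
dadosmax$enchday <- NA_real_

# seq_len() yields 1, 2, ..., last_start; do not combine 1: with seq_along().
for (i in seq_len(last_start)) {
  total <- sum(dadosmax$above[i:(i + window - 1)])
  dadosmax$enchday[i] <- if (total >= 45) 1 else 0
}
```

With 150 rows this fills rows 1 to 106 and leaves the final 44 as NA.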
>
>
> > Any ideas?
> >
> >
> >
> > On Tue Dec 02 2014 at 15:22:25, John McKown <
> john.archie.mckown at gmail.com>
> > wrote:
> >
> >> On Tue, Dec 2, 2014 at 12:08 PM, Jefferson Ferreira-Ferreira <
> >> jecogeo at gmail.com> wrote:
> >>
> >>> Hello everybody;
> >>>
> >>> I'm writing a code where part of it is as follows:
> >>>
> >>> for (i in nrow(dadosmax)){
> >>>  dadosmax$enchday[i] <- if (sum(dadosmax$above[i:(i+44)]) >= 45) 1 else 0
> >>> }
> >>>
> >>
> >> Without some test data for any validation, I would try the following
> >> formula
> >>
> >> dadosmax$enchday[i] <- if
> >> (sum(dadosmax$above[i:min(i+44, nrow(dadosmax))]) >= 45) 1 else 0
> >>
> >>
> >>
> >>>
> >>> That is, for each row of my data frame, sum a specific column (0 or 1)
> >>> over that row plus the next 44 rows. If it is >= 45 then enchday is 1,
> >>> else 0.
> >>>
> >>> The following error is returned:
> >>>
> >>> Error in if (sum(dadosmax$above[i:(i + 44)]) >= 45) 1 else 0 :
> >>>  missing value where TRUE/FALSE needed
> >>>
> >>> I've tested the if-else statement by assigning different values to i
> >>> and it works. So I'm wondering if this error is due to the fact that at
> >>> the end of my data frame there aren't 45 rows left to sum. I tried to
> >>> use "try" but it simply hides the error.
> >>>
> >>> How can I deal with this? Any ideas?
> >>> Thank you very much.
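A small sketch of where the "missing value where TRUE/FALSE needed" error
comes from, on a made-up five-element vector. Note that for (i in n) visits
only the single value n, so the window starts at the last row and runs past
the end. All names here (above, n, visited, window_vals, total, cond, msg)
are invented for the illustration.

```r
above <- c(1, 1, 1, 1, 1)   # toy stand-in for dadosmax$above
n <- length(above)

# for (i in n) loops over the single value n, not over 1..n.
visited <- integer(0)
for (i in n) visited <- c(visited, i)

# Indexing past the end of a vector returns NA.
window_vals <- above[n:(n + 2)]

# sum() of anything containing NA is NA, and so is the comparison.
total <- sum(window_vals)
cond <- total >= 3

# if () cannot decide on NA, hence the error; capture its message.
msg <- tryCatch(if (cond) 1 else 0,
                error = function(e) conditionMessage(e))
print(msg)
```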
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >>
> >>
> >> --
> >> The temperature of the aqueous content of an unremittingly ogled
> >> culinary vessel will not achieve 100 degrees on the Celsius scale.
> >>
> >> Maranatha! <><
> >> John McKown
> >>
> >
>
> David Winsemius
> Alameda, CA, USA
>
>
