[R] Sum function and missing values --- need to mimic SAS sum function

Hervé Pagès hpages at fredhutch.org
Thu Jan 29 03:11:46 CET 2015


On 01/27/2015 02:54 AM, Bert Gunter wrote:
> Huh??
>
>> ifelse(TRUE, a <- 2L, a <- 3L)
> [1] 2
>> a
> [1] 2
>
> Please clarify.

In Bioconductor ifelse() is a generic function (with methods for Rle
objects) so all its arguments are evaluated before dispatch can
happen. You can reproduce with:

setGeneric("ifelse")

## A dummy method so the dispatch mechanism will need to evaluate the
## 'no' arg before dispatch can actually happen.
setMethod("ifelse", c(no="data.frame"),
   function(test, yes, no)
     stop("I'm kind of broken on data frames, don't use me like that"))

Then:

   > ifelse(TRUE, a <- 2L, a <- 3L)
   [1] 2
   > a
   [1] 3

Delayed evaluation is a world full of surprises...
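
To make the order of evaluation visible, here is a minimal sketch. It
assumes a hypothetical generic pick() standing in for ifelse(), and a
hypothetical helper note() that records which argument got evaluated;
neither name comes from base R or Bioconductor. It also contrasts the
generic with the plain if/else form recommended earlier in this thread
(sum_or_na() is just an illustrative name for that form).

```r
library(methods)

## Hypothetical stand-in for ifelse(): a plain function turned into an
## S4 generic, plus a dummy method on 'no' so that dispatch has to
## look at 'test', 'yes' and 'no' before choosing a method.
pick <- function(test, yes, no) if (test) yes else no
setGeneric("pick")
setMethod("pick", c(no = "data.frame"),
    function(test, yes, no) stop("not supported on data frames"))

## Hypothetical helper: records a tag when its promise is forced.
trace_log <- character()
note <- function(tag, value) {
    trace_log <<- c(trace_log, tag)
    value
}

## Through the generic: both 'yes' and 'no' are forced by dispatch,
## even though only the value of 'yes' is returned.
res_generic <- pick(TRUE, note("yes", 2L), note("no", 3L))
log_generic <- trace_log

## Plain if/else: only the branch that is needed gets evaluated.
trace_log <- character()
res_plain <- if (TRUE) note("yes", 2L) else note("no", 3L)
log_plain <- trace_log

## The if/else form suggested earlier in the thread for the original
## question (NA when everything is missing, otherwise the sum).
sum_or_na <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
```

After running this, log_generic should contain both "yes" and "no"
while log_plain should contain only "yes", which is the same effect
that leaves 'a' equal to 3 in the example above.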

H.

>
> -- Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
>
>
>
>
> On Mon, Jan 26, 2015 at 2:22 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>> Hi Martin,
>>
>> On 01/26/2015 04:45 AM, Martin Maechler wrote:
>>>>>>>>
>>>>>>>> Jim Lemon <drjimlemon at gmail.com>
>>>>>>>>       on Mon, 26 Jan 2015 11:21:03 +1100 writes:
>>>
>>>
>>>       > Hi Allen, How about this:
>>>
>>>       > sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))
>>>
>>> Excuse, Jim, but that's yet another  "horrible misuse of  ifelse()"
>>>
>>> John Fox's reply *did* contain  the "proper" solution
>>>
>>>        if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
>>>
>>> The ifelse() function should never be used in such cases.
>>> Read more after googling
>>>
>>>       "Do NOT use ifelse()"
>>>
>>>       -- include the quotes in your search --
>>>
>>> or directly at
>>>      http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html
>>
>>
>> Interesting. You could have added the following item to your list:
>>
>>    4. less likely to play strange tricks on you:
>>
>>       > ifelse(TRUE, a <- 2L, a <- 3L)
>>       [1] 2
>>       > a
>>       [1] 3
>>
>> Yeah I've seen people using ifelse() that way and being totally
>> confused...
>>
>> Cheers,
>> H.
>>
>>>
>>> Yes, this has been on R-help a month ago..
>>> Martin
>>>
>>>       > On Mon, Jan 26, 2015 at 10:21 AM, Allen Bingham
>>>       > <aebingham2 at gmail.com> wrote:
>>>       >> I understand that in order to get the sum function to
>>>       >> ignore missing values I need to supply the argument
>>>       >> na.rm=TRUE. However, when summing numeric values in which
>>>       >> ALL components are "NA" ... the result is 0.0 ... instead
>>>       >> of (what I would get from SAS) of NA (or in the case of
>>>       >> SAS ".").
>>>       >>
>>>       >> Accordingly, I've had to go to 'extreme' measures to get
>>>       >> the sum function to result in NA if all arguments are
>>>       >> missing (otherwise give me a sum of all non-NA elements).
>>>       >>
>>>       >> So for example here's a snippet of code that ALMOST does
>>>       >> what I want:
>>>       >>
>>>       >>
>>>       >> SumValue<-apply(subset(InputDataFrame,!is.na(Variable.1)|!is.na(Variable.2),
>>>       >> select=c(Variable.1,Variable.2)),1,sum,na.rm=TRUE)
>>>       >>
>>>       >> In reality this does NOT give me records with NA for
>>>       >> SumValue ... but it doesn't give me values for any
>>>       >> records in which both Variable.1 and Variable.2 are NA
>>>       >> --- which is "good enough" for my purposes.
>>>       >>
>>>       >> I'm guessing with a little more work I could come up with
>>>       >> a way to adapt the code above so that I could get it to
>>>       >> work like SAS's sum function ...
>>>       >>
>>>       >> ... but before I go that extra mile I thought I'd ask
>>>       >> others if they know of functions in either base R ... or
>>>       >> in a package that will better mimic the SAS sum function.
>>>       >>
>>>       >> Any suggestions?
>>>       >>
>>>       >> Thanks.  ______________________________________ Allen
>>>       >> Bingham aebingham2 at gmail.com
>>>       >>
>>>       >> ______________________________________________
>>>       >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
>>>       >> more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>>       >> PLEASE do read the posting guide
>>>       >> http://www.R-project.org/posting-guide.html and provide
>>>       >> commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fredhutch.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


