[R] Re : Sort out number on value

David Winsemius dwinsemius at comcast.net
Fri Apr 20 21:25:25 CEST 2012


On Apr 20, 2012, at 3:00 PM, Jeff Newmiller wrote:

> I think
>
> x[x>7.5]
>

 > y <-c(1,1)
 > y[y>2]
numeric(0)
 > y[which(y>2)]
numeric(0)

> gives more unsurprising results when none of the data meets the  
> criteria than
>
> x[which(x>7.5)]

I don't see a difference.

Look at:

 > x <-c(NA, 1)
 > x[which(x >2)]
numeric(0)
 > x[x>0]
[1] NA  1
 > x[which(x >0)]
[1] 1

 > length( x[x>0])
[1] 2
 > length( x[which(x>0)])
[1] 1


I hope reasonable people can disagree on this one.

Using 'which' gives less surprising results when the logical test is
applied to a large dataset in which the NAs outnumber the targets by a
large margin, because logical indexing returns one NA for every NA in
the input. There are differences of opinion as to which surprise is
more undesirable. There are also gotchas regarding the use of "-" in
front of 'which': when nothing matches, which() returns integer(0),
and x[-integer(0)] is empty rather than all of x. I seem to remember
an isTRUE function being the _correct_ way of doing this, but isTRUE
is not vectorized, so that memory appears to be false.
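
To make the points above concrete, here is a minimal sketch. The vectors
are made up for illustration, and the vapply() wrapper around isTRUE is
only one possible way to apply it elementwise, not an established idiom.

```r
# Assumed data: many NAs, few values that meet the criterion.
x <- c(rep(NA, 5), 1, 8, 9)

by_logical <- x[x > 7.5]         # one NA comes back for every NA in x
by_which   <- x[which(x > 7.5)]  # NAs are dropped

length(by_logical)  # 7
length(by_which)    # 2

# The "-" gotcha: when nothing matches, which() returns integer(0),
# and indexing with -integer(0) selects nothing rather than everything.
y <- c(1, 1)
dropped_wrong <- y[-which(y > 2)]  # numeric(0)
dropped_right <- y[!(y > 2)]       # 1 1

# isTRUE() is not vectorized, but it can be applied to each element.
isTRUE(c(TRUE, TRUE))              # FALSE
keep <- vapply(x > 7.5, isTRUE, logical(1))
by_isTRUE <- x[keep]               # 8 9
```

Note that y[!(y > 2)] would again return NA entries if y contained NAs,
so it trades one surprise for the other.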

-- 
David.

>
> does.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> David Winsemius <dwinsemius at comcast.net> wrote:
>
>>
>> On Apr 20, 2012, at 9:49 AM, Yellow wrote:
>>
>>> I have now filtered the NA and Inf values out of my data,
>>> and the number is exactly the same as the output from the Excel
>>> file.
>>>
>>> Thanks everyone. :)
>>> Now I can finish my work.
>>
>> In the future it might be safer to use subset() or perhaps
>> x[which(x>7.5)]. Either would omit the NA or NaN values (although
>> neither would remove a positive Inf; I didn't realize that Excel had
>> a concept of Inf).
>>
>> -- 
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
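
As a sketch of the subset() and which() suggestion quoted above, and of
what happens to infinite values: the vector below is made up for
illustration, and combining the test with is.finite() is just one
possible way to exclude Inf, not something proposed earlier in the
thread.

```r
# Assumed data containing NA, NaN, both infinities and ordinary values.
x <- c(NA, NaN, Inf, -Inf, 3, 8)

by_which  <- x[which(x > 7.5)]   # Inf 8: NA and NaN dropped, Inf kept
by_subset <- subset(x, x > 7.5)  # Inf 8: same result

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so this keeps only
# ordinary numbers that pass the test.
finite_only <- x[is.finite(x) & x > 7.5]  # 8
```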

David Winsemius, MD
West Hartford, CT


