[R] [FORGED] Q re: logical indexing with is.na

Jeff Newmiller jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Sun Mar 10 06:07:17 CET 2019


Regarding the mention of logical indexing, under ?Extract I see:

For [-indexing only: i, j, ... can be logical vectors, indicating elements/slices to select. Such vectors are recycled if necessary to match the corresponding extent. i, j, ... can also be negative integers, indicating elements/slices to leave out of the selection.
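
A minimal sketch of what that passage describes, assuming a toy vector (the names x, recycled, and dropped are illustrative only; ok is borrowed from Rolf's example quoted below). It shows a short logical index being recycled to the length of the vector, and negative integers leaving elements out:

```r
x <- 1:7
ok <- c(TRUE, TRUE, FALSE)

# A logical index shorter than x is recycled to length(x) before selecting.
recycled <- rep_len(ok, length(x))  # TRUE TRUE FALSE TRUE TRUE FALSE TRUE
a <- x[ok]
b <- x[recycled]
print(a)                # 1 2 4 5 7
print(identical(a, b))  # TRUE

# Negative integers name the positions to leave out.
dropped <- x[-c(3, 6)]
print(dropped)          # 1 2 4 5 7
```

Here rep_len() only spells out the recycling that "[" appears to do internally for logical indices; it is not needed in ordinary code.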

On March 9, 2019 6:57:05 PM PST, Rolf Turner <r.turner using auckland.ac.nz> wrote:
>On 3/10/19 2:36 PM, David Goldsmith wrote:
>> Hi!  Newbie (self-)learning R using P. Dalgaard's "Intro Stats w/ R";
>> not new to statistics (have had grad-level courses and work experience
>> in statistics) or vectorized programming syntax (have extensive
>> experience with MatLab, Python/NumPy, and IDL, and even a smidgen--a
>> long time ago--of experience w/ S-plus).
>> 
>> In exploring the use of is.na in the context of logical indexing, I've
>> come across the following puzzling-to-me result:
>> 
>>> y; !is.na(y[1:3]); y[!is.na(y[1:3])]
>> [1]  0.3534253 -1.6731597         NA -0.2079209
>> [1]  TRUE  TRUE FALSE
>> [1]  0.3534253 -1.6731597 -0.2079209
>> 
>> As you can see, y is a four element vector, the third element of which
>> is NA; the next line gives what I would expect--T T F--because the
>> first two elements are not NA but the third element is.  The third line
>> is what confuses me: why is the result not the two element vector
>> consisting of simply the first two elements of the vector (or, if
>> vectorized indexing in R is implemented to return a vector the same
>> length as the logical index vector, which appears to be the case, at
>> least the first two elements and then either NA or NaN in the third
>> slot, where the logical indexing vector is FALSE): why does the
>> implementation "go looking" for an element whose index in the
>> "original" vector, 4, is larger than BOTH the largest index specified
>> in the inner-most subsetting index AND the size of the resulting
>> indexing vector?  (Note: at first I didn't even understand why the
>> result wasn't simply
>> 
>> 0.3534253 -1.6731597         NA
>> 
>> but then I realized that the third logical index being FALSE, there was
>> no reason for *any* element to be there; but if there is, due to some
>> overriding rule regarding the length of the result relative to the
>> length of the indexer, shouldn't it revert back to *something* that
>> indicates the "FALSE"ness of that indexing element?)
>> 
>> Thanks!
>
>It happens because R is eco-conscious and re-cycles. :-)
>
>Try:
>
>ok <- c(TRUE,TRUE,FALSE)
>(1:4)[ok]
>
>In general in R if there is an operation involving two vectors then
>the shorter one gets recycled to provide sufficiently many entries to 
>match those of the longer vector.
>
>Thus in the foregoing example the first entry of "ok" gets used again,
>to make a length 4 vector to match up with 1:4.  The result is the same
>as (1:4)[c(TRUE,TRUE,FALSE,TRUE)].
>
>If you did (1:7)[ok] you'd get the same result as that from
>(1:7)[c(TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE)] i.e. "ok" gets
>recycled 2 and 1/3 times.
>
>Try 10*(1:3) + 1:4, 10*(1:3) + 1:5, 10*(1:3) + 1:6 .
>
>Note that in the first two instances you get warnings, but in the third
>you don't, since 6 is an integer multiple of 3.
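
A rough sketch of that check, assuming a small helper (warns is a made-up name, not a base R function) that reports whether evaluating an expression signals a warning:

```r
# Hypothetical helper: TRUE if evaluating `expr` signals a warning.
# `expr` is a lazy argument, so it is only evaluated inside tryCatch().
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

w4 <- warns(10 * (1:3) + 1:4)  # 4 is not a multiple of 3
w5 <- warns(10 * (1:3) + 1:5)  # 5 is not a multiple of 3
w6 <- warns(10 * (1:3) + 1:6)  # 6 is a multiple of 3
print(c(w4, w5, w6))           # TRUE TRUE FALSE

# With lengths 3 and 6 the shorter operand is reused exactly twice.
s6 <- 10 * (1:3) + 1:6
print(s6)                      # 11 22 33 14 25 36
```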
>
>Why aren't there warnings when logical indexing is used?  I guess 
>because it would be annoying.  Maybe.
>
>Note that integer indices, by contrast, are not recycled: they simply
>select the positions they name.  So
>
>(1:4)[1:3] just (sensibly) gives
>
>[1] 1 2 3
>
>and *not*
>
>[1] 1 2 3 1
>
>Perhaps a bit subtle, but it gives what you'd actually *want* rather 
>than being pedantic about rules with a result that you wouldn't want.
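
For comparison, a small sketch of how positive integer indices behave, as far as I understand it: each index names one position, so the result has as many elements as the index. The variable names are illustrative only.

```r
x <- 1:4

# Each integer index picks out exactly one position.
sel <- x[1:3]
print(sel)        # 1 2 3

# To get the first element again it has to be asked for explicitly.
rep_sel <- x[c(1:3, 1)]
print(rep_sel)    # 1 2 3 1

# A position beyond the end of the vector yields NA rather than wrapping.
beyond <- x[c(2, 6)]
print(beyond)     # 2 NA
```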
>
>cheers,
>
>Rolf Turner
>
>P.S.  If you do
>
>y[1:3][!is.na(y[1:3])]
>
>i.e. if you're careful to match the length of the vector and that of
>the indices, you get what you initially expected.
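
A short sketch contrasting the two forms, assuming y holds the four values printed in the original post (head3 and the other result names are made up for illustration):

```r
# Values copied from the output shown in the original question.
y <- c(0.3534253, -1.6731597, NA, -0.2079209)

# Length-3 logical index applied to the length-4 vector: the index is
# recycled to TRUE TRUE FALSE TRUE, so the fourth element is kept.
mismatched <- y[!is.na(y[1:3])]
print(mismatched)  # 0.3534253 -1.6731597 -0.2079209

# Index and vector both of length 3: nothing is recycled.
matched <- y[1:3][!is.na(y[1:3])]
print(matched)     # 0.3534253 -1.6731597

# Same result, taking the subset once and reusing it.
head3 <- y[1:3]
alt <- head3[!is.na(head3)]
print(identical(matched, alt))  # TRUE
```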
>
>R. T.
>
>P^2.S.  To the younger and wiser heads on this list:  the help on "["
>does not mention that the index vectors can be logical.  I couldn't find
>anything about logical indexing in the R help files.  Is something
>missing here, or am I just not looking in the right place?
>
>R. T.

-- 
Sent from my phone. Please excuse my brevity.


