[R] Q re: logical indexing with is.na

Richard M. Heiberger rmh at temple.edu
Sun Mar 10 03:30:03 CET 2019


From ?Arithmetic:
the elements of shorter
     vectors are recycled as necessary (with a ‘warning’ when they are
     recycled only _fractionally_).

> tmp <- !is.na(y[1:3])
> tmp
[1]  TRUE  TRUE FALSE
> c(tmp, tmp)
[1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE
> c(tmp, tmp)[1:4]
[1]  TRUE  TRUE FALSE  TRUE
>  y[c(tmp, tmp)[1:4]]
[1]  0.3534253 -1.6731597 -0.2079209
>

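The behavior is as documented: the length-3 logical index is recycled
to the length of y, so the fourth element is selected by the recycled
first TRUE.  I am surprised that indexing gives no warning about
fractional recycling.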
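
A minimal sketch of the recycling at work, using the values of y from the quoted post below. The names idx, recycled, first3 and warned are only for illustration, and rep_len() is used here merely to spell out what the indexing appears to do internally:

```r
# Values of y copied from the quoted post; all other names are illustrative
y <- c(0.3534253, -1.6731597, NA, -0.2079209)

idx <- !is.na(y[1:3])                  # length 3: TRUE TRUE FALSE

# A logical index shorter than the vector is recycled to the vector's length,
# so position 4 reuses idx[1], which is TRUE
recycled <- rep_len(idx, length(y))    # TRUE TRUE FALSE TRUE
same <- identical(y[idx], y[recycled])

# Arithmetic warns about fractional recycling; logical indexing does not
warned <- tryCatch({ 1:4 + 1:3; FALSE }, warning = function(w) TRUE)

# To look only at the first three elements, subset first and then filter
first3 <- y[1:3]
kept <- first3[!is.na(first3)]

# Or convert the logical index to positions, which are not recycled
kept_which <- y[which(idx)]
```

Either of the last two forms returns only the first two elements, which is the result the original poster expected.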

On Sat, Mar 9, 2019 at 9:03 PM David Goldsmith
<eulergaussriemann using gmail.com> wrote:
>
> Hi!  Newbie (self-)learning R using P. Dalgaard's "Intro Stats w/ R"; not
> new to statistics (have had grad-level courses and work experience in
> statistics) or vectorized programming syntax (have extensive experience
> with MatLab, Python/NumPy, and IDL, and even a smidgen--a long time ago--of
> experience w/ S-plus).
>
> In exploring the use of is.na in the context of logical indexing, I've come
> across the following puzzling-to-me result:
>
> > y; !is.na(y[1:3]); y[!is.na(y[1:3])]
> [1]  0.3534253 -1.6731597         NA -0.2079209
> [1]  TRUE  TRUE FALSE
> [1]  0.3534253 -1.6731597 -0.2079209
>
> As you can see, y is a four element vector, the third element of which is
> NA; the next line gives what I would expect--T T F--because the first two
> elements are not NA but the third element is.  The third line is what
> confuses me: why is the result not the two element vector consisting of
> simply the first two elements of the vector (or, if vectorized indexing in
> R is implemented to return a vector the same length as the logical index
> vector, which appears to be the case, at least the first two elements and
> then either NA or NaN in the third slot, where the logical indexing vector
> is FALSE): why does the implementation "go looking" for an element whose
> index in the "original" vector, 4, is larger than BOTH the largest index
> specified in the inner-most subsetting index AND the size of the resulting
> indexing vector?  (Note: at first I didn't even understand why the result
> wasn't simply
>
> 0.3534253 -1.6731597         NA
>
> but then I realized that the third logical index being FALSE, there was no
> reason for *any* element to be there; but if there is, due to some
> overriding rule regarding the length of the result relative to the length
> of the indexer, shouldn't it revert back to *something* that indicates the
> "FALSE"ness of that indexing element?)
>
> Thanks!
>
> DLG
>
> > sessionInfo()
> R version 3.5.2 (2018-12-20)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS High Sierra 10.13.6
>
> Matrix products: default
> BLAS:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
> LAPACK:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] ISwR_2.0-7
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.2 tools_3.5.2


