[R] Q re: logical indexing with is.na

Izmirlian, Grant (NIH/NCI) [E] izmirlig using mail.nih.gov
Mon Mar 11 18:11:48 CET 2019


Logical indexing expects the logical index to have the same length as the
vector being indexed. If the index is shorter, R does not raise an error;
it recycles (wraps) the index until it is long enough. Here !is.na(y[1:3])
has length 3 while y has length 4, so the third line of your output is
y[c(TRUE, TRUE, FALSE, TRUE)], where the last TRUE is the first component
of !is.na(y[1:3]) reused for the fourth position.
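
A small sketch of the recycling, in case it helps. The values of y are
copied from your output; the names idx, recycled and first3 are only
illustrative, and rep_len() is used here just to make the wrapped index
visible, not because R calls it internally.

```r
# y as printed in the original post
y <- c(0.3534253, -1.6731597, NA, -0.2079209)

# The logical index built from the first three elements has length 3
idx <- !is.na(y[1:3])                  # TRUE TRUE FALSE

# Spelling out the recycling by hand: the index is wrapped to length(y)
recycled <- rep_len(idx, length(y))    # TRUE TRUE FALSE TRUE

# Both forms select elements 1, 2 and 4 of y
same <- identical(y[idx], y[recycled])
print(same)

# To drop NAs among the first three elements only, subset first,
# then index the shorter vector with an index of matching length
first3 <- y[1:3]
print(first3[!is.na(first3)])
```

If the goal is the two-element result, indexing the already subsetted
vector (as in the last two lines) keeps the index and the vector the same
length, so nothing is recycled.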


Grant Izmirlian, Ph.D.
Mathematical Statistician
izmirlig using mail.nih.gov

Delivery Address:
9609 Medical Center Dr, RM 5E130
Rockville MD 20850

Postal Address:
BG 9609 RM 5E130 MSC 9789
9609 Medical Center Dr
Bethesda, MD 20892-9789

 ofc:  240-276-7025
 cell: 240-888-7367
  fax: 240-276-7845


________________________________
From: David Goldsmith <eulergaussriemann using gmail.com>
Sent: Saturday, March 9, 2019 8:36 PM
To: r-help using r-project.org
Subject: [R] Q re: logical indexing with is.na

Hi!  Newbie (self-)learning R using P. Dalgaard's "Intro Stats w/ R"; not
new to statistics (have had grad-level courses and work experience in
statistics) or vectorized programming syntax (have extensive experience
with MATLAB, Python/NumPy, and IDL, and even a smidgen, a long time ago, of
experience w/ S-PLUS).

In exploring the use of is.na in the context of logical indexing, I've come
across the following puzzling-to-me result:

> y; !is.na(y[1:3]); y[!is.na(y[1:3])]
[1]  0.3534253 -1.6731597         NA -0.2079209
[1]  TRUE  TRUE FALSE
[1]  0.3534253 -1.6731597 -0.2079209

As you can see, y is a four element vector, the third element of which is
NA; the next line gives what I would expect--T T F--because the first two
elements are not NA but the third element is.  The third line is what
confuses me: why is the result not the two element vector consisting of
simply the first two elements of the vector (or, if vectorized indexing in
R is implemented to return a vector the same length as the logical index
vector, which appears to be the case, at least the first two elements and
then either NA or NaN in the third slot, where the logical indexing vector
is FALSE): why does the implementation "go looking" for an element whose
index in the "original" vector, 4, is larger than BOTH the largest index
specified in the inner-most subsetting index AND the size of the resulting
indexing vector?  (Note: at first I didn't even understand why the result
wasn't simply

0.3534253 -1.6731597         NA

but then I realized that the third logical index being FALSE, there was no
reason for *any* element to be there; but if there is, due to some
overriding rule regarding the length of the result relative to the length
of the indexer, shouldn't it revert back to *something* that indicates the
"FALSE"ness of that indexing element?)

Thanks!

DLG

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:
/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK:
/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ISwR_2.0-7

loaded via a namespace (and not attached):
[1] compiler_3.5.2 tools_3.5.2
