[R] Indexing by logical vectors

Christian Raschke crasch2 at tigers.lsu.edu
Tue Jul 20 07:12:05 CEST 2010


On Tue, 2010-07-20 at 10:12 +1000, Bill.Venables at csiro.au wrote:
> As far as I know the answer to your question is "No", but there are things you can do to improve the readability of your code.  One thing I find useful is to avoid using "$" as much as possible and to favour things like with() and within().
> 

Thank you for your answer. I had not looked at within() for this until
now.
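
For reference, a minimal sketch of how within() might shorten the
original example. The data values here are made up for illustration;
the column names are the ones from my first message. Inside within()
the columns can be referred to by name alone, so the data frame name
appears only once:

```r
# Hypothetical data, standing in for the rnorm(10) columns in the thread
some.data.frame <- data.frame(
  some.long.variable.name       = c(-0.6, 0.2, 1.1, -1.3),
  some.other.long.variable.name = c(0.4, -0.8, -0.1, 2.0)
)

# Columns are visible as plain variables inside within(); the modified
# copy is written back to the returned data frame.
some.data.frame <- within(some.data.frame, {
  some.other.long.variable.name[some.other.long.variable.name < 0] <- NA
})
```

The variable name is still typed twice, so this only removes the
repeated "some.data.frame$" prefix; it is not the ".self" I asked about.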

> The first thing you might do is think about choosing shorter names, of course.  If that's not possible, you could try something like this.
> 
> ensureNN <- function(x) {  # "ensure non-negative"
> 	is.na(x[x < 0]) <- TRUE
> 	x
> } 

This approach would essentially require a separate function for each
operation to be performed on the vector. I suppose that assigning NA
based on a condition is the most common use, but I also have cases
where the logical vector comes from other operations, or where the
assigned value differs from case to case; for example,

levels(something.long)[levels(something.long) %in% LETTERS[1:3]] <- "Z"

So given that your general answer above to my inquiry was "No", I will
keep experimenting and I'll also have another look at with() and
within(). 
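
One direction I might try is a single small helper that takes the test
and the replacement value as arguments, so that the long name is typed
only once per call. This is only a sketch: replaceWhere() is a name I
made up, not an existing R function, and the example data are
hypothetical.

```r
# Hypothetical helper (not part of base R): replace the elements of x
# for which test(x) is TRUE with value.
replaceWhere <- function(x, test, value) {
  hit <- test(x)
  x[!is.na(hit) & hit] <- value   # an NA test result counts as "no match"
  x
}

# Assigning NA to negative values
a <- c(-1.5, 0.3, NA, 2, -0.1)
a.clean <- replaceWhere(a, function(v) v < 0, NA)

# The levels() case from above: recode levels A, B and C to "Z"
f <- factor(c("A", "B", "C", "D", "A"))
levels(f) <- replaceWhere(levels(f), function(l) l %in% LETTERS[1:3], "Z")
```

The test is passed as a function so that it is evaluated on x itself,
which is about as close to a "self" reference as I can see so far.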

Thanks again!


> 
> some.data.frame <- within(some.data.frame, {
>   some.long.variable.name <- ensureNN(some.long.variable.name)
>   some.other.long.variable.name <- ensureNN(some.other.long.variable.name)
> })
> 
> Of course if you wanted to do this to all variables in a data frame you could do
> 
> some.data.frame <- data.frame(lapply(some.data.frame, ensureNN))
> 
> and it all happens, no questions asked.  (I can see a generic function emerging here, perhaps...)
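
A possible variation on the lapply() idea, for data frames that also
hold non-numeric columns, where "x < 0" is not a meaningful test. This
is a sketch under my own assumptions: the data frame df and its columns
are invented for illustration, and ensureNN() is repeated from above so
the snippet runs on its own.

```r
# ensureNN as defined earlier in the thread
ensureNN <- function(x) {  # "ensure non-negative"
  is.na(x[x < 0]) <- TRUE
  x
}

# Hypothetical data frame with one non-numeric column
df <- data.frame(x = c(-1, 2, -3), y = c(0.5, -0.5, 1), g = c("a", "b", "c"))

num <- vapply(df, is.numeric, logical(1))  # TRUE for numeric columns
df[num] <- lapply(df[num], ensureNN)       # update only those columns
```

Assigning into df[num] leaves the other columns and the column order
untouched, rather than rebuilding the whole data frame.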
> 
> W.
> 
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Christian Raschke
> Sent: Tuesday, 20 July 2010 9:16 AM
> To: r-help at r-project.org
> Subject: [R] Indexing by logical vectors
> 
> Dear R-Listers,
> 
> My question concerns indexing vectors by logical vectors that are based 
> on the original vector. Consider the following simple example to 
> hopefully make clear what I mean:
> 
> a <- rnorm(10)
> a[a<0] <- NA
> 
> However, I am now working with multiple data frames that I received, 
> where each of them has nicely descriptive, yet long names(). In my 
> scripts there are many instances where operations similar to the one 
> above are required. Again a simple example:
> 
> 
> some.data.frame <- data.frame(some.long.variable.name=rnorm(10), 
> some.other.long.variable.name=rnorm(10))
> 
> some.data.frame$some.other.long.variable.name[some.data.frame$some.other.long.variable.name 
> < 0] <- NA
> 
> 
> The fact that the names are so long makes things not very readable in 
> the script and hard to debug. Is there a way in R to refer to the "self" 
> of whatever is being indexed? I am looking for something like
> 
> some.data.frame$some.other.long.variable.name[.self < 0] <- NA
> 
> that would accomplish the same result as above. Or is there another 
> concise, but less messy way to do this? I prefer not attaching the 
> data.frames and partial matching makes things even more messy since many 
> names() are very similar. I know I could just rename everything, but I'd 
> like to learn if there is an easy or obvious way to do this in R that I 
> have missed so far.
> 
> I would appreciate any advice, and I apologize if this topic has been 
> discussed before.
> 
> 
>  > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-redhat-linux-gnu
> 
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
>


