[R] The correct way to set an element in a list to NULL? (FAQ is not clear)

Fri Dec 11 19:53:53 CET 2009

On Dec 11, 2009, at 1:20 PM, Peng Yu wrote:

> On Fri, Dec 11, 2009 at 11:51 AM, Steve Lianoglou
> <mailinglist.honeypot at gmail.com> wrote:
>> 
>> On Dec 11, 2009, at 12:36 PM, Peng Yu wrote:
>> [snip]
>> 
>>>>> What seems confusing to me is:
>>>>> if even 'x[i]<-list(NULL)' and 'x[[i]]<-list(NULL)' are different, why
>>>>> are x[i]<-NULL and x[[i]]<-NULL the same?
>>>>> 
>>>>> Shouldn't the meaning of 'x[[i]]<-NULL' be defined as setting the i'th
>>>>> element to NULL, rather than deleting the i'th element?
>>> 
>>> Do you have any comments on the above question?
>> 
>> Sure.
>> 
>> I think it has something to do with how memory is managed and allocated in R.  You might try to read up on it a bit ...
> 
> Which question do you refer to by the first 'it'?
> 
> I have been asking for a good reference on memory management in R. So far,
> no one has given me any useful information. Do you have a good
> reference?

I don't ... my remark was meant to be a joke (which is why I wrote "in all seriousness" after).

>> In all seriousness tho:
>> 
>> No, I don't really have any comment on that question.
>> 
>> The semantics of "x[i]<-list(NULL)" vs "x[[i]]<-list(NULL)" seem quite reasonable to me ... I'm not sure what that has to do with anything.
>> 
>> I also can't comment on why x[[i]] <- NULL deletes the element (instead of setting it to NULL, like you want it to) .. it's just the way it is.
> 
> The design choice of x[[i]] <- NULL deleting the element instead of
> setting it to NULL might increase the complexity of the code. Suppose
> that I set the i'th element of x by calling some_function(). If it
> never returns NULL, the following code is perfectly fine.
> 
> x[[i]] <- some_function()
> 
> However, when some_function() does return NULL, the i'th element will
> be deleted.
> 
> In this case, when I don't know whether some_function() can return
> NULL, I have to use the following code for the sake of safety. As you
> can see, one line of code has been expanded to 6 lines.
> 
> result=some_function()
> if(NULL==result) {
>  x[i] <- list(NULL)
> } else {
>  x[[i]] <- result
> }

Btw: don't check for NULL with ==; use is.null(result) instead. NULL == result evaluates to logical(0), so the if() fails with an error.
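
A minimal sketch of why the == test is a problem (the variable names here are only stand-ins for illustration):

```r
result <- NULL

# Comparing against NULL with == gives a zero-length logical vector,
# not a single TRUE or FALSE
cmp <- NULL == result

# if() needs a length-one condition, so the == test raises an error;
# tryCatch() turns that error into the string "error"
outcome <- tryCatch(
  if (NULL == result) "null" else "not null",
  error = function(e) "error"
)

# is.null() always returns exactly one TRUE or FALSE
checked <- is.null(result)
```

Here cmp has length 0 and outcome is "error", while checked is a plain TRUE that if() can use.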

Unfortunately, this is one of those corner cases you have to deal with. It seems there isn't much you can do about it.
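
For reference, the corner case under discussion can be reproduced with a small made-up list (the names a, b, c are only for illustration):

```r
x <- list(a = 1, b = 2, c = 3)

# Assigning NULL with [[<- removes the element, so the list shrinks
y <- x
y[["b"]] <- NULL

# Assigning list(NULL) with [<- keeps the slot and stores NULL in it
z <- x
z["b"] <- list(NULL)
```

After this, y has two elements (a and c), while z still has three and z[["b"]] is NULL.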

If you're using an *apply function, you don't have to worry about these details, e.g.:

> f <- function(i) if (i %% 2 == 0) NULL else 1
> lapply(1:4, f)
[[1]]
[1] 1

[[2]]
NULL

[[3]]
[1] 1

[[4]]
NULL
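
If an explicit loop can't be avoided, one possible sketch is to always assign through single brackets with the value wrapped in list(), which should give the same result as lapply (via_lapply and via_loop are just illustrative names):

```r
f <- function(i) if (i %% 2 == 0) NULL else 1

via_lapply <- lapply(1:4, f)

# Pre-allocate a list of four NULL slots, then fill each slot with [<-
# and list(), which stores the value even when it is NULL
via_loop <- vector("list", 4)
for (i in 1:4) {
  via_loop[i] <- list(f(i))
}

same <- identical(via_lapply, via_loop)
```

Both lists keep all four slots, with NULL in positions 2 and 4.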

Or, if you have some function that's particularly badly behaved, maybe you can wrap it and have it return something better behaved:

sfWrapper <- function(..., .bad=NA) {
  result <- some_function(...)
  if (is.null(result)) .bad else result
}

then:
x[[i]] <- sfWrapper(your, args, to, some_function, here)

You can go one better: generalize that function, stick it in one of your utility files, and have it work for everything:

badWrapper <- function(func.name, ..., .bad=NA) {
  result <- do.call(func.name, list(...))
  if (is.null(result)) .bad else result
}

and

x[[i]] <- badWrapper('some_function', your, args)

Assigning NA (or some other element of your choosing) to an element in your result won't nuke it from the list. You can then choose to ignore the .bad things in your downstream analysis.
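
To make that concrete, here is a rough end-to-end sketch. The some_function below is a made-up stand-in that returns NULL for even inputs, and the is_bad filter is just one possible way to skip the placeholders:

```r
# Hypothetical stand-in for a function that sometimes returns NULL
some_function <- function(i) if (i %% 2 == 0) NULL else i * 10

badWrapper <- function(func.name, ..., .bad = NA) {
  result <- do.call(func.name, list(...))
  if (is.null(result)) .bad else result
}

x <- vector("list", 4)
for (i in seq_along(x)) {
  # NA is stored in place of NULL, so no element is ever deleted
  x[[i]] <- badWrapper("some_function", i)
}

# Downstream: flag the placeholder entries and keep only the real results
is_bad <- vapply(x, function(el) length(el) == 1 && is.na(el), logical(1))
good <- x[!is_bad]
```

x keeps all four slots (10, NA, 30, NA), and good holds only the two real results.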

> For this reason, x[[i]]<-NULL should be defined as setting it to NULL.

Somehow I don't think this will happen.

-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact