[R] unexpected subset select results?

David Winsemius dwinsemius at comcast.net
Tue Aug 24 15:13:37 CEST 2010


On Aug 24, 2010, at 7:36 AM, ivo welch wrote:

> thanks, everyone.  I did not even know about transform() and with(),
> but they look quite useful.
>
> actually, what I had intended to state was that Boolean variables in
> the select parts of subset statements, especially when mixed with other
> variables that are part of the data frame, lead to unexpected
> results.  (my d statement was intended to show how easy it is to
> forget that a variable is not part of the data frame, but just part of
> the global environment.)  unless this mixed treatment covers an
> important functional aspect, it might be better for it to cause a warning
> (or an error) than a silent recycling of variables.
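
For readers following along, a minimal sketch of the case being described,
using the data frame from the original post quoted further down. It relies
on the behaviour documented in ?subset: inside select, column names stand
for their column positions, and any name that is not a column is looked up
in the calling environment.

```r
d <- data.frame(a = 1:1000, b = 2001:3000, z = 5001:6000)
c <- with(d, (a + b) > 25)       # logical vector in the workspace, all TRUE

# Inside 'select', a and b stand for column positions 1 and 2. 'c' is not a
# column of d, so the workspace vector is found and coerced: TRUE becomes 1.
res <- subset(d, TRUE, select = c(a, b, c))

dim(res)                         # many more columns than the three in 'd'
head(names(res))                 # column 1 is selected again and again
```

No warning is issued, because every element of the combined index vector is
a valid column position.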

I think you will find intense resistance to the request for a warning
when arguments to functions like table() and subset() are given
vectors of the correct length that are not part of a data.frame or in
the data argument. I suppose the subset situation is arguably
different from the table situation, but it is rather common practice
to construct utility index or flag vectors that are never incorporated
into the main data.frame that is being analyzed.
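
A small sketch of that practice, assuming the data frame from the original
post; the flag vector `keep` is a made-up name for this illustration and
lives only in the workspace, never in d.

```r
d <- data.frame(a = 1:1000, b = 2001:3000, z = 5001:6000)
keep <- d$a %% 2 == 0            # hypothetical flag: TRUE for even values of a

# subset() does not find 'keep' in d, so it looks in the calling environment
# and uses the vector as a row filter.
evens <- subset(d, keep, select = c(a, b))

# table() is equally happy with a free-standing vector of the right length.
counts <- table(keep)
```

A warning on every such call would fire in a great deal of ordinary,
correct code.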
>
> (Related: The recycling rules are generally convenient, but can also
> be rather problematic in catching errors.  It would be nice to be able
> to turn them off.)

You do get warnings and it is possible to raise the level of action  
taken by the system to that of an error.

?options

... and take note of the various options beginning with "warn" ...

... or take note of the error option and write your own.
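
A minimal sketch of the first suggestion, using options(warn = 2) to
promote warnings to errors. Note that arithmetic recycling only warns when
the longer length is not a multiple of the shorter one; recycling that
divides evenly stays silent whatever the setting.

```r
old <- options(warn = 2)         # promote every warning to an error

# 1:6 + 1:4 recycles the shorter vector; 6 is not a multiple of 4, so R
# warns, and with warn = 2 that warning is raised as an error instead.
result <- tryCatch(1:6 + 1:4, error = function(e) "stopped with an error")

options(old)                     # restore the previous setting
```

Here result holds the string returned by the error handler, showing that
the computation was stopped rather than allowed to continue with a warning.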


>
> regards,
>
> /iaw
>
>
> On Tue, Aug 24, 2010 at 3:55 AM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
>> On Mon, 2010-08-23 at 17:51 -0400, ivo welch wrote:
>>> quiz: what does this produce?
>>
>> Henrique has provided an answer to the question, but...
>>
>>>    d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
>>>    attach(d); c <- (a+b)>25; detach(d)
>>
>> ...this is ugly and will potentially catch you out one day if you
>> forget to detach. These three calls can be achieved using a single
>> with():
>>
>> c <- with(d, (a + b) > 25)
>>
>> And the version you wanted:
>>
>> attach(d); d$c <- (a+b)>25; detach(d)
>>
>> can be done using within():
>>
>> d <- within(d, c <- (a + b) > 25)
>>
>> and with the latter, the intention is pretty clear.
>>
>> HTH
>>
>> G
>>
>>>    d= subset(d, TRUE, select=c( a, b, c ))
>>>
>>> yes, I know I have made a mistake, in that the code does not do
>>> what I presumably would have wanted.  it does seem like unexpected
>>> behavior, though, without an error.  there probably is some reason
>>> why this does not ring an alarm bell...
>>>
>>> /iaw
>>> ----
>>> Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>
>>
>

David Winsemius, MD
West Hartford, CT


