[R] Unexpected behavior when giving a value to a new variable based on the value of another variable

David Winsemius dwinsemius at comcast.net
Sun Aug 31 06:28:58 CEST 2014


On Aug 30, 2014, at 7:38 PM, David Winsemius wrote:

>
> On Aug 29, 2014, at 8:54 PM, David McPearson wrote:
>
>> On Fri, 29 Aug 2014 06:33:01 -0700 Jeff Newmiller
>> <jdnewmil at dcn.davis.ca.us> wrote
>>
>>> One clue is the help file for "$"...
>>>
>>> ?"$"
>>>
>>> In particular there see the discussion of character indices and the
>>> "exact" argument.
>>>
>>
>> <...snip...>
>>>
>>> On August 29, 2014 1:53:47 AM PDT, Angel Rodriguez
>>> <angel.rodriguez at matiainstituto.net> wrote:
>>>> Dear subscribers,
>>>>
>>>> I've found that if there is a variable in the dataframe with a name
>> <...snip...>
>>>>> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
>>>> +                     V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
>>>> +                .Names = c("age", "samplem"), row.names = c(NA, -9L),
>>>> +                class = "data.frame")
>>>>> N$sample[N$age >= 65] <- 1
>>>>> N
>>>>   age samplem sample
>>>> 1  67      NA      1
>>>> 2  62       1      1
>>>> 3  74       1      1
>>>> 4  61       1      1
>>>> 5  60       1      1
>>>> 6  55       1      1
>>>> 7  60       1      1
>>>> 8  59       1      1
>>>> 9  58      NA     NA
>> <...snip...>
>>
>> Having seen all the responses about partial matching I almost understand.
>> I've also replicated the behaviour on R 2.11.1 so it's been around awhile.
>> This tells me it ain't a bug - so if any of the cognoscenti have the time
>> and inclination can someone give me a brief (and hopefully simple)
>> explanation of what is going on under the hood?
>>
>> It looks (to me) like N$sample[N$age >= 65] <- 1 copies N$samplem to
>> N$sample and then does the assignment. If partial matching is the problem
>> (which it clearly is) my expectation is that the output should look like
>>
>>  age samplem
>> 1   67       1
>> 2   62       1
>> 3   74       1
>> 4   61       1
>> 5   60       1
>> 6   55       1
>> 7   60       1
>> 8   59       1
>> 9   58      NA
>> That is - no new column.
>> (and I just hate it when the world doesn't live up to my expectations!)
>
> Not sure what you are seeing. I am seeing what you expected:
>
> > test <- data.frame(age=1:10, sample=1)
> > test$sample[test$age<5] <- 2
> > test
>   age sample
> 1    1      2
> 2    2      2
> 3    3      2
> 4    4      2
> 5    5      1
> 6    6      1
> 7    7      1
> 8    8      1
> 9    9      1
> 10  10      1


I realized later that I had not constructed a test of your behavior, and
that when I did, I saw the creation of a third column. The answer is to
read the help page:

?`[<-`

"Character indices can in some circumstances be partially matched (see  
pmatch) to the names or dimnames of the object being subsetted (but  
never for subassignment)."

Note the caveat in parentheses.
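
To spell out what that caveat means here, as far as I understand the
evaluation rules: R treats N$sample[i] <- 1 as an extraction followed by an
assignment. The extraction N$sample partially matches "samplem", but the
`$<-` that stores the result matches names exactly, so the modified copy
ends up under the new name "sample". The sketch below writes those two steps
out by hand. `tmp` is only an illustrative name of mine (R uses its own
internal temporary), and I rebuild the data frame with data.frame() rather
than structure() for brevity.

```r
# Data from the original post: a column "samplem", but no column "sample".
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# Step 1: extraction with `$` partially matches, so "sample" finds "samplem".
tmp <- N$sample
identical(tmp, N$samplem)

# Step 2: the subassignment modifies that extracted copy ...
tmp[N$age >= 65] <- 1

# Step 3: ... and `$<-` stores it under the exact name "sample",
# which creates a third column and leaves "samplem" untouched.
N$sample <- tmp

names(N)
N
```

If the partial match is unwanted, N[["sample"]] matches exactly by default
and gives NULL when there is no such column. In recent versions of R,
options(warnPartialMatchDollar = TRUE) should also make `$` warn whenever it
falls back to a partial match.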

-- 

David Winsemius, MD
Alameda, CA, USA


