[R] Unexpected behavior when giving a value to a new variable based on the value of another variable

Angel Rodriguez angel.rodriguez at matiainstituto.net
Tue Sep 2 10:10:25 CEST 2014

```
Thank you for the explanation, Peter.

Angel

-----Original Message-----
From: peter dalgaard [mailto:pdalgd at gmail.com]
To: Angel Rodriguez
CC: r-help
Subject: Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

On 01 Sep 2014, at 13:08 , Angel Rodriguez <angel.rodriguez at matiainstituto.net> wrote:

> Thank you John, Jim, Jeff and both Davids for your answers.
>
> After trying different combinations of values for the variable samplem, it looks like if age is greater than 65, R applies the correct code 1 whatever the value of samplem, but if age is less than 65, it just copies the values of samplem to sample. I do not understand why it does so.
>

It's because indexed assignment is really (white lie alert: it's actually worse)

N$sample <- `[<-`(`$`(N, `sample`), index, value)

and since N$sample isn't there from the outset, partial matching kicks in for the `$` bit and makes the right hand side equivalent to the same thing with `samplem`. The result still gets assigned to N$sample, but the value is the same that N$samplem would get from

N$samplem[N$age >= 65] <- 1

Notice the difference if you do

> N$sample <- NA
> N$sample[N$age >= 65] <- 1
> N
  age samplem sample
1  67      NA      1
2  62       1     NA
3  74       1      1
4  61       1     NA
5  60       1     NA
6  55       1     NA
7  60       1     NA
8  59       1     NA
9  58      NA     NA

-pd

> In any case, Jim's syntax works very well, although I do not understand why either.
>
> In answer to Jim, I just wanted a variable that could identify individuals with certain characteristics (not only age, as in this oversimplified example).
>
> Best regards,
>
> Angel Rodriguez-Laso
>
>
> -----Original Message-----
> From: John McKown [mailto:john.archie.mckown at gmail.com]
> Sent: Fri 29/08/2014 14:46
> To: Angel Rodriguez
> CC: r-help
> Subject: Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable
>
> On Fri, Aug 29, 2014 at 3:53 AM, Angel Rodriguez
> <angel.rodriguez at matiainstituto.net> wrote:
>>
>> Dear subscribers,
>>
>> I've found that if a dataframe already contains a variable whose name is very similar to that of a new variable, R does not assign the correct values to the new variable based on the values of a third variable:
>>
>>
> <snip>
>>
>> Any clue for this behavior?
>>
> <snip>
>>
>> Thank you very much.
>>
>> Angel Rodriguez-Laso
>> Research project manager
>> Matia Instituto Gerontologico
>
> That is unusual, but appears to be documented in a section from
>
> ?`[`
>
> <quote>
> Character indices
>
> Character indices can in some circumstances be partially matched (see
> pmatch) to the names or dimnames of the object being subsetted (but
> never for subassignment). Unlike S (Becker et al p. 358), R never
> uses partial matching when extracting by [, and partial matching is
> not by default used by [[ (see argument exact).
>
> Thus the default behaviour is to use partial matching only when
> extracting from recursive objects (except environments) by $. Even in
> that case, warnings can be switched on by
> options(warnPartialMatchDollar = TRUE).
>
> Neither empty ("") nor NA indices match any names, not even empty nor
> missing names. If any object has no names or appropriate dimnames,
> they are taken as all "" and so match nothing.
> </quote>
>
> Note the comment about "partial matching" in the middle paragraph in
> the quote above.
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha! <><
> John McKown
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

```
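The behaviour discussed in this thread can be reproduced with a short sketch. The data frame is reconstructed from the output Peter printed; the object names `idx`, `bad`, `good` and `alt` are introduced here for illustration only, and the `ifelse()` variant is just one possible one-step alternative, not necessarily the syntax Jim suggested (that message was snipped from the thread).

```r
# Data reconstructed from the output shown in the thread.
N <- data.frame(
  age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
  samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)
)
idx <- N$age >= 65            # TRUE for rows 1 and 3

# `sample` does not exist yet, so `$` partially matches `samplem`.
partial <- N$sample           # returns the samplem column
exact   <- N[["sample"]]      # `[[` matches exactly by default: NULL

# Problem case: the right-hand side starts from a copy of samplem.
bad <- N
bad$sample[idx] <- 1

# Fix from the thread: create the column first, then index into it.
good <- N
good$sample <- NA
good$sample[idx] <- 1

# Illustrative one-step alternative that never reads N$sample.
alt <- N
alt$sample <- ifelse(idx, 1, NA)

print(bad)
print(good)
```

Setting `options(warnPartialMatchDollar = TRUE)`, as mentioned in the quoted help text, should make R warn whenever `$` falls back to a partial match, which helps catch this kind of surprise early.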