[R] Unexpected behavior when giving a value to a new variable based on the value of another variable
angel.rodriguez at matiainstituto.net
Tue Sep 2 10:10:25 CEST 2014
Thank you for the explanation, Peter.
From: peter dalgaard [mailto:pdalgd at gmail.com]
Sent: Mon 01/09/2014 20:10
To: Angel Rodriguez
Subject: Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable
On 01 Sep 2014, at 13:08 , Angel Rodriguez <angel.rodriguez at matiainstituto.net> wrote:
> Thank you John, Jim, Jeff and both Davids for your answers.
> After trying different combinations of values for the variable samplem, it looks like if age is 65 or greater, R applies the correct code 1 whatever the value of samplem, but if age is less than 65, it just copies the values of samplem to sample. I do not understand why it does so.
It's because indexed assignment is really (white lie alert: it's actually worse)
N$sample <- `[<-`(`$`(N, `sample`), index, value)
and since N$sample isn't there from the outset, partial matching kicks in for the `$` bit and makes the right-hand side equivalent to the same thing with `samplem`. The result still gets assigned to N$sample, but its value is the same as the one N$samplem would get from
N$samplem[N$age >= 65] <- 1
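A minimal sketch of that expansion, for anyone who wants to try it. The data frame is reconstructed from the printed output further down in this message, and the helper names (index, partial, expanded, expected) are made up for illustration only.

```r
# Data reconstructed from the output printed later in this message
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# N has no column named "sample", so `$` falls back to a partial
# match and returns the samplem column instead
partial <- `$`(N, "sample")
identical(partial, N$samplem)

# What assigning into samplem itself would produce
index <- N$age >= 65
expected <- N$samplem
expected[index] <- 1

# The (simplified) right-hand side of the indexed assignment
expanded <- `[<-`(`$`(N, "sample"), index, 1)
identical(expanded, expected)
```

Both comparisons should come out TRUE: the value that ends up in N$sample is built from samplem, not from an empty column.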
Notice the difference if you do
  > N$sample <- NA
  > N$sample[N$age >= 65] <- 1
    age samplem sample
  1  67      NA      1
  2  62       1     NA
  3  74       1      1
  4  61       1     NA
  5  60       1     NA
  6  55       1     NA
  7  60       1     NA
  8  59       1     NA
  9  58      NA     NA
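The same fix as a self-contained sketch, together with two safeguards: `[[`, which matches names exactly by default, and the warnPartialMatchDollar option mentioned in the help text quoted below. The data frame is again reconstructed from the output above, and the variable names old and before are only illustrative.

```r
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# Optional: have R warn whenever `$` resolves a name by partial matching
old <- options(warnPartialMatchDollar = TRUE)

# `[[` uses exact matching by default, so the missing column is NULL
# here rather than a silent copy of samplem
before <- N[["sample"]]
is.null(before)

# Create the column first; `$` then finds an exact match for "sample"
N$sample <- NA
N$sample[N$age >= 65] <- 1
N

# Restore the previous option setting
options(old)
```

Creating the column up front is the least surprising route; the option is mainly useful for spotting other places where partial matching happens unnoticed.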
> In any case, Jim's syntax works very well, although I do not understand why either.
> In answer to Jim, I just wanted a variable that could identify individuals with some characteristics (not only age, as in this example, which has been oversimplified).
> Best regards,
> Angel Rodriguez-Laso
> -----Original Message-----
> From: John McKown [mailto:john.archie.mckown at gmail.com]
> Sent: Fri 29/08/2014 14:46
> To: Angel Rodriguez
> CC: r-help
> Subject: Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable
> On Fri, Aug 29, 2014 at 3:53 AM, Angel Rodriguez
> <angel.rodriguez at matiainstituto.net> wrote:
>> Dear subscribers,
>> I've found that if there is a variable in the dataframe with a name very similar to that of a new variable, R does not give the correct values to the new variable based on the values of a third variable:
>> Any clue for this behavior?
>> Thank you very much.
>> Angel Rodriguez-Laso
>> Research project manager
>> Matia Instituto Gerontologico
> That is unusual, but appears to be documented in a section from
> Character indices
> Character indices can in some circumstances be partially matched (see
> pmatch) to the names or dimnames of the object being subsetted (but
> never for subassignment). Unlike S (Becker et al p. 358), R never
> uses partial matching when extracting by [, and partial matching is
> not by default used by [[ (see argument exact).
> Thus the default behaviour is to use partial matching only when
> extracting from recursive objects (except environments) by $. Even in
> that case, warnings can be switched on by
> options(warnPartialMatchDollar = TRUE).
> Neither empty ("") nor NA indices match any names, not even empty nor
> missing names. If any object has no names or appropriate dimnames,
> they are taken as all "" and so match nothing.
> Note the comment about "partial matching" in the middle paragraph in
> the quote above.
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
> Maranatha! <><
> John McKown
> [[alternative HTML version deleted]]
> R-help at r-project.org mailing list
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com