[R] question about ... questions/ code

R. Michael Weylandt michael.weylandt at gmail.com
Mon Jan 30 06:05:55 CET 2012


It's standard form to cc the list so that replies and experience get pooled.

This sounds like a problem with your data set to be honest. Can you
post a link to it? If it's not publicly available, there's probably
nothing that we (having no access to it) can do to help interpret it.

If you know the structure of the Excel file, the col.names argument of
read.table will let you say what you want the names to be once they're
in R: that could be of help.
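
As a rough sketch of that idea, assuming the sheet has first been
exported to a delimited text file (read.table reads text, not .xls
files): the inline data and the names q9, q10 and b16a below are made
up for illustration and do not come from the Russia Barometer files.

```r
# Hypothetical stand-in for an exported sheet; with a real file this
# would be read.table("exported_sheet.csv", ...) instead of text =.
raw <- "1,2,Trusts
3,1,Neutral
2,2,Does not trust"

# col.names supplies the variable names as the data are read in
# (the names are assumptions, chosen to mirror code book numbering).
dat <- read.table(text = raw, sep = ",", header = FALSE,
                  col.names = c("q9", "q10", "b16a"),
                  stringsAsFactors = TRUE)

names(dat)        # the names given in col.names
levels(dat$b16a)  # answer options of the factor column
```

This only helps if the column order of the file is known, since
col.names assigns names purely by position.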

Michael

On Mon, Jan 30, 2012 at 12:02 AM, Nicole Marie Ford <nmford at uwm.edu> wrote:
> Michael,
>
> I am sorry if I was not clear.
>
> The code book is a list of questions (over 800) from the Russia Barometer survey from 2009.  I am sure it exists somewhere in hardcopy; however, I downloaded it from the UK archive, and it is in Excel.
>
> Other datasets I have used list their variables.  Further, they tend to be intuitive, meaning, for example, question 9 is listed as variable Q9 if I were to do names(dataset),  etc.  I recode them later for my own use.  But this layout makes it easier to find which named variable in the dataset goes to which question.  Clearly, that is not what is happening here.
>
> Does that make sense?
>
> ~Nicole
>
>
>
> ----- Original Message -----
> From: "R. Michael Weylandt" <michael.weylandt at gmail.com>
> To: "Nicole Marie Ford" <nmford at uwm.edu>
> Cc: "r-help" <r-help at r-project.org>
> Sent: Sunday, January 29, 2012 9:51:37 PM
> Subject: Re: [R] question about ... questions/ code
>
> Perhaps I'm misunderstanding you, but it doesn't sound like this is
> much of an R question at all: what is "the code book"? If it's an
> actual (dead tree) book, I don't think there's much you can do in R to
> automate identifications; if it's an online API, you *might* be able
> to rig a matching algorithm.
>
> Still, hopefully if you say a little more about your data source, it
> might be possible to help -- it doesn't sound like a pleasant
> situation at all...
>
> Michael
>
> Here's a (very untested) shot at matching known levels to columns of
> levels, but it's assuming there's going to be a perfect match, and
> that your book encodes them the same way as the data source etc.
>
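> # Levels of each column (NULL for non-factor columns)
> LEVS <- lapply(RB09, levels)
> thingToMatch <- c("A", "b", "c")
> # Columns whose levels all appear in thingToMatch; skip non-factors
> which(sapply(LEVS, function(x, y)
>     length(x) > 0L && all(match(x, y, nomatch = 0L) > 0L), thingToMatch))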
>
>
> On Sun, Jan 29, 2012 at 12:31 PM, Nicole Marie Ford <nmford at uwm.edu> wrote:
>> Hello,
>>
>> I have a dataset which I am calling RB09.
>>
>> I am trying to match the questions in the code book with variable codes.
>>
>> It is not very intuitive.
>>
>> example:
>>
>>  names(RB09)
>>  [1] "ea1"        "eaf1"       "eaf1a"      "eaf2"       "eaf2_7"
>>  [6] "eaf3"       "eafimpun"   "eafunpun"   "evimpmar"   "evfutpro"
>>  [11] "ecjoh"      "eaf4a"      "eaf5"       "eaf6a"      "eaf6b"
>>  [16] "eaf6c"      "eaf6d"      "eaf6e"      "eaf7a"      "eaf7b"
>>
>> (there are over 800 of these)
>>
>> questions looks like this:
>>
>> B16a. Most people in this country
>>        Trusts
>>        Neutral
>>        Does not trust
>> However, there is no variable B16a.  There is one that is "ssb16a", but as you can see:
>>
>>> levels(RB09$ss16a)
>> [1] "yes"       "no"        "dont know" "na"        "dk"
>>
>> The levels are not the same.  So I don't think this is correctly matched.
>>
>> Is there an easy way to find out, for example, which question "eaf6c"
>> goes to?
>>
>> Also, I know there is a way to search key words and find the variables which match.  I have done this before and can't find the code.
>>
>> Any direction would help.  Thanks.
>>
>> ~Nicole
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
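
For the keyword search asked about in the original message, one
possible approach is grep(). A minimal sketch on made-up data: `toy`
stands in for RB09, and `codebook` is a hypothetical two-column table
(question label, question text) that would have to be built from the
Excel code book. Neither comes from the actual survey files, and the
second question text is a placeholder.

```r
# Toy stand-in for RB09 (hypothetical values).
toy <- data.frame(
  eaf6c  = factor(c("yes", "no", "dont know")),
  ssb16a = factor(c("Trusts", "Neutral", "Does not trust")),
  ea1    = c(1, 2, 3)
)

# Variable names containing a pattern (here "16a", case-insensitive).
grep("16a", names(toy), ignore.case = TRUE, value = TRUE)

# Hypothetical code book as a data frame: one row per question.
codebook <- data.frame(
  label = c("B16a", "X1"),
  text  = c("Most people in this country", "Some other question"),
  stringsAsFactors = FALSE
)

# Questions whose text mentions a keyword.
codebook[grep("people", codebook$text, ignore.case = TRUE), ]

# Factor columns whose levels are exactly one question's answer options.
answers  <- c("Trusts", "Neutral", "Does not trust")
is_match <- sapply(toy, function(x) is.factor(x) && setequal(levels(x), answers))
names(toy)[is_match]
```

The last step requires an exact match of the level sets, so extra
codes in the data such as "na" or "dk" would need to be dropped from
the comparison first.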


