[R] Value Labels: SPSS Dataset to R

John Kane jrkrideau using gmail.com
Sat Feb 8 16:38:23 CET 2020


Best of luck.

On Sat, 8 Feb 2020 at 10:36, Yawo Kokuvi <yawo1964 using gmail.com> wrote:

> Thanks again - I realized after posting that sjlabelled is indirectly
> referencing haven's read_sav function.  For a moment I thought you were
> referring to the read.spss under the older foreign package.  But then
> realized that read_sav and read_spss are equivalent. So that's clear now.
>
> And I also realized there are so many ways to do the same thing in R - so
> as part of learning, I am discovering these different ways, and knowing
> when to use one over the other.
>
> Thanks for the references - I will read further on them.
>
> cheers, cY
>
> On Sat, Feb 8, 2020 at 10:28 AM John Kane <jrkrideau using gmail.com> wrote:
>
>> "use a different function (read_spss) as John has suggested to import the
>> file. "
>>
>> No! As far as I can see, sjlabelled is simply using haven's function
>> read_sav() to read in the data. It is just wrapped in the read_spss()
>> function. There should be no difference between read_sav("sdata.sav") and
>> read_spss("sdata.sav").
>>
>> It just seems to keep the code simpler (more aesthetically pleasing?) if
>> you do not load more packages than needed. Likewise you do not need to load
>> "labelled" as sjlabelled is taking care of this for you.
>>
>> Oh, BTW  Scratch$sex %>% attr('labels') can be replaced by something like
>> get_labels(dat1) in my example. There usually are a multitude of ways to do
>> the same thing in R.
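
A minimal base-R sketch of why these calls are interchangeable, assuming a labelled vector built by hand (the vector `sex` and its values are made up for illustration, and the "haven_labelled" class is left off so that no package is needed). The value labels are just a named vector stored in the "labels" attribute, which, as far as I understand, is what accessor functions such as get_labels() read:

```r
# Hypothetical labelled vector, built by hand to mimic what read_sav() returns.
sex <- structure(c(1, 2, 2, 1),
                 label  = "Respondent sex",
                 labels = c(MALE = 1, FEMALE = 2))

# The value labels are an ordinary named numeric vector kept in an attribute.
value_labels <- attr(sex, "labels")
print(value_labels)

# The label text is simply the names of that vector.
label_text <- names(value_labels)
print(label_text)
```
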
>>
>> You might want to have a look at
>> https://cran.r-project.org/web/packages/labelled/vignettes/intro_labelled.html
>> and https://strengejacke.github.io/sjlabelled/articles/labelleddata.html
>> for more about working with labels.
>>
>> On Sat, 8 Feb 2020 at 09:35, Yawo Kokuvi <yawo1964 using gmail.com> wrote:
>>
>>> Thanks so much for all your assistance.  I admit R's learning curve is a
>>> bit steep, but I am eager to learn ... and hopefully teach with it.
>>>
>>> With regard to my problem, I can now see two options: either declare
>>> each categorical variable as a factor, specifying the needed levels and
>>> labels,
>>>
>>> OR
>>>
>>> use a different function (read_spss) as John has suggested to import the
>>> file.
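
The first option could look roughly like the sketch below. It uses only base R; the codes and the 1 = MALE, 2 = FEMALE mapping are assumptions taken from the example output earlier in the thread, not from the real file:

```r
# Hypothetical codes as they might arrive from SPSS (1 = MALE, 2 = FEMALE).
sex_codes <- c(1, 1, 2, 2, 2, 1)

# Declare the variable as a factor, spelling out levels and labels by hand.
sex <- factor(sex_codes, levels = c(1, 2), labels = c("MALE", "FEMALE"))

# Frequencies are now reported under the labels instead of the codes.
sex_freq <- table(sex)
print(sex_freq)
```

Once a column is a factor, table(), barplot() and most modelling functions pick up the labels without any further setting.
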
>>>
>>> I will experiment with both.
>>>
>>> With much appreciation, cY
>>>
>>> On Sat, Feb 8, 2020 at 9:25 AM John Kane <jrkrideau using gmail.com> wrote:
>>>
>>>> Hi Yawo Kokuvi;
>>>> As an R newbie transitioning from SPSS to R, expect culture shock and
>>>> the possible feeling that your brain is twisting within your skull, but
>>>> it is well worth it.
>>>>
>>>> Try something like this:
>>>> ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> dat1 <- structure(list(
>>>>     Animal = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>>>>         label = "Animal", labels = c(Cat = 0, Dog = 1),
>>>>         class = "haven_labelled"),
>>>>     Training = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>>>>         label = "Type of Training",
>>>>         labels = c(`Food as Reward` = 0, `Affection as Reward` = 1),
>>>>         class = "haven_labelled"),
>>>>     Dance = structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
>>>>         label = "Did they dance?", labels = c(No = 0, Yes = 1),
>>>>         class = "haven_labelled")),
>>>>     row.names = c(NA, -10L),
>>>>     class = c("tbl_df", "tbl", "data.frame"))
>>>>
>>>>
>>>> library(sjlabelled)
>>>> str(dat1)
>>>> get_labels(dat1)
>>>> barplot(table(as_label(dat1$Dance)))
>>>> ##==================================================================
>>>> Your problem seems to be omitting the as_label().
>>>>
>>>> You do not need to load "haven";
>>>> read_spss() in sjlabelled should do the trick.
>>>>
>>>>
>>>> On Sat, 8 Feb 2020 at 05:44, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Try
>>>>>
>>>>> aux_fun <- function(x){
>>>>>    levels <- attr(x, "labels")
>>>>>    factor(x, labels = names(levels), levels = levels)
>>>>> }
>>>>>
>>>>> newCatsDogs <- as.data.frame(lapply(CatsDogs, aux_fun))
>>>>>
>>>>> str(newCatsDogs)
>>>>> #'data.frame':  10 obs. of  3 variables:
>>>>> # $ Animal  : Factor w/ 2 levels "Cat","Dog": 1 1 1 1 1 1 1 1 1 1
>>>>> # $ Training: Factor w/ 2 levels "Food as Reward",..: 1 1 1 1 1 1 1 1 1 1
>>>>> # $ Dance   : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2
>>>>>
>>>>>
>>>>> As for the
>>>>>   - frequencies: ?table, ?tapply, ?aggregate,
>>>>>   - barplots: ?barplot
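
A short sketch of the frequency and barplot step, under the assumption that the codes have first been turned into a factor in the same way as aux_fun above. The `Dance` values here are made up, and `to_factor` is a hypothetical helper name; only base R is used:

```r
# Small stand-in for one CatsDogs column: numeric codes plus a "labels"
# attribute (values are made up; the "haven_labelled" class is omitted).
Dance <- structure(c(1, 0, 1, 1, 0),
                   labels = c(No = 0, Yes = 1))

# Same idea as aux_fun: turn codes into a factor using the stored labels.
to_factor <- function(x) {
  lv <- attr(x, "labels")
  factor(as.vector(x), levels = lv, labels = names(lv))
}

dance_f <- to_factor(Dance)

# Frequencies and proportions, reported by label.
freq <- table(dance_f)
print(freq)
print(prop.table(freq))

# Bar chart with the labels on the axis.
barplot(freq, main = "Did they dance?")
```
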
>>>>>
>>>>> You can find lots and lots of examples online of both, covering what
>>>>> seem to be simple use cases.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>> At 06:03 on 08/02/20, Yawo Kokuvi wrote:
>>>>> > Thanks for all. Here is output from dput.  I used a different dataset
>>>>> > containing categorical variables since the previous one is on a
>>>>> > different computer.
>>>>> >
>>>>> > In the following dataset, my interest is in getting frequencies and
>>>>> > barplots for the two variables: Training and Dance, with value labels
>>>>> > displayed.
>>>>> >
>>>>> > thanks again - cY
>>>>> >
>>>>> >
>>>>> > =========
>>>>> > dput(head(CatsDogs, n = 10))
>>>>> > structure(
>>>>> >    list(
>>>>> >      Animal = structure(
>>>>> >        c(0, 0, 0, 0, 0, 0, 0, 0, 0,
>>>>> >          0),
>>>>> >        label = "Animal",
>>>>> >        labels = c(Cat = 0, Dog = 1),
>>>>> >        class = "haven_labelled"
>>>>> >      ),
>>>>> >      Training = structure(
>>>>> >        c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>>>>> >        label = "Type of Training",
>>>>> >        labels = c(`Food as Reward` = 0,
>>>>> >                   `Affection as Reward` = 1),
>>>>> >        class = "haven_labelled"
>>>>> >      ),
>>>>> >      Dance = structure(
>>>>> >        c(1,
>>>>> >          1, 1, 1, 1, 1, 1, 1, 1, 1),
>>>>> >        label = "Did they dance?",
>>>>> >        labels = c(No = 0,
>>>>> >                   Yes = 1),
>>>>> >        class = "haven_labelled"
>>>>> >      )
>>>>> >    ),
>>>>> >    row.names = c(NA,-10L),
>>>>> >    class = c("tbl_df", "tbl", "data.frame")
>>>>> > )
>>>>> >
>>>>> >
>>>>> > On Fri, Feb 7, 2020 at 10:14 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>>>> >
>>>>> >> Yes. Most attachments are stripped by the server.
>>>>> >>
>>>>> >> Bert Gunter
>>>>> >>
>>>>> >> "The trouble with having an open mind is that people keep coming
>>>>> >> along and sticking things into it."
>>>>> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>>>> >>
>>>>> >>
>>>>> >> On Fri, Feb 7, 2020 at 5:34 PM John Kane <jrkrideau using gmail.com> wrote:
>>>>> >>
>>>>> >>> Hi,
>>>>> >>> Could you upload some sample data in dput form?  Something like
>>>>> >>> dput(head(Scratch, n=13)) will give us some real data to examine.
>>>>> >>> Just copy and paste the output of dput(head(Scratch, n=13)) into the
>>>>> >>> email. This is the best way to ensure that R-help denizens are
>>>>> >>> getting the data in the exact format that you have.
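
To illustrate why dput output is so useful, here is a small base-R sketch. The data frame `d` is a made-up stand-in, not the poster's data. It shows that the text produced by dput() rebuilds the object, attributes and all, when pasted back into R:

```r
# Hypothetical small data frame standing in for the poster's data.
d <- data.frame(id = 1:3, sex = c(1, 2, 2))
attr(d$sex, "labels") <- c(MALE = 1, FEMALE = 2)

# dput() writes R code that rebuilds the object, attributes included.
txt <- capture.output(dput(d))
cat(txt, sep = "\n")

# Anyone can paste that text back into R; evaluating it recreates the data.
d2 <- eval(parse(text = txt))
print(attr(d2$sex, "labels"))
```
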
>>>>> >>>
>>>>> >>> On Fri, 7 Feb 2020 at 15:32, Yawo Kokuvi <yawo1964 using gmail.com> wrote:
>>>>> >>>
>>>>> >>>> Thanks for all your assistance
>>>>> >>>>
>>>>> >>>> Attached please is the Rdata scratch I have been using
>>>>> >>>>
>>>>> >>>> -----------------------------------------------------
>>>>> >>>>
>>>>> >>>>> head(Scratch, n=13)
>>>>> >>>> # A tibble: 13 x 6
>>>>> >>>>       ID marital           sex        race         paeduc    speduc
>>>>> >>>>    <dbl> <dbl+lbl>         <dbl+lbl>  <dbl+lbl> <dbl+lbl> <dbl+lbl>
>>>>> >>>>  1     1 3 [DIVORCED]      1 [MALE]   1 [WHITE]        NA        NA
>>>>> >>>>  2     2 1 [MARRIED]       1 [MALE]   1 [WHITE]        NA        NA
>>>>> >>>>  3     3 3 [DIVORCED]      1 [MALE]   1 [WHITE]         4        NA
>>>>> >>>>  4     4 4 [SEPARATED]     1 [MALE]   1 [WHITE]        16        NA
>>>>> >>>>  5     5 3 [DIVORCED]      1 [MALE]   1 [WHITE]        18        NA
>>>>> >>>>  6     6 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        14        20
>>>>> >>>>  7     7 1 [MARRIED]       2 [FEMALE] 2 [BLACK]        NA        12
>>>>> >>>>  8     8 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        NA        12
>>>>> >>>>  9     9 3 [DIVORCED]      2 [FEMALE] 1 [WHITE]        11        NA
>>>>> >>>> 10    10 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        16        12
>>>>> >>>> 11    11 5 [NEVER MARRIED] 2 [FEMALE] 2 [BLACK]        NA        NA
>>>>> >>>> 12    12 3 [DIVORCED]      2 [FEMALE] 2 [BLACK]        NA        NA
>>>>> >>>> 13    13 3 [DIVORCED]      2 [FEMALE] 2 [BLACK]        16        NA
>>>>> >>>>
>>>>> >>>> -----------------------------------------------------
>>>>> >>>>
>>>>> >>>> and below is my script/command file.
>>>>> >>>>
>>>>> >>>> #1: Load library and import SPSS dataset
>>>>> >>>> library(haven)
>>>>> >>>> Scratch <- read_sav("~/Desktop/Scratch.sav")
>>>>> >>>>
>>>>> >>>> #2: save the dataset with a name
>>>>> >>>> save(Scratch, file="Scratch.Rdata")
>>>>> >>>>
>>>>> >>>> #3: install & load necessary packages for descriptive statistics
>>>>> >>>> install.packages ("freqdist")
>>>>> >>>> library (freqdist)
>>>>> >>>>
>>>>> >>>> install.packages ("sjlabelled")
>>>>> >>>> library (sjlabelled)
>>>>> >>>>
>>>>> >>>> install.packages ("labelled")
>>>>> >>>> library (labelled)
>>>>> >>>>
>>>>> >>>> install.packages ("surveytoolbox")
>>>>> >>>> library (surveytoolbox)
>>>>> >>>>
>>>>> >>>> #4: Check the value labels of gender and marital status
>>>>> >>>> Scratch$sex %>% attr('labels')
>>>>> >>>> Scratch$marital %>% attr('labels')
>>>>> >>>>
>>>>> >>>> #5: Frequency Distribution and BarChart for Categorical/Ordinal
>>>>> >>>> #   Level Variables such as Gender - SEX
>>>>> >>>> freqdist(Scratch$sex)
>>>>> >>>> barplot(table(Scratch$marital))
>>>>> >>>>
>>>>> >>>> -----------------------------------------------------
>>>>> >>>>
>>>>> >>>> As you can see from above, I use the <haven> package to import the
>>>>> >>>> data from SPSS.  Apparently, the haven function keeps the value
>>>>> >>>> labels, as the attribute calls in section #4 of my script show.
>>>>> >>>> The problem is that when I run a frequency distribution for any of
>>>>> >>>> the categorical variables like sex or marital status, only the
>>>>> >>>> numbers (1, 2) are displayed in the output.  The labels (male,
>>>>> >>>> female), for example, are not.
>>>>> >>>>
>>>>> >>>> Is there any way to force these to be shown in the output?  Is there
>>>>> >>>> a global property that I have to set so that these value labels are
>>>>> >>>> reliably displayed with every output?  I read I can declare them as
>>>>> >>>> factors using <as_factor()>, but once I do so, how do I invoke them
>>>>> >>>> in my commands so that the value labels show...
>>>>> >>>>
>>>>> >>>> Sorry about all the noob questions, but hopefully I am able to get
>>>>> >>>> this working.
>>>>> >>>>
>>>>> >>>> Thanks in advance.
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> Thanks - cY
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> On Fri, Feb 7, 2020 at 1:14 PM <cpolwart using chemo.org.uk> wrote:
>>>>> >>>>
>>>>> >>>>> I've never used it, but there is a labels function in haven...
>>>>> >>>>>
>>>>> >>>>> On 7 Feb 2020 17:05, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>> What does your data look like after importing? -- see ?head and ?str
>>>>> >>>>> to tell us. Show us the code that failed to provide "labels." See the
>>>>> >>>>> posting guide below for how to post questions that are likely to
>>>>> >>>>> elicit helpful responses.
>>>>> >>>>>
>>>>> >>>>> I know nothing about the haven package, but see ?factor or go through
>>>>> >>>>> an R tutorial or two to learn about factors, which may be part of the
>>>>> >>>>> issue here. R *generally* obtains whatever "label" info it needs from
>>>>> >>>>> the object being tabled -- see ?tabulate, ?table etc. -- if that's
>>>>> >>>>> what you're doing.
>>>>> >>>>>
>>>>> >>>>> Bert Gunter
>>>>> >>>>>
>>>>> >>>>> "The trouble with having an open mind is that people keep coming
>>>>> >>>>> along and sticking things into it."
>>>>> >>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> On Fri, Feb 7, 2020 at 8:28 AM Yawo Kokuvi <yawo1964 using gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>>> Hello,
>>>>> >>>>>>
>>>>> >>>>>> I am just transitioning from SPSS to R.
>>>>> >>>>>>
>>>>> >>>>>> I used the haven library to import some of my SPSS data files to R.
>>>>> >>>>>>
>>>>> >>>>>> However, when I run procedures such as frequencies or crosstabs,
>>>>> >>>>>> value labels for categorical variables such as gender (1=male,
>>>>> >>>>>> 2=female) are not shown. The same applies to many other outputs.
>>>>> >>>>>>
>>>>> >>>>>> I am confused.
>>>>> >>>>>>
>>>>> >>>>>> 1. Is there a global setting that I can use to force all
>>>>> >>>>>> categorical variables to display labels?
>>>>> >>>>>>
>>>>> >>>>>> 2. Or, are these labels to be set for each function or package?
>>>>> >>>>>>
>>>>> >>>>>> 3. How can I request the value labels for each function I run?
>>>>> >>>>>>
>>>>> >>>>>> Thanks in advance for your help..
>>>>> >>>>>>
>>>>> >>>>>> Best, Yawo
>>>>> >>>>>>
>>>>> >>>>>>          [[alternative HTML version deleted]]
>>>>> >>>>>>
>>>>> >>>>>> ______________________________________________
>>>>> >>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more,
>>>>> see
>>>>> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> >>>>>> PLEASE do read the posting guide
>>>>> >>>>>> http://www.R-project.org/posting-guide.html
>>>>> >>>>>> and provide commented, minimal, self-contained, reproducible
>>>>> code.
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>>
>>>>> >>> --
>>>>> >>> John Kane
>>>>> >>> Kingston ON Canada
>>>>> >>>
>>>>> >>>
>>>>> >>
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> John Kane
>>>> Kingston ON Canada
>>>>
>>>
>>
>> --
>> John Kane
>> Kingston ON Canada
>>
>

-- 
John Kane
Kingston ON Canada
