[R] tapply error svyby function "survey" package

Anthony Damico ajdamico at gmail.com
Wed Nov 12 23:39:00 CET 2014


hi martin, the first 25 rows only help if they re-create the problem.  when
i run the data you have provided, i do not hit your error (see below).
someone else may be able to guess the issue, but this would be a lot easier
to solve if you could create a minimal reproducible example that actually
triggers the error

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
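
as a rough sketch of what i mean by "minimal": keep only the columns the
failing call touches and the rows that enter the design, then dput() the
result.  the data frame `full` below is a made-up stand-in for your 1063-row
sii.tesis, and `extra` is a hypothetical unused column, so treat this as an
outline rather than a recipe

```r
# hypothetical stand-in for the full data set; in practice start from your own sii.tesis
full <- data.frame(
  id      = 1:8,
  stratum = factor(c("MEst", "FEst", NA, "FEst", "FAdm", "MEst", "FAdm", "MAdm")),
  expfc   = c(22.8, 17.1, NA, 17.1, 5.2, 22.8, 5.2, 6.2),
  d7      = factor(c("Soltero", "Soltero", NA, "Casado", "Soltero", "Viudo", "Casado", "Soltero")),
  ovt     = c(NA, 93.4, NA, 79.4, 83.8, NA, 19.1, 85.3),
  extra   = letters[1:8]   # a column the failing call never uses
)

# keep only the columns that svydesign() and svyby() refer to
needed <- c("id", "stratum", "expfc", "d7", "ovt")

# keep only the rows that enter the design (non-missing stratum)
small <- full[!is.na(full$stratum), needed]

# factor levels are left untouched on purpose, since dropping them could hide the problem
# paste this output into the email, but only after confirming the error still occurs with `small`
dput(small)
```

the important step is the last comment: re-run the failing svyby() call on
the reduced data before posting, and add rows or columns back until the
error reappears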


sii.tesis <-
structure(list(id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L,
59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L,
73L, 74L, 75L, 76L), stratum = structure(c(1L, 4L, NA, 4L, 4L,
1L, 6L, NA, 4L, 4L, 1L, 1L, 1L, 6L, 6L, 3L, 3L, 6L, NA, 1L, 1L,
6L, 4L, 3L, 6L), .Label = c("MEst", "MAcad", "MAdm", "FEst",
"FAcad", "FAdm"), class = "factor"), expfc = c(22.8195266723633,
17.0644626617432, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
5.1702127456665, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
22.8195266723633, 22.8195266723633, 5.1702127456665, 5.1702127456665,
6.24137926101685, 6.24137926101685, 5.1702127456665, NA, 22.8195266723633,
22.8195266723633, 5.1702127456665, 17.0644626617432, 6.24137926101685,
5.1702127456665), d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L,
NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L,
6L, 6L), .Label = c("Soltero", "Casado", "Separado", "Divorciado",
"Viudo", "Union libre"), class = "factor"), ovt = c(NA, 93.3823547363281,
NA, NA, NA, NA, 83.8235321044922, NA, NA, NA, NA, NA, NA, NA,
79.4117660522461, NA, NA, 19.1176471710205, NA, NA, NA, 85.2941207885742,
NA, NA, NA)), .Names = c("id", "stratum", "expfc", "d7", "ovt"
), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9",
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
"21", "22", "23", "24", "25"), class = "data.frame")

library(survey)

sii.design <- svydesign(
  id = ~1,
  strata = ~stratum,
  weights = ~expfc,
  data = subset(sii.tesis, !is.na(stratum)))

svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)


# works fine---
> svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)
                     d7      ovt       se
Soltero         Soltero 88.94329 3.333485
Casado           Casado 19.11765 0.000000
Union libre Union libre 85.29412 0.000000
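
i can't tell from these 25 rows whether this is the cause, but one cheap
check on the full data is how many non-missing ovt values each level of d7
has.  the toy data frame below is invented for illustration (it is not your
data), and the idea that sparse or empty levels matter here is only a guess

```r
# toy data, made up for illustration; substitute the full sii.tesis
toy <- data.frame(
  d7  = factor(c("Soltero", "Soltero", "Casado", "Viudo", "Viudo"),
               levels = c("Soltero", "Casado", "Separado", "Viudo")),
  ovt = c(93.4, NA, 19.1, NA, NA)
)

# rows per level of d7, split by whether ovt is observed
counts <- table(toy$d7, observed = !is.na(toy$ovt))
print(counts)

# number of observed ovt values per level; levels with no rows at all come back as NA
n_obs <- tapply(!is.na(toy$ovt), toy$d7, sum)

# levels of d7 that have no observed ovt value
empty <- names(n_obs)[is.na(n_obs) | n_obs == 0]
print(empty)
```

if some level of d7 in the full data shows up in `empty`, that would be a
good candidate to include in the reproducible example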






On Wed, Nov 12, 2014 at 5:25 PM, Martin Canon <martin.canon at gmail.com>
wrote:

> Anthony, thanks for your reply.
>
> Resetting the levels didn't work.
>
> These are the first 25 rows of the dataset:
>
> structure(list(id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L,
> 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L,
> 73L, 74L, 75L, 76L), stratum = structure(c(1L, 4L, NA, 4L, 4L,
> 1L, 6L, NA, 4L, 4L, 1L, 1L, 1L, 6L, 6L, 3L, 3L, 6L, NA, 1L, 1L,
> 6L, 4L, 3L, 6L), .Label = c("MEst", "MAcad", "MAdm", "FEst",
> "FAcad", "FAdm"), class = "factor"), expfc = c(22.8195266723633,
> 17.0644626617432, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
> 5.1702127456665, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
> 22.8195266723633, 22.8195266723633, 5.1702127456665, 5.1702127456665,
> 6.24137926101685, 6.24137926101685, 5.1702127456665, NA, 22.8195266723633,
> 22.8195266723633, 5.1702127456665, 17.0644626617432, 6.24137926101685,
> 5.1702127456665), d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L,
> NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L,
> 6L, 6L), .Label = c("Soltero", "Casado", "Separado", "Divorciado",
> "Viudo", "Union libre"), class = "factor"), ovt = c(NA, 93.3823547363281,
> NA, NA, NA, NA, 83.8235321044922, NA, NA, NA, NA, NA, NA, NA,
> 79.4117660522461, NA, NA, 19.1176471710205, NA, NA, NA, 85.2941207885742,
> NA, NA, NA)), .Names = c("id", "stratum", "expfc", "d7", "ovt"
> ), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9",
> "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
> "21", "22", "23", "24", "25"), class = "data.frame")
>
> Regards.
>
> Martin
>
> On Wed, Nov 12, 2014 at 1:39 PM, Anthony Damico <ajdamico at gmail.com>
> wrote:
> > try resetting your levels?  if that doesn't work, please dput() an example
> > data set that we can test with :) thanks!
> >
> > sii.design <- update( sii.design , d7 = factor( d7 ) )
> >
> >
> >
> >
> >
> >
> > On Wed, Nov 12, 2014 at 7:59 AM, Martin Canon <martin.canon at gmail.com>
> > wrote:
> >>
> >> Hi.
> >>
> >>
> >> I'm trying to calculate the weighted mean score of a quality of life
> >> measure (ovt) in patients with irritable bowel syndrome by their
> >> marital status (d7).
> >>
> >> This is a summary of the structure of the dataset:
> >>
> >> > str(sii.tesis)
> >> 'data.frame':    1063 obs. of  75 variables:
> >>  $ id         : int  51 52 53 54 55 56 57 58 59 60 ...
> >>  $ stratum    : Factor w/ 6 levels "MEst","MAcad",..: 1 4 NA 4 4 1 6 NA 4 4 ...
> >>  $ expfc      : num  22.8 17.1 NA 17.1 17.1 ...
> >>  $ d6         : Factor w/ 3 levels "Estudiante","Profesor",..: 1 1 NA 1 1 1 3 NA 1 1 ...
> >>  $ d7         : Factor w/ 6 levels "Soltero","Casado",..: 1 1 NA 1 1 1 1 NA 1 1 ...
> >>  $ d7c        : Factor w/ 2 levels "No estable","Estable": 1 1 NA 1 1 1 1 NA 1 1 ...
> >>  $ s1cm       : Factor w/ 2 levels "No","Si": 1 2 NA 1 1 1 2 NA 1 1 ...
> >>  $ ovt        : num  NA 93.4 NA NA NA ...
> >>
> >> I declared the sampling design:
> >>
> >> > sii.design <- svydesign(
> >>   id = ~1,
> >>   strata = ~stratum,
> >>   weights = ~expfc,
> >>   data = subset(sii.tesis, !is.na(stratum)))
> >>
> >> Then I tried to get the result:
> >>
> >> > svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)
> >>
> >> but I get the error:
> >>
> >> Error in tapply(1:NROW(x), list(factor(strata)), function(index) { :
> >>   arguments must have same length
> >>
> >>
> >> Both variables have the same length. Wherever ovt has a value,
> >> there is a matching d7 value in the data frame.
> >>
> >> I tried the same thing using another variable instead, "role" (d6),
> >> and it works.
> >>
> >> > svyby(~ovt, ~d6, sii.design, svymean, na.rm = TRUE, level = 0.95)
> >>                            d6      ovt       se
> >> Estudiante         Estudiante 71.01805 1.370569
> >> Profesor             Profesor 72.30923 6.518378
> >> Administrativo Administrativo 75.69102 3.715050
> >>
> >> If I use the recategorized d7 variable (d7c,  two levels only) it works
> >> too:
> >>
> >> > svyby(~ovt, ~d7c, sii.design, svymean, na.rm = TRUE, level = 0.95)
> >>                   d7c      ovt      se
> >> No estable No estable 70.92344 1.37460
> >> Estable       Estable 74.53719 4.16954
> >>
> >>
> >> What could be the problem?
> >>
> >>
> >> Regards.
> >>
> >>
> >> Martin Canon
> >> Colombia, South America
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
