[R] "Survey" package and NAMCS data... unsure of specification

Thomas Lumley tlumley at u.washington.edu
Wed Oct 5 01:21:14 CEST 2005


On Tue, 4 Oct 2005, David L. Van Brunt, Ph.D. wrote:

> Hello, all.
>
> I wanted to use the "survey" package to analyze data from the National
> Ambulatory Medical Care Survey, and am having some difficulty translating
> the analysis keywords from one package (Stata) to the other (R). The data
> were collected using a multistage probability sampling, and there are
> variables included to identify the sampling units and weights. Documentation
> from the NAMCS describes this for Stata as follows (note the variable names
> in the data are in caps):
>
> The pweight (PATWT), strata (CSTRATM), and PSU (CPSUM) are set with the
> svyset command as follows:
> svyset pweight PATWT
> svyset strata CSTRATM
> svyset psu CPSUM
>

Supposing your data frame is called 'namcs':

dnamcs <- svydesign(id=~CPSUM, strata=~CSTRATM, weights=~PATWT, data=namcs)

or perhaps

dnamcs <- svydesign(id=~CPSUM, strata=~CSTRATM, weights=~PATWT,
                    data=namcs, nest=TRUE)

(nest=TRUE is needed if CPSUM repeats the same values in different
strata, i.e. if PSUs are numbered separately within each stratum).
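To see what nest=TRUE changes, here is a minimal sketch on a made-up data frame. The values below are invented for illustration and are not NAMCS data; AGE stands in for whichever analysis variable you care about. The PSU labels 1 and 2 are deliberately reused in both strata.

```r
library(survey)

# Toy stand-in for the NAMCS file (invented values): PSU labels 1 and 2
# are reused within each stratum.
namcs <- data.frame(
  CSTRATM = c(1, 1, 1, 1, 2, 2, 2, 2),
  CPSUM   = c(1, 1, 2, 2, 1, 1, 2, 2),
  PATWT   = c(10, 10, 20, 20, 15, 15, 5, 5),
  AGE     = c(30, 40, 50, 60, 35, 45, 55, 65)
)

# Without nest=TRUE, a PSU label that appears in two strata is treated as
# one cluster spanning both, and svydesign() refuses to build the design.
res <- try(
  svydesign(id = ~CPSUM, strata = ~CSTRATM, weights = ~PATWT, data = namcs),
  silent = TRUE
)
inherits(res, "try-error")

# With nest=TRUE, PSU labels are interpreted within each stratum.
dnamcs <- svydesign(id = ~CPSUM, strata = ~CSTRATM, weights = ~PATWT,
                    data = namcs, nest = TRUE)

# Weighted mean with a design-based standard error.
svymean(~AGE, dnamcs)
```

If the first call succeeds on the real file, the PSU codes are already unique across strata and nest=TRUE is harmless but unnecessary.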

Also, if you have access to design variables for the multistage design you
can use them (but it probably won't make much difference). There's a very
brief example using the National Health Interview Survey at
  http://faculty.washington.edu/tlumley/survey/example-twostage.html
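
As a rough sketch of what using multistage design variables looks like, the following uses the apiclus2 data set that ships with the survey package (a two-stage sample of school districts, then schools). It is not NAMCS data, and the NAMCS file may not carry the equivalent second-stage identifiers or population sizes, so treat the variable names as specific to this example.

```r
library(survey)
data(api)  # loads apiclus2, a two-stage cluster sample bundled with the package

# Two-stage design: districts (dnum), then schools within district (snum),
# with the population size at each stage given as a finite population correction.
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# Single-stage approximation: first-stage clusters and sampling weights only.
dclus2wr <- svydesign(id = ~dnum, weights = ~pw, data = apiclus2)

# Compare the estimates and standard errors from the two specifications.
svymean(~api00, dclus2)
svymean(~api00, dclus2wr)
```

Comparing the two printed results gives a feel for how much the extra design information matters for a given estimate.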


 	-thomas



