[R] large survey data set

Thu Jun 27 20:35:51 CEST 2002

The lm function (for linear modelling, aka linear regression) accepts
case weights through its weights argument, with a simple syntax:

foo <- lm(dependent ~ indep + indep + ... ,
	data = <data object>,
	weights = <weight variable>)
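
As a minimal sketch of the call above on made-up data: the names dat, y,
x and w are hypothetical and only stand in for a real data object and
weight variable. Note that lm treats the weights as precision
(inverse-variance) weights in a weighted least-squares fit, so the
standard errors it reports are not design-based survey standard errors.

```r
# Simulated data; dat, y, x and w are hypothetical names for illustration.
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- 2 + 3 * dat$x + rnorm(100)
dat$w <- runif(100, min = 0.5, max = 2)   # stand-in for a weight variable

# Weighted least squares: minimizes sum(w * residuals^2).
fit <- lm(y ~ x, data = dat, weights = w)

# Coefficient table (estimates, standard errors, t values, p values).
summary(fit)$coefficients
```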

The lme function in the nlme package uses a slightly different syntax
that I don't fully understand, but the following has worked for me:

foo <- lme(random = ~ 1 | <nesting var>,
	fixed = dependent ~ indep + indep + ... ,
	weights = ~ 1/<weight variable>,
	data = <data object>)
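
A small sketch of the same kind of call on simulated data, assuming a
random intercept per group. The names dat, g, x, y and w are hypothetical.
The random argument is written as a one-sided formula (~ 1 | g), and a
one-sided formula passed to weights is used by lme as a fixed variance
function, so ~ 1/w says the residual variance is proportional to 1/w.

```r
library(nlme)

# Simulated grouped data; all names here are hypothetical.
set.seed(2)
n_groups <- 20
n_per <- 10
dat <- data.frame(g = factor(rep(seq_len(n_groups), each = n_per)),
                  x = rnorm(n_groups * n_per))
dat$w <- runif(nrow(dat), min = 0.5, max = 2)   # stand-in weight variable
u <- rnorm(n_groups)                            # one random intercept per group
dat$y <- 1 + 2 * dat$x + u[dat$g] +
  rnorm(nrow(dat), sd = 1 / sqrt(dat$w))        # residual variance = 1/w

# Random-intercept model with residual variance proportional to 1/w.
fit <- lme(fixed = y ~ x,
           random = ~ 1 | g,
           weights = ~ 1 / w,
           data = dat)

fixef(fit)   # estimated fixed effects (intercept and slope)
```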

Hope this helps.

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists at perrin.socsci.unc.edu * andrew_perrin (at) unc.edu

On Thu, 27 Jun 2002 rpietro at duke.edu wrote:

> Thanks for the reply. By "handle" I mean adjusting the standard errors so
> that I can draw inferences about the target population. In my case, my
> data set represents a 20% sample of the US population, and after adjusting
> for the weights my confidence intervals apply to the US population rather
> than to the local data set.
> 
> On Thu, 27 Jun 2002, Andrew Perrin wrote:
> 
> > Data management isn't all that easy in R, but it's certainly
> > possible.  There are some good tips here:
> > http://cran.r-project.org/doc/contrib/usingR.pdf
> >
> > You can also put the data in a database and access them with one of the
> > database interfaces available (RODBC, RMySQL, RPgSQL).
> >
> > I'm not sure what you mean by "handle" weights, clusters, and
> > strata. I'm guessing you want to use the nlme package, but it really
> > depends on what questions you want to ask.
> >
> > Best,
> > Andy Perrin
> >
> > ----------------------------------------------------------------------
> > Andrew J Perrin - http://www.unc.edu/~aperrin
> > Assistant Professor of Sociology, U of North Carolina, Chapel Hill
> > clists at perrin.socsci.unc.edu * andrew_perrin (at) unc.edu
> >
> >
> > On Thu, 27 Jun 2002 rpietro at duke.edu wrote:
> >
> > >
> > >
> > > ---------- Forwarded message ----------
> > > Hello,
> > >
> > >
> > > I am analyzing a weighted, stratified, clustered survey data set
> > > with approximately 1 million observations and 50 variables.
> > >
> > > I am new to R (I'm a Stata user), and so far I
> > > haven't been able to find any documentation on how to handle survey
> > > data. In other words, is there a specific package to handle a
> > > combination of weights, clusters and strata? I am also struggling to
> > > handle such a large data set. Any suggestions and/or references regarding
> > > these two issues would be greatly appreciated.
> > >
> > >
> > > Rick
> > >
> > >
> > >
> > >
> > > Ricardo Pietrobon, MD
> > > Assistant Professor of Surgery
> > > Duke University Medical Center
> > >
> > >
> > >
> > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > > Send "info", "help", or "[un]subscribe"
> > > (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> > > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
