[R] Weighted descriptives by levels of another variable

David Winsemius dwinsemius at comcast.net
Sun Nov 15 01:43:36 CET 2009


Have you reviewed the survey package functions?
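
Something along these lines might be a starting point. It is only a sketch: it assumes the survey package is installed, it treats the data as a simple weighted sample with no clustering or strata (ids = ~1), and the weights are made up as in your example. The object names cw, des and res are just placeholders.

```r
# Sketch only: assumes the survey package is installed
library(survey)

data(ChickWeight)
cw <- as.data.frame(ChickWeight)

# Made-up weights for illustration, stored as a column of the data frame
set.seed(1)
cw$wgt <- rnorm(nrow(cw), 12, 1)

# Assumed design: no clustering or strata, weights taken from cw$wgt
des <- svydesign(ids = ~1, weights = ~wgt, data = cw)

# Weighted mean of Time (with a standard error) within each level of Diet
res <- svyby(~Time, ~Diet, design = des, FUN = svymean)
print(res)
```

Other estimators such as svytotal or svyvar can be passed as FUN in the same way. A real survey would need its actual design (clusters, strata) declared in svydesign() for the standard errors to mean anything.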

-- 
David
On Nov 14, 2009, at 5:31 PM, Andrew Miles wrote:

> I've noticed that R has a number of very useful functions for
> obtaining descriptive statistics on groups of variables, including
> summary {stats}, describe {Hmisc}, and describe {psych}, but none that
> I have found is able to provide weighted descriptives of subsets of a
> data set (e.g., descriptives for both males and females for age, where
> accurate results require use of sampling weights).
>
> Does anybody know of a function that does this?
>
> What I've looked at already:
>
> I have looked at describe.by {psych} which will give descriptives by
> levels of another variable (e.g., mean ages of males and females), but
> does not accept sample weights.
>
> I have also looked at describe {Hmisc} which allows for weights, but
> has no functionality for subdivision.
>
> I tried using a by() function with describe{Hmisc}:
>
> by(cbind(my, variables, here), division.variable, describe,
> weights=weight.variable)
>
> but found that this returns an error message stating that the
> variables to be described and the weights variable are not the same
> length:
>
> Error in describe.vector(xx, nam[i], exclude.missing =
> exclude.missing,  :
>   length of weights must equal length of x
> In addition: Warning message:
> In present & !is.na(weights) :
>   longer object length is not a multiple of shorter object length
>
> This happens because the by() function passes a subset of the
> variables to be described down to describe(), but not a subset of the
> weights variable.  describe() then looks for the weights variable in
> whatever data set is attached, where it is still in its original
> (i.e., unsubsetted) form.  Here is an example using the
> ChickWeight dataset that comes in the "datasets" package.
>
> data(ChickWeight)
> attach(ChickWeight)
> library(Hmisc)
> # This gives descriptive data on the variables "Time" and "Chick" by
> # levels of "Diet"
> by(cbind(Time, Chick), Diet, describe)
> # Trying to add weights, however, does not work for the reasons
> # described above
> wgt <- rnorm(length(Chick), 12, 1)
> by(cbind(Time, Chick), Diet, describe, weights=wgt)
>
> Again, my question is: does anybody know of a function that combines
> the ability to provide weighted descriptives with the ability to
> subdivide by the levels of some other variable?
>
>
> Andrew Miles
> Department of Sociology
> Duke University
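
On the length mismatch itself: by() forwards any extra arguments to FUN unchanged, so a full-length weights vector is handed to every subset. One possible workaround, if a few weighted statistics are enough, is to keep the weights as a column of the data frame so they are split along with everything else. The following is a base-R sketch only; wtd_summary is a hypothetical helper written for this example, not a function from Hmisc or psych, and its weighted SD is a simple normalized-weights estimate rather than a design-based one.

```r
data(ChickWeight)
cw <- as.data.frame(ChickWeight)

# Made-up weights for illustration, stored alongside the data
set.seed(1)
cw$wgt <- rnorm(nrow(cw), 12, 1)

# Hypothetical helper: weighted mean and weighted SD of one column,
# reading both the variable and the weights from the subset it is given
wtd_summary <- function(d, var, w) {
  x  <- d[[var]]
  wt <- d[[w]]
  m  <- weighted.mean(x, wt)
  v  <- sum(wt * (x - m)^2) / sum(wt)  # simple weighted variance
  c(n = length(x), mean = m, sd = sqrt(v))
}

# Each subset passed to the helper carries its own rows of wgt
res <- by(cw, cw$Diet, wtd_summary, var = "Time", w = "wgt")
print(res)
```

Because the column names are passed as strings, the same call can be repeated for other variables without attach().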

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



