[R] working with summarized data
ggrothendieck at gmail.com
Wed Aug 30 16:50:24 CEST 2006
In each case, look around (help.search,
RSiteSearch) to see if you can find a function
that handles weights. For the case you mention,
medians, it can be done via quantile regression:
library(quantreg)  # provides rq()
x <- w <- 1:5
coef(rq(x ~ 1, weights = w))  # intercept-only fit at tau = 0.5 is the weighted median
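
If you would rather not depend on quantreg, here is a rough base-R sketch of the same idea using cumulative weights. Note that weighted_median is a hypothetical helper name I made up for illustration, not an existing R function, and it assumes non-negative weights with no missing values. When the cumulative weight hits exactly one half it returns the lower of the two candidate values.

```r
# Hypothetical helper (not part of base R): weighted median via cumulative weights.
# Assumes w >= 0, sum(w) > 0, and no NAs in x or w.
weighted_median <- function(x, w) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  ord <- order(x)                  # sort values, carrying weights along
  x <- x[ord]
  w <- w[ord]
  cw <- cumsum(w) / sum(w)         # cumulative share of total weight
  x[which(cw >= 0.5)[1]]           # first value that reaches half the weight
}

x <- w <- 1:5
m <- weighted_median(x, w)         # 4 for this example

# With discrete data the share below m need not be exactly 0.5;
# what holds is: share strictly below <= 0.5 <= share at or below.
sum(w[x < m]) / sum(w)             # 6/15 = 0.4
sum(w[x <= m]) / sum(w)            # 10/15, about 0.667

# Cross-check against explicit row replication (only feasible for
# small integer weights).
median(rep(x, times = w))          # also 4
```

For integer frequency weights this should agree with replicating the rows, without ever building the expanded data set.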
On 8/30/06, Rick Bischoff <rdbisch at gmail.com> wrote:
> The data sets I am working with all have a weight variable--i.e.,
> each row doesn't mean 1 observation.
> With that in mind, nearly all of the graphs and summary statistics
> are incorrect for my data, because they don't take into account the
> weights.
> For example "median" is incorrect, as the quantiles aren't calculated
> with weights:
> sum( weights[X < median(X)] ) / sum(weights)
> This should be 0.5... of course it's not.
> Unfortunately, it seems that most(all?) of R's graphics and summary
> statistic functions don't take a weight or frequency argument.
> (Fortunately the models do...)
> Am I completely missing how to do this? One way would be to
> replicate each row proportional to the weight (e.g. if the weight was
> 4, we would add 3 additional copies) but this will get prohibitive pretty
> quickly as the dataset grows.
> Thanks in advance!