[R] working with summarized data
rdbisch at gmail.com
Wed Aug 30 16:27:58 CEST 2006
The data sets I am working with all have a weight variable--i.e.,
a row does not necessarily represent exactly 1 observation.
With that in mind, nearly all of the graphs and summary statistics
are incorrect for my data, because they don't take into account the
weights.
For example, "median" is incorrect, as the quantiles aren't calculated
using the weights. Consider:
sum( weights[X < median(X)] ) / sum(weights)
This should be 0.5... of course it's not.
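
A minimal sketch of the check above, using made-up data. The vectors x and
w are purely illustrative and are assumed to hold the values and their
frequency weights:

```r
# Illustrative data only: x holds the values, w the frequency weights
x <- c(1, 2, 3, 4, 5)
w <- c(1, 1, 1, 1, 6)

# median() ignores w entirely and treats every row as one observation
m <- median(x)

# Share of the total weight lying strictly below the unweighted median
frac_below <- sum(w[x < m]) / sum(w)

print(m)
print(frac_below)
```

Here most of the weight sits on the largest value, so the share of weight
below the unweighted median comes out far from 0.5.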
Unfortunately, it seems that most (all?) of R's graphics and summary
statistic functions don't take a weight or frequency argument.
(Fortunately the models do...)
Am I completely missing how to do this? One way would be to
replicate each row in proportion to its weight (e.g. if the weight was
4, we would add 3 additional copies), but this will become prohibitive
pretty quickly as the dataset grows.
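
The replication workaround can be sketched as follows, again with made-up
data and assuming the weights are non-negative whole numbers (rep() would
not handle fractional weights in a meaningful way):

```r
# Illustrative data only: x holds the values, w the integer frequency weights
x <- c(1, 2, 3, 4, 5)
w <- c(1, 1, 1, 1, 6)

# Repeat each value w[i] times, so every element is one observation again
x_expanded <- rep(x, times = w)

# Ordinary summary functions now see the weights implicitly
m_expanded <- median(x_expanded)

print(length(x_expanded))   # equals sum(w), which is the memory problem
print(m_expanded)
```

The expanded vector has sum(w) elements, so the memory cost scales with the
total weight rather than with the number of rows.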
Thanks in advance!