[R] Percentiles for unequal probability sample

Roger Koenker rkoenker at illinois.edu
Thu Nov 21 03:21:26 CET 2013


You could try:

require(quantreg)
qs <- rq(x ~ 1, weights = w, tau = 1:3/4)
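
Here x and w stand for the observations and their sampling weights. As a minimal sketch of how this might look on the quoted tree data, assuming the expansion factor TreesPerAcre is the appropriate weight and that quantreg is installed (the seed is arbitrary and only there for reproducibility):

```r
# Sketch: weighted quartiles from an intercept-only quantile regression.
library(quantreg)

set.seed(1)  # arbitrary seed, not part of the original example
trees.df <- data.frame(Diameter = rnorm(10, mean = 10, sd = 2))
# expansion factor (trees per acre) for a basal area factor of 20
trees.df$TreesPerAcre <- 20 / (trees.df$Diameter^2 * pi / 576)

# With only an intercept, each fitted coefficient is a weighted quantile.
qs <- rq(Diameter ~ 1, weights = TreesPerAcre, tau = 1:3 / 4,
         data = trees.df)
coef(qs)  # one column per tau: 0.25, 0.50, 0.75
```

No expansion of the data is needed, so the cost does not grow with the size of the expansion factors.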


Roger Koenker
rkoenker at illinois.edu




On Nov 20, 2013, at 4:56 PM, David Winsemius wrote:

> 
> On Nov 20, 2013, at 11:35 AM, Trevor Walker wrote:
> 
>> I often work with tree data that is sampled with probability proportional
>> to size, which presents a special challenge when describing the frequency
>> distribution.  For example, R functions like quantile() and fitdistr()
>> expect each observation to have equal sample probability.  As a workaround,
>> I have been "exploding"/"mushrooming" my data based on the appropriate
>> expansion factors.  However, this can take a LONG TIME and I am reaching
>> out for more efficient suggestions, particularly for the quantile()
>> function.  Example of my workaround:
>> 
> 
> The 'Hmisc' package has a `wtd.quantile` function. I seem to remember that it might have been borrowed from the quantreg package.
> 
> 
>> # trees.df represents random sample with probability proportional to size
>> # (of diameter) using "basal area factor" of 20
>> trees.df <- data.frame(Diameter=rnorm(10, mean=10, sd=2),
>>                        TreesPerAcre=numeric(10))
>> # expansion factor for each observation
>> trees.df$TreesPerAcre <- 20/(trees.df$Diameter^2*pi/576)
>> 
>> # to obtain percentiles that are weighted by trees per acre, "explode"
>> # diameter data
>> explodeFactor <- 10 # represents ten acres
>> treeCount <- sum(round(trees.df$TreesPerAcre*explodeFactor ))
>> explodedDiameters.df <- data.frame(Diameter=numeric(treeCount))
>> k <- 0 # initialize counter k
>> for (i in seq_along(trees.df$Diameter)){
>>   # seq_len() skips the inner loop when the rounded count is zero
>>   for (j in seq_len(round(trees.df$TreesPerAcre[i]*explodeFactor))){
>>     k <- k + 1
>>     explodedDiameters.df$Diameter[k] <- trees.df$Diameter[i]
>>   }
>> }
>> 
>> # appropriate percentiles (for trees per acre)
>> quantile(explodedDiameters.df$Diameter)
>> # percentiles biased upwards
>> quantile(trees.df$Diameter)
>> 
>> 
>> 
>> Trevor Walker
>> 
> -- 
> 
> David Winsemius
> Alameda, CA, USA
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

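For comparison with the "exploding" workaround quoted above, here is a rough base-R sketch. It assumes the same weights (TreesPerAcre) and uses a simple cumulative-weight definition of a quantile, which is not the same as the default interpolation in quantile(); the helper name weighted_quantile is hypothetical and not from any package.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
diameter <- rnorm(10, mean = 10, sd = 2)
tpa <- 20 / (diameter^2 * pi / 576)   # trees per acre, as in the quoted code

# Vectorized "explosion": rep() replaces the nested for loops.
exploded <- rep(diameter, times = round(tpa * 10))
quantile(exploded)

# Weighted quantile without exploding: the smallest value whose
# cumulative share of the total weight reaches p.
weighted_quantile <- function(x, w, probs) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cum <- cumsum(w) / sum(w)
  cum[length(cum)] <- 1   # guard against floating-point shortfall at p = 1
  vapply(probs, function(p) x[which(cum >= p)[1]], numeric(1))
}

weighted_quantile(diameter, tpa, c(0.25, 0.5, 0.75))
```

The rep() version still builds the full expanded vector, so it mainly saves loop overhead; the cumulative-weight version works directly on the ten observations.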

