[R] odd behaviour in quantreg::rq

Dylan Beaudette debeaudette at ucdavis.edu
Wed Jul 1 21:48:22 CEST 2009


Thanks Roger. Your comments were very helpful. Unfortunately, each of 
the 'groups' in this example are derived from the same set of data, two of 
which were subsets-- so it is not that unlikely that the weighted medians 
were the same in some cases.

This all leads back to an operation attempting to answer the question:

Of the 2 subsetting methods, which one produces a distribution most like the 
complete data set? Since the distributions are not normal, and there are 
area-weights involved others on the list suggested quantile-regression. For a 
more complete picture of how 'different' the distributions are, I have tried 
looking at the differences between weighted quantiles: (0.1, 0.25, 0.5, 0.75, 
0.9) as a more complete 'description' of each distribution.

I imagine that there may be a better way to perform this comparison...

Cheers,
Dylan


On Tuesday 30 June 2009, roger koenker wrote:
> Admittedly this seemed quite peculiar....  but if you look at the
> entrails
> of the following code you will see that with the weights the first and
> second levels of your x$method variable have the same (weighted) median
> so the contrast that you are estimating SHOULD be zero.  Perhaps
> there is something fishy about the data construction that would have
> allowed us to anticipate this?  Regarding the "fn" option, and the
> non-uniqueness warning,  this is covered in the (admittedly obscure)
> faq on quantile regression available at:
>
> 	http://www.econ.uiuc.edu/~roger/research/rq/FAQ
>
> # example:
> library(quantreg)
>
> # load data
> x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))
>
> # with weights
> summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
> se='ker')
>
> #Reproduction with more convenient notation:
>
> X0 <- model.matrix(~method, data = x)
> y <- x$sand
> w <- x$area_fraction
> f0 <- summary(rq(y ~ X0 - 1, weights = w),se = "ker")
>
> #Second reproduction with orthogonal design:
>
> X1 <- model.matrix(~method - 1, data = x)
> f1 <- summary(rq(y ~ X1 - 1, weights = w),se = "ker")
>
> #Comparing f0 and f1 we see that they are consistent!!  How can that
> be??
> #Since the columns of X1 are orthogonal estimation of the 3 parameters
> are separable
> #so we can check to see whether the univariate weighted medians are
> reproducible.
>
> s1 <- X1[,1] == 1
> s2 <- X1[,2] == 1
> g1 <- rq(y[s1] ~ X1[s1,1] - 1, weights = w[s1])
> g2 <- rq(y[s2] ~ X1[s2,2] - 1, weights = w[s2])
>
> #Now looking at the g1 and g2 objects we see that they are equal and
> agree with f1.
>
>
> url:    www.econ.uiuc.edu/~roger            Roger Koenker
> email    rkoenker at uiuc.edu            Department of Economics
> vox:     217-333-4558                University of Illinois
> fax:       217-244-6678                Urbana, IL 61801
>
> On Jun 30, 2009, at 3:54 PM, Dylan Beaudette wrote:
> > Hi,
> >
> > I am trying to use quantile regression to perform weighted-
> > comparisons of the
> > median across groups. This works most of the time, however I am
> > seeing some
> > odd output in summary(rq()):
> >
> > Call: rq(formula = sand ~ method, tau = 0.5, data = x, weights =
> > area_fraction)
> > Coefficients:
> >                   Value    Std. Error t value  Pr(>|t|)
> > (Intercept)        45.44262  3.64706   12.46007  0.00000
> > methodmukey-HRU     0.00000  4.67115    0.00000  1.00000
> > 				  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > When I do not include the weights, I get something a little closer
> > to a
> > weighted comparison of means, along with an error message:
> >
> > Call: rq(formula = sand ~ method, tau = 0.5, data = x)
> > Coefficients:
> >                   Value    Std. Error t value  Pr(>|t|)
> > (Intercept)        44.91579  2.46341   18.23318  0.00000
> > methodmukey-HRU     9.57601  9.29348    1.03040  0.30380
> > Warning message:
> > In rq.fit.br(x, y, tau = tau, ...) : Solution may be nonunique
> >
> >
> > I have noticed that the error message goes away when specifying
> > method='fn' to
> > rq(). An example is below. Could this have something to do with
> > replication
> > in the data?
> >
> >
> > # example:
> > library(quantreg)
> >
> > # load data
> > x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))
> >
> > # with weights
> > summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
> > se='ker')
> >
> > # without weights
> > # note error message
> > summary(rq(sand ~ method, data=x, tau=0.5), se='ker')
> >
> > # without weights, no error message
> > summary(rq(sand ~ method, data=x, tau=0.5, method='fn'), se='ker')
> >




More information about the R-help mailing list