[R] Singular design matrix in rq

William Dunlap wdunlap at tibco.com
Fri Apr 19 18:51:40 CEST 2013


I believe that those repeated values (more than half of your x values are 0.0)
are causing problems for bs(): its default knots are placed at quantiles of the
data at equally spaced probabilities, so many of the knots coincide at 0 and the
spline basis becomes rank deficient.  The following may be the same problem:

> library(quantreg) # for rq()
> library(splines)  # for bs()
> set.seed(1)
> x <- c(rep(0, 20), 1:15)
> y <- sort(rnorm(length(x)))
> rq(y~bs(x, df=15), tau=.5)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> # lm deals with a singular design matrix by dropping columns from the model
> lm(y~bs(x, df=15))

Call:
lm(formula = y ~ bs(x, df = 15))

Coefficients:
     (Intercept)   bs(x, df = 15)1   bs(x, df = 15)2   bs(x, df = 15)3  
         1.59024                NA                NA                NA  
 bs(x, df = 15)4   bs(x, df = 15)5   bs(x, df = 15)6   bs(x, df = 15)7  
              NA                NA                NA          -2.09983  
 bs(x, df = 15)8   bs(x, df = 15)9  bs(x, df = 15)10  bs(x, df = 15)11  
        -1.06874          -1.20798          -0.99340          -0.87365  
bs(x, df = 15)12  bs(x, df = 15)13  bs(x, df = 15)14  bs(x, df = 15)15  
        -0.71927          -0.50564           0.06184                NA  

> svd(cbind(1, bs(x, df=15)))$d # design matrix is not full rank
 [1] 7.029298e+00 2.773759e+00 1.286165e+00 1.160239e+00 9.992134e-01 8.102012e-01
 [7] 6.334326e-01 4.098332e-01 3.185013e-01 4.476983e-16 1.643202e-16 8.614772e-17
[13] 7.597613e-17 5.575475e-17 1.760443e-17 1.727013e-18    
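
One way to see the repeated knots directly is to compute the quantiles that bs() draws its default knots from when only df is given. This is a minimal sketch for the toy x above. It assumes the default cubic basis without an intercept column, so df=15 implies 12 interior knots, and it computes the quantiles by hand because the way bs() itself treats coincident knots may differ between R versions. The names n_knots, probs and qs are only illustrative.

```r
x <- c(rep(0, 20), 1:15)

# bs(x, df = 15) with degree 3 and no intercept implies 15 - 3 = 12 interior knots
n_knots <- 15 - 3

# Equally spaced probabilities strictly between 0 and 1
probs <- seq(0, 1, length.out = n_knots + 2)[-c(1, n_knots + 2)]

# Default-type quantiles of the data at those probabilities
qs <- quantile(x, probs, names = FALSE)

print(round(qs, 3))
print(sum(qs == min(x)))      # candidate knots sitting on the lower boundary
print(anyDuplicated(qs) > 0)  # TRUE when any candidate knot is repeated
```

For this x, more than half of the candidate knots land on 0, which is also the lower boundary of the data.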

Try using equally spaced knots or removing repeated quantiles when you call bs().
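
A rough sketch of the second suggestion, on the same toy data: compute the candidate quantiles, drop repeats and anything on the boundary, and pass the result to bs() through its knots argument instead of df. The names probs, k and X are illustrative, and the rq() refit is guarded because it assumes the quantreg package is installed.

```r
library(splines)

set.seed(1)
x <- c(rep(0, 20), 1:15)
y <- sort(rnorm(length(x)))

# Candidate knots at the same equally spaced probabilities as for df = 15
probs <- seq(0, 1, length.out = 14)[-c(1, 14)]
k <- unique(quantile(x, probs, names = FALSE))

# Keep only knots strictly inside the data range
k <- k[k > min(x) & k < max(x)]

# Design matrix with an intercept column, as lm() or rq() would build it
X <- cbind(1, bs(x, knots = k))
print(c(rank = qr(X)$rank, columns = ncol(X)))

# Optional: refit the median regression if quantreg is available
if (requireNamespace("quantreg", quietly = TRUE)) {
  fit <- quantreg::rq(y ~ bs(x, knots = k), tau = 0.5)
  print(coef(fit))
}
```

Note that this leaves fewer basis columns than the df=15 originally asked for, since the repeated knots are simply discarded. Equally spaced knots can be built in the same way, for example with seq(min(x), max(x), length.out = m + 2)[-c(1, m + 2)] for m interior knots, and passed through the same knots argument.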

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Jonathan Greenberg
> Sent: Friday, April 19, 2013 6:29 AM
> To: Koenker, Roger W
> Cc: r-help
> Subject: Re: [R] Singular design matrix in rq
> 
> Roger:
> 
> Doh!  Just realized I had that error in the code -- raw_data is the same as
> mydata, so it should be:
> 
> mydata <- read.csv("singular.csv")
> plot(mydata$predictor,mydata$response)
> # A big cloud of points, nothing too weird
> summary(mydata)
> # No NAs:
> 
> #       X            response         predictor
> # Min.   :    1   Min.   :    0.0   Min.   : 0.000
> # 1st Qu.:12726   1st Qu.:  851.2   1st Qu.: 0.000
> # Median :25452   Median : 2737.0   Median : 0.000
> # Mean   :25452   Mean   : 3478.0   Mean   : 5.532
> # 3rd Qu.:38178   3rd Qu.: 5111.6   3rd Qu.: 5.652
> # Max.   :50903   Max.   :26677.8   Max.   :69.342
> 
> fit_spl <- rq(response ~ bs(predictor,df=15),tau=1,data=mydata)
> # Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> 
> --j
> 
> 
> 
> On Fri, Apr 19, 2013 at 8:15 AM, Koenker, Roger W <rkoenker at illinois.edu> wrote:
> 
> > Jonathan,
> >
> > This is not what we call a reproducible example... what is raw_data?  Does
> > it have something to do with mydata?
> > what is i?
> >
> > Roger
> >
> > url:    www.econ.uiuc.edu/~roger            Roger Koenker
> > email    rkoenker at uiuc.edu            Department of Economics
> > vox:     217-333-4558                University of Illinois
> > fax:       217-244-6678                Urbana, IL 61801
> >
> > On Apr 16, 2013, at 2:58 PM, Greenberg, Jonathan wrote:
> >
> > > Quantreggers:
> > >
> > > I'm trying to run rq() on a dataset I posted at:
> > > https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
> > > (it's a 1500kb csv file named "singular.csv") and am getting the
> > > following error:
> > >
> > > mydata <- read.csv("singular.csv")
> > > fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> > > > Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> > >
> > > Any ideas what might be causing this or, more importantly, suggestions
> > > for how to solve this?  I'm just trying to fit a smoothed hull to the top
> > > of the data cloud (hence the large df).
> > >
> > > Thanks!
> > >
> > > --jonathan
> > >
> > >
> > > --
> > > Jonathan A. Greenberg, PhD
> > > Assistant Professor
> > > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> > > Department of Geography and Geographic Information Science
> > > University of Illinois at Urbana-Champaign
> > > 607 South Mathews Avenue, MC 150
> > > Urbana, IL 61801
> > > Phone: 217-300-1924
> > > http://www.geog.illinois.edu/~jgrn/
> > > AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
> >
> >
> 
> 
> 


