[R] Query on constrained regressions using -mgcv- and -pcls-

Clive Nicholas c||ve||@t@ @end|ng |rom goog|em@||@com
Tue Nov 3 02:14:46 CET 2020


Hello all,

I'll level with you: I'm puzzled!

How is it that this constrained regression routine using -pcls- runs
satisfactorily (courtesy of Tian Zheng):

library(mgcv)
options(digits=3)
x.1=rnorm(100, 0, 1)
x.2=rnorm(100, 0, 1)
x.3=rnorm(100, 0, 1)
x.4=rnorm(100, 0, 1)
y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
ls.print(lsfit(x.mat, y, intercept=FALSE))
M=list(y=y,
w=rep(1, length(y)),
X=x.mat,
C=matrix(0,0,0),
p=rep(1, ncol(x.mat)),
off=array(0,0),
S=list(),
sp=array(0,0),
Ain=diag(ncol(x.mat)),
bin=rep(0, ncol(x.mat)) )
pcls(M)
Residual Standard Error=0.0095
R-Square=1
F-statistic (df=5, 95)=314735
p-value=0

    Estimate Std.Err t-value Pr(>|t|)
       1.000  0.0010  1043.9        0
x.1    0.501  0.0010   512.6        0
x.2   -0.202  0.0009  -231.6        0
x.3    0.298  0.0010   297.8        0
x.4    0.103  0.0011    94.8        0

but this one does not for a panel dataset:

set.seed(02102020)
N=500
M=10
rater=rep(1:M, each = N)
lead_n=as.factor(rep(1:N,M))
a=rep(rnorm(N),M)
z=rep(round(25+2*rnorm(N)+.2*a))
x=a+rnorm(N*M)
y=.5*x+5*a-.5*z+2*rnorm(N*M)
x_cl=rep(aggregate(x,list(lead_n) mean)[,2],M)
model=lm(y~x+x_cl+z)
summary(model)
y=1+1.5*x+4.6*x_cl-0.5*z
x.mat=cbind(rep(1,length(y)),x,x_cl,z)
ls.print(lsfit(x.mat,y,intercept=FALSE))

Residual Standard Error=0
R-Square=1
F-statistic (df=4, 4996)=5.06e+30
p-value=0

     Estimate Std.Err   t-value Pr(>|t|)
          1.0       0  2.89e+13        0
x         0.8       0  2.71e+14        0
x_cl      4.6       0  1.18e+15        0
z        -0.5       0 -3.63e+14        0

?

There shouldn't be anything wrong with the second set of data, unless I've
missed something obvious (that constraints don't work for panel data? Seems
unlikely to me)!

Also:

(1) I'm ultimately looking just to constrain ONE coefficient whilst
allowing the other coefficients to be unconstrained (I tried this with the
first dataset by setting

y=1+0.5*x.1-x.2+x.3+x.4

in the call, but got similar-looking output to what I got in the second
dataset); and

(2) it would be really useful to have the call to -pcls(M)- produce more
informative output (SEs, t-values, fit stats, etc).

Many thanks in anticipation of your expert help and being told what a
clueless berk I am,
Clive

-- 
Clive Nicholas

"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson

	[[alternative HTML version deleted]]



More information about the R-help mailing list