[R] R: Count function calls

Mon Mar 11 10:15:10 CET 2013

Dear Simon,

unfortunately I only have a few minutes for quick answers now, while your interesting email requires some experimenting on my part I cannot do now, although I will try to as soon as I have some spare time.

The much simpler (although asymptotically equivalent) pmg estimator is more robust in this respect than the more sophisticated pvcm a la Swamy. The latter is more efficient in small samples but less tolerant to such problems.

Anyway, as you said the tolerances may vary among different software, along the fine line between numerical accuracy and procedure robustness.

You are welcome to experiment on this yourself! I'll be glad if you keep me posted.
Cheers,
Giovanni

-----Messaggio originale-----
Da: Simon Zehnder [mailto:szehnder at uni-bonn.de] 
Inviato: giovedì 7 marzo 2013 13.12
A: Millo Giovanni
Cc: r-help at r-project.org help
Oggetto: Re: [R] Count function calls

Dear Giovanni,

apologize for this late reply! I was testing and reading a lot of stuff. I tried your suggestions and the problem of singularity in the regressor cross product vanishes when using the Group Mean function 'pgm' instead of 'pvcm'. Nevertheless, I found the collinearity in the regressors and 'pvcm' ran further, until I got another error. This time a little different but still directed towards the same area:

'Error in solve.default(crossprod(X[[i]])[!coefna[i, ], !coefna[i, ]])) :
	system is computationally singular: reciprocal condition number = 1.86131e-17'

Following the description of R's function 'rcond', the reciprocal condition number measures how close a matrix is to be rank deficient. So in this case one regressor crossproduct seems to be sufficiently close to singularity, such that inversion becomes impossible. 

I attached you a simple subsample. If you let run the following commands on it:

require(plm)
data <- read.csv(path, sep = ",", header = TRUE) pdata <- plm.data(data, index = c("gvkey", "fyear"))
debug(plm:::pvcm)
model.pvcm <- pvcm(LOGDXSGA~LOGDSALE + DECRLOGDSALE + DECRLOGDSALEWDAVG, data = pdata, model = "random") ...... go until Browse[2]>
debug: ml <- split(data cond)

Then type:
A <- as.matrix(ml$'25062'[, 2:4])
XX <- t(A) %*% A
qr(XX)$rank
[1] 2                            Aha, the rank is only 2! 
rcond(XX) 
[1] 9.339817e-16    Alright, the matrix XX is pretty near to singularity, even it is not really singular!

Let us have a look at it:
A[, 2]/A[, 3]
.....                              A[, 2] is almost a multiple of A[, 3], if the latter is multiplied by -6.231979 or -6.231331 ..... at least it suffices to let the rank shrink to 2. 

The thing is, the same regression goes through in the Stata package. As I would say, I am an R fanatist, I would like to know, why it runs in Stata and not in R.....and that one can let it run in R as well. Different number formats/precision? Maybe it suffices in such a case to enforce full rank of the crossproduct by enforcing positive definitess, for instance via the function 'nearPD' in the R package 'Matrix':

Try
library(Matrix)
qr(nearPD(XX)$mat)$rank
[1] 3                                                 Yey!..... :)

Tell me your opinion Giovanni! Give me a little help here please! 

Best 

Simon

P.S. Thank your for this fantastic package making panel data estimation possible in the R environment! 

---------------- end original message --------------

Ai sensi del D.Lgs. 196/2003 si precisa che le informazi...{{dropped:12}}