[R] polr probit versus stata oprobit

Jean Eid jeaneid at chass.utoronto.ca
Thu Nov 11 01:46:21 CET 2004


Dear All,
I have been struggling to understand why for the housing data in MASS
library R and stata give coef. estimates that are really different. I also
tried to come up with many many examples myself (see below, of course I
did not have the set.seed command included) and all of my
`random' examples seem to give verry similar output. For the housing data,
I have changed the data into numeric vectors instead of factors/ordered
factors. I did so to try and get the same results as in STATA and to have
the housing example as close as possible to the one I constructed.

I run a debian sid, kernel 2.4, R 2.0.0, and STATA version 8.2, MASS
version  7.2-8.


here's the example ( I assume that you have STATA installed and can run in
batch mode, if not the output is also given below)

library(MASS)
library(foreign)
set.seed(100)
X <- rnorm(1000)
X1 <- rnorm(1000)
X2 <- rnorm(1000)
X <- X +X1+X2
XX <- X<=quantile(X, .25)
XX[X>quantile(X, .25) & X<=quantile(X, .50)] <- 2
XX[X>quantile(X, .5) & X<=quantile(X, .75)] <- 3
XX[X>quantile(X, .75)] <- 4
temp <- data.frame(XX=XX, X1=X1, X2=X2, X=X)
summary(polr(factor(XX)~X1 +X2, data=temp, method="probit"))
write.dta(temp, "temp.dta")
####################################
#Stata stuff
####################################
cat("use temp.dta\n oprobit XX X1 X2\n", file="temp.ado")
system("stata -b do temp.ado&")
system("cat temp.log")


#
##### here's R's output
#############################
Re-fitting to get Hessian

Call:
polr(formula = factor(XX) ~ X1 + X2, data = temp, method = "probit")

Coefficients:
       Value Std. Error  t value
X1 0.9891735 0.04749225 20.82811
X2 0.9400804 0.04527653 20.76309

Intercepts:
    Value    Std. Error t value
1|2  -1.1411   0.0572   -19.9613
2|3  -0.0372   0.0486    -0.7656
3|4   1.1101   0.0579    19.1865

Residual Deviance: 1969.734
AIC: 1979.734


##############################################
#and here  Stata's output
##############################################

Ordered probit estimates                          Number of obs   =
1000
                                                  LR chi2(2)      =
802.86
                                                  Prob > chi2     =
0.0000
Log likelihood = -984.86675                       Pseudo R2       =
0.2896

------------------------------------------------------------------------------
          XX |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          X1 |   .9891651   .0474922    20.83   0.000     .8960822    1.082248
          X2 |     .94007   .0452765    20.76   0.000     .8513298     1.02881
-------------+----------------------------------------------------------------
       _cut1 |  -1.141119   .0571667          (Ancillary parameters)
       _cut2 |  -.0371779   .0485592
       _cut3 |   1.110117   .0578593




More information about the R-help mailing list