[R] Tetrachoric correlation in R vs. Stata

Janet Rosenbaum jrosenba at rand.org
Fri Jun 23 23:33:31 CEST 2006


Peter, thanks for pointing out the omitted information.  Such are the 
hazards of attempting to be brief.

In R, I am using polychor(vec1, vec2, std.err=T) from the polycor 
package.  I have tried both the ML and two-step estimates, which give 
virtually identical answers.  I am explicitly using only the 632 complete 
cases in R to make sure missing data are handled the same way as in Stata.
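
To make the missing-data handling concrete, here is a minimal sketch of 
restricting two vectors to complete pairs before cross-tabulating.  The 
values in vec1 and vec2 below are made up for illustration (the real 
variables are not reproduced in this message), and the names ok and tab 
are hypothetical.

```r
# Hypothetical binary vectors with some missing values (NOT the real data)
vec1 <- c(0, 1, NA, 0, 1, 0)
vec2 <- c(0, 1, 1, NA, 0, 0)

# Keep only the pairs where both values are observed
ok <- complete.cases(vec1, vec2)
v1 <- vec1[ok]
v2 <- vec2[ok]

# Cross-tabulate the complete pairs; the resulting 2x2 table of counts
# is all that a tetrachoric estimate depends on
tab <- table(v1, v2)
print(tab)
print(sum(tab))  # number of complete pairs
```

Checking sum(tab) against the N that Stata reports is a quick way to 
confirm that both programs are working from the same cases.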

Here's my data, as a 2x2 table of counts (632 complete cases):

522	54
34	22

> polychor(v1, v2, std.err=T, ML=T)

Polychoric Correlation, ML est. = 0.5172 (0.08048)
Test of bivariate normality: Chisquare = 8.063e-06, df = 0, p = NaN

    Row Thresholds
    Threshold Std.Err.
  1     1.349  0.07042


    Column Thresholds
    Threshold Std.Err.
  1     1.174  0.06458
  Warning message:
  NaNs produced in: pchisq(q, df, lower.tail, log.p)
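
As a cross-check that does not depend on polychor, the estimate can be 
approximated from the four counts alone using base R.  This is only a 
sketch, not polychor's own code: it assumes the usual bivariate-normal 
model, takes the thresholds from the marginal proportions, and solves for 
the correlation that reproduces the first cell.  The helper name 
biv_norm_cdf and the variable names are illustrative.  With a 2x2 table 
the model is saturated (df = 0 in the output above), which is why the ML 
and two-step estimates should essentially coincide.

```r
# 2x2 table of counts from this message (rows: first variable, columns: second)
tab <- matrix(c(522, 54,
                 34, 22), nrow = 2, byrow = TRUE)
n <- sum(tab)

# Thresholds implied by the marginal proportion in the first row / first column
row_thr <- qnorm(sum(tab[1, ]) / n)
col_thr <- qnorm(sum(tab[, 1]) / n)

# P(X < a, Y < b) for a standard bivariate normal with correlation rho,
# computed by integrating the conditional normal CDF over x
biv_norm_cdf <- function(a, b, rho) {
  integrand <- function(x) dnorm(x) * pnorm((b - rho * x) / sqrt(1 - rho^2))
  integrate(integrand, lower = -Inf, upper = a)$value
}

# Find the rho whose model probability for cell (1,1) equals the observed proportion
p11 <- tab[1, 1] / n
fit <- uniroot(function(rho) biv_norm_cdf(row_thr, col_thr, rho) - p11,
               interval = c(-0.99, 0.99), tol = 1e-8)
rho_hat <- fit$root

print(round(c(row_thr = row_thr, col_thr = col_thr, rho = rho_hat), 4))
```

If the assumptions hold, the thresholds and correlation printed here 
should be close to the polychor values above rather than to the Stata 
figure below.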

In Stata, I get:

. tetrachoric t1_v19a ct1_ix17

Tetrachoric correlations (N=632)

----------------------------------
     Variable |  t1_v19a  ct1_ix17
--------------+-------------------
      t1_v19a |        1
     ct1_ix17 |    .6169         1
----------------------------------

Thanks for your help.

Janet



Peter Dalgaard wrote:
> Janet Rosenbaum <jrosenba at rand.org> writes:
> 
>> I hope someone here knows the answer to this since it will save me from 
>> delving deep into documentation.
>>
>> Based on 22 pairs of vectors, I have noticed that tetrachoric 
>> correlation coefficients in Stata are almost uniformly higher than those 
>> in R, sometimes dramatically so (TCC=.61 in Stata, .51 in R;  .51 in 
>> Stata, .39 in R).  Stata's estimate is higher than R's in 20 out of 22 
>> computations, although the estimates always fall within the 95% CI for 
>> the TCC calculated by R.
>>
>> Do Stata and R calculate TCC in dramatically different ways?  Is the 
>> handling of missing data perhaps different?  Any thoughts?
>>
>> Btw, I am sending this question only to the R-help list.
> 
> 
> A bit more information seems necessary:
> 
> - tetrachoric correlations depend on 4 numbers, so you should be able
>   to give a direct example
> 
> - you're not telling us how you calculate the TCC in R. This is not
>   obvious (package polycor?).
> 

