[R] Help with SEM package - model significance

Tue Jul 22 03:40:14 CEST 2014

Dear Bernardo,

 This isn't really a suitable topic to pursue on the r-help list, so I'll just comment briefly:

On Mon, 21 Jul 2014 17:34:52 -0700
 Bernardo Santos <bernardo_brandaum at yahoo.com.br> wrote:
> Hi John,
> 
> Thanks for your reply (1 month later lol).
> In fact maybe the point is that I do not understand exactly the role of latent variables (what they are, and how to define them in R) in SEM.
> Can you suggest any accessible introductory literature on SEM that could help me with that?
> Most of what I have read is old (and uses different statistics programs) and offers examples too far from ecology, so I had some difficulty understanding the method in general.
> 

I have some materials at <http://socserv.mcmaster.ca/jfox/Courses/sem-goettingen/index.html> from a recent workshop on SEMs, including some reading and suggestions for reading, but not, I'm afraid, from ecology.

> But, as far as I understand, SEM is like a multiple regression (linear model) that takes into account the relationships among several variables simultaneously; is that right?
> 

In SEMs the response variable from one regression equation can be an explanatory variable in another, and the models can incorporate latent variables, which aren't measured directly, but rather indirectly through their observable effects ("indicators") or even in some cases through their observable causes.

I hope this helps,
 John

> Thank you very much.
> Best regards,
> 
> Bernardo
> 
> 
> 
> On Monday, 16 June 2014 8:40, John Fox <jfox at mcmaster.ca> wrote:
>  
> 
> 
> Dear Bernardo,
> 
> The df for the LR chisquare over-identification test come not from the number of observations, but from the difference between the number of observable variances and covariances, on the one hand, and free parameters to estimate, on the other. In your case, these numbers are equal, and so df = 0. The LR chisquare for a just-identified model is also necessarily 0: the model perfectly reproduces the covariational structure of the observed variables. 
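As a rough check of the counting described above, here is a minimal sketch in base R. The variable names (p, n_moments, n_free, df) are illustrative and do not come from the sem package; the counts are read off the model specification quoted further down (4 observed variables, 6 path coefficients, 4 variance parameters).

```r
# Observed variables in the model: estadio, compflora, compfauna, interacoesobs
p <- 4

# Distinct variances and covariances among p observed variables
n_moments <- p * (p + 1) / 2

# Free parameters: a1, a2, a3, b1, b2, c1 (paths) plus e1, e2, e3, e4 (variances)
n_free <- 6 + 4

# Degrees of freedom for the over-identification test
df <- n_moments - n_free
df
```

Note that the number of observations (48) does not enter this calculation at all.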
> 
> R (and most statistical software) by default writes very small and very large numbers in scientific format. In your case, -2.873188e-13 = -2.87*10^-13, that is, 0 within rounding error. You can change the way numbers are printed with the R scipen option.
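A short base-R sketch of the printing behaviour mentioned above. The value x is simply the chisquare copied from the quoted output, and the scipen setting of 999 is an arbitrary large penalty chosen for illustration.

```r
x <- -2.873188e-13            # the reported model chisquare

print(x)                      # printed in scientific notation by default

# Penalize scientific notation so that fixed notation is preferred
old <- options(scipen = 999)
print(x)
options(old)                  # restore the previous setting

# Format a single value without changing global options
format(x, scientific = FALSE)

# The value is zero within numerical tolerance
is_zero <- isTRUE(all.equal(x, 0))
is_zero
```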
> 
> Some other observations:
> 
> (1) Your model is recursive and has no latent variables; you would get the same estimates from OLS regression using lm().
> 
> (2) For quite some time now, the sem package has included specifyEquations() as a more convenient way of specifying a model, in preference to specifyModel(). See ?specifyEquations.
> 
> (3) You don't have to specify the error variances directly; specifyEquations(), or specifyModel(), will supply them.
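Point (1) can be illustrated without the sem package. The sketch below uses simulated data, since the original data are not shown: the seed, the generating coefficients, and the data frame name dat are all hypothetical. It fits one lm() per endogenous variable and then checks that the slopes of the last equation can be recovered from the covariance matrix alone, which is the only data input sem() received.

```r
set.seed(1)
n <- 48

# Simulated stand-ins for the four observed variables (hypothetical values)
estadio       <- rnorm(n)
compflora     <- 0.3 * estadio + rnorm(n)
compfauna     <- 0.2 * estadio + 0.25 * compflora + rnorm(n)
interacoesobs <- 0.3 * compflora + rnorm(n)
dat <- data.frame(estadio, compflora, compfauna, interacoesobs)

# One OLS regression per endogenous variable of the recursive model
fit_flora <- lm(compflora ~ estadio, data = dat)
fit_fauna <- lm(compfauna ~ estadio + compflora, data = dat)
fit_inter <- lm(interacoesobs ~ estadio + compflora + compfauna, data = dat)

# The same slopes computed from the covariance matrix only
S     <- cov(dat)
preds <- c("estadio", "compflora", "compfauna")
b_cov <- solve(S[preds, preds], S[preds, "interacoesobs"])
b_ols <- coef(fit_inter)[preds]

same <- isTRUE(all.equal(unname(b_ols), unname(b_cov)))
same
```

The slopes from fit_flora, fit_fauna and fit_inter play the roles of a1; a2 and b1; and a3, b2 and c1, respectively.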
> 
> I hope this helps,
> John
> 
> 
> ------------------------------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>     
>     
> On Sun, 15 Jun 2014 20:15:31 -0700 (PDT)
> Bernardo Santos <bernardo_brandaum at yahoo.com.br> wrote:
> > Dear all, 
> > 
> > I used the sem() function from the sem package to fit a model. However, I cannot tell whether the model is consistent with the data (chisquare test).
> > I used the commands:
> > 
> > model1 <- specifyModel()
> > estadio -> compflora, a1, NA
> > estadio -> compfauna, a2, NA
> > estadio -> interacoesobs, a3, NA
> > compflora -> compfauna, b1, NA
> > compflora -> interacoesobs, b2, NA
> > compfauna -> interacoesobs, c1, NA
> > estadio <-> estadio, e1, NA
> > compflora <-> compflora, e2, NA
> > compfauna <-> compfauna, e3, NA
> > interacoesobs <-> interacoesobs, e4, NA
> > 
> > sem1 <- sem(model1, cov.matrix, length(samples))
> > summary(sem1)
> > 
> > and I got the result:
> > 
> > Model Chisquare =  -2.873188e-13   Df =  0 Pr(>Chisq) = NA
> > AIC =  20
> > BIC =  -2.873188e-13
> > 
> > Normalized Residuals
> >      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
> > 0.000e+00 0.000e+00 2.957e-16 3.193e-16 5.044e-16 8.141e-16
> > 
> > R-square for Endogenous Variables
> >     compflora     compfauna interacoesobs
> >        0.0657        0.1056        0.2319
> > 
> > Parameter Estimates
> >    Estimate     Std Error    z value    Pr(>|z|)
> > a1 3.027344e-01 1.665395e-01 1.81779316 6.909575e-02 compflora <--- estadio
> > a2 2.189427e-01 1.767404e-01 1.23878105 2.154266e-01 compfauna <--- estadio
> > a3 7.314192e-03 1.063613e-01 0.06876742 9.451748e-01 interacoesobs <--- estadio
> > b1 2.422906e-01 1.496290e-01 1.61927587 1.053879e-01 compfauna <--- compflora
> > b2 3.029933e-01 9.104901e-02 3.32780446 8.753328e-04 interacoesobs <--- compflora
> > c1 4.863368e-02 8.638177e-02 0.56300857 5.734290e-01 interacoesobs <--- compfauna
> > e1 6.918133e+04 1.427102e+04 4.84767986 1.249138e-06 estadio <--> estadio
> > e2 9.018230e+04 1.860319e+04 4.84767986 1.249138e-06 compflora <--> compflora
> > e3 9.489661e+04 1.957568e+04 4.84767986 1.249138e-06 compfauna <--> compfauna
> > e4 3.328072e+04 6.865289e+03 4.84767986 1.249138e-06 interacoesobs <--> interacoesobs
> > 
> > Iterations =  0
> > 
> > I understand the results, but I do not know how to interpret the first line that tells me about the model:
> > Model Chisquare =  -2.873188e-13   Df =  0 Pr(>Chisq) = NA
> > 
> > How can Df be zero if the number of observations I used in the sem function was 48 and I have only 4 variables? What is the p-value?
> > 
> > Thanks in advance.
> > Bernardo Niebuhr
> > 

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/