[R] Nested variables

Mark Difford mark_difford at yahoo.co.uk
Wed Jun 3 16:53:23 CEST 2009


Hi Daniel,

>> It now only displays Habitat and not Site which it take is because Site
>> has been nested within 
>> Habitat (?).

No. Start with a simple model and look at the coefficients. That way you
will be able to match the model you want to its symbolic representation.
Then built it up (do you have to?).

## Toy variations
zeroinfl(Visitation ~ Habitat/Site, data = data1)               ## nests
Site in Habitat
zeroinfl(Visitation ~ Habitat/Site - 1, data = data1)          ## same, but
more common spec
zeroinfl(Visitation ~ Habitat/Site - 1 - Habitat, data = data1)  ## just the
nesting
zeroinfl(Visitation ~ Habitat:Site - 1, data = data1)          ## same as
directly above
zeroinfl(Visitation ~ Habitat:Site, data = data1)
zeroinfl(Visitation ~ Sps + Habitat/Site - 1, data = data1) ## add a main
effect

Do you know what contrasts are and which contrasts you are using? This
effectively reduces to, "Do you know how to interpret the output from
summary()?"

You are fitting a monstrously complicated model. Have you looked at some of
these presumed relationships graphically? I would strongly advice you to
simplify: use a tool like ctree (party), mvpart/rpart (mvpart), earth/mars,
randomForest, or something else to discover what is important.

Regards, Mark.


DanielWC wrote:
> 
>>>What happens if you do something like the following? 
> 
>>>zeroinfl(Visitation ~ Sps*Habitat/Site - Sps:Habitat:Site, data = data1) 
>>>zeroinfl(Visitation ~ Sps*Habitat/Site - Sps - Sps:Habitat:Site, data =
data1) 
> 
> It seems to be working. Heres the model:
> 
> nes3<-zeroinfl(Visitation ~ Sps + TP + Resource + Habitat/Site +
> Sps*Habitat/Site - Sps:Habitat:Site - Habitat*Site + Sps*Resource + Sps*TP
> + TP*Resource + TP*Habitat/Site - TP:Habitat:Site | TP + Sps + Resource +
> Habitat/Site + TP*Resource + TP*Habitat/Site - TP:Habitat:Site -
> Habitat*Site, data = data1)
> 
> and the summary(nes3):
> 
> Count model coefficients (poisson with log link):
>                Estimate Std. Error z value Pr(>|z|)    
> (Intercept)   2.4589163  0.2145054  11.463  < 2e-16 ***
> Sps          -0.2404691  0.0749488  -3.208 0.001335 ** 
> TP           -0.3191130  0.0764331  -4.175 2.98e-05 ***
> Resource      0.0114463  0.0008878  12.892  < 2e-16 ***
> Habitat      -0.4589395  0.0709937  -6.465 1.02e-10 ***
> Sps:Habitat   0.2213486  0.0214113  10.338  < 2e-16 ***
> Sps:Resource -0.0040594  0.0002697 -15.049  < 2e-16 ***
> Sps:TP        0.0448839  0.0177702   2.526 0.011544 *  
> TP:Resource  -0.0009182  0.0002574  -3.567 0.000361 ***
> TP:Habitat    0.0531861  0.0188101   2.828 0.004691 ** 
> 
> Zero-inflation model coefficients (binomial with logit link):
>               Estimate Std. Error z value Pr(>|z|)    
> (Intercept) -1.2634867  0.2990362  -4.225 2.39e-05 ***
> TP           0.5649008  0.1101841   5.127 2.95e-07 ***
> Sps          0.6861959  0.0553519  12.397  < 2e-16 ***
> Resource    -0.0111935  0.0025538  -4.383 1.17e-05 ***
> TP:Resource  0.0015815  0.0008029   1.970   0.0489 *  
> TP:Habitat  -0.1064294  0.0269142  -3.954 7.67e-05 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 
> Number of iterations in BFGS optimization: 29 
> Log-likelihood: -1237 on 16 Df
> 
> 
> It now only displays Habitat and not Site which it take is because Site
> has been nested within Habitat (?).
> 
> Thanks for the help.
> 
> 
> 
> Mark Difford wrote:
>> 
>> Hi Daniel,
>> 
>>>> I read the chapter and it says that I can nest B in A by writing A/B.
>>>> But it doesnt seem to 
>>>> work properly.
>> 
>> Then you need to read it again, and you need to play with some simple
>> models, so that you learn how to specify a model properly, using the
>> formula interface. (Achim Zeileis wrote the pscl function, which uses the
>> formula interface, so it almost certainly works as it should.)
>> 
>> It also is not at all clear what model you are trying to fit, since I
>> only have an R-specified model to go on, and your concept of what that
>> means is different from mine (and R's).
>> 
>>>> If I write Sps*Habitat/Site R somehow thinks that its a 3-way
>>>> interaction link and still 
>>>> includes habitat as a single variable.
>> 
>> It will, because that is what you are specifying.
>> 
>> Reread this (from §11.1):
>> y ~ A*B
>> y ~ A + B + A:B
>> y ~ B %in% A
>> y ~ A/B
>> Two factor non-additive model of y on A and B. The first two specify the
>> same crossed classification and the second two specify the same nested
>> classification. In abstract terms all four specify the same model
>> subspace.
>> 
>> And this (from §11.1):
>> y ~ (A + B + C)^2
>> y ~ A*B*C - A:B:C
>> Three factor experiment but with a model containing main effects and two
>> factor interactions only. Both formulae specify the same model.
>> 
>> What happens if you do something like the following?
>> 
>> zeroinfl(Visitation ~ Sps*Habitat/Site - Sps:Habitat:Site, data = data1)
>> zeroinfl(Visitation ~ Sps*Habitat/Site - Sps - Sps:Habitat:Site, data =
>> data1)
>> 
>> Regards,
>> Mark.
>> 
>> 
>> DanielWC wrote:
>>> 
>>> Hi again
>>> I read the chapter and it says that I can nest B in A by writing A/B.
>>> But it doesnt seem to work properly. The package I use is pscl, I am
>>> trying to model zero inflated data. Here is the full model:
>>> 
>>> zip1<-zeroinfl(Visitation ~ Habitat/Site + Sps + TP + Resource +
>>> Sps*Habitat/Site + Sps*Resource + Sps*TP + TP*Resource + TP*Habitat/Site
>>> | TP + Sps + Resource + Habitat/Site + TP*Sps + Sps*Resource +
>>> TP*Resource + TP*Habitat/Site, data = data1)
>>> 
>>> where as you can see I have tried to include Habitat/Site. But it doesnt
>>> seem like its possible to create interaction links that way. I would
>>> like create the interaction link between say Sps and Habitat/Site. If I
>>> write Sps*Habitat/Site R somehow thinks that its a 3-way interaction
>>> link and still includes habitat as a single variable. Here is the
>>> summary just so you can see:
>>> 
>>> Count model coefficients (poisson with log link):
>>>                    Estimate Std. Error z value Pr(>|z|)    
>>> (Intercept)       3.8643338  0.3450539  11.199  < 2e-16 ***
>>> Habitat          -3.2757473  0.4871726  -6.724 1.77e-11 ***
>>> Sps              -0.3332371  0.1117680  -2.982  0.00287 ** 
>>> TP               -0.3857513  0.0975556  -3.954 7.68e-05 ***
>>> Resource          0.0130935  0.0009545  13.717  < 2e-16 ***
>>> Habitat:Site      0.2487012  0.0415908   5.980 2.24e-09 ***
>>> Habitat:Sps       0.5432845  0.1230434   4.415 1.01e-05 ***
>>> Sps:Resource     -0.0041420  0.0002632 -15.738  < 2e-16 ***
>>> Sps:TP            0.0438085  0.0189411   2.313  0.02073 *  
>>> TP:Resource      -0.0009413  0.0002887  -3.260  0.00111 ** 
>>> Habitat:TP        0.1549435  0.1205884   1.285  0.19883    
>>> Habitat:Site:Sps -0.0305628  0.0107794  -2.835  0.00458 ** 
>>> Habitat:Site:TP  -0.0096408  0.0098497  -0.979  0.32768    
>>> 
>>> Zero-inflation model coefficients (binomial with logit link):
>>>                   Estimate Std. Error z value Pr(>|z|)    
>>> (Intercept)     -1.3160850  0.8200631  -1.605   0.1085    
>>> TP               0.2808474  0.2657544   1.057   0.2906    
>>> Sps              0.8023648  0.1281520   6.261 3.82e-10 ***
>>> Resource        -0.0075967  0.0035817  -2.121   0.0339 *  
>>> Habitat         -1.3122811  1.1348922  -1.156   0.2476    
>>> Habitat:Site     0.1390498  0.0984716   1.412   0.1579    
>>> TP:Sps          -0.0189088  0.0406299  -0.465   0.6417    
>>> Sps:Resource    -0.0005548  0.0006843  -0.811   0.4175    
>>> TP:Resource      0.0007100  0.0008454   0.840   0.4010    
>>> TP:Habitat       0.7195254  0.3666809   1.962   0.0497 *  
>>> TP:Habitat:Site -0.0782610  0.0314229  -2.491   0.0128 *
>>> 
>>> Thanks again
>>> 
>>> 
>>> Mark Difford wrote:
>>>> 
>>>> Daniel,
>>>> 
>>>>>> Yes I am trying to model such data, and i need R to know that Site is
>>>>>> nested within Habitat. 
>>>>>> Do I use some kind of command before running the model (like factor()
>>>>>> and so on) or do i 
>>>>>> write it in the model formula. If so, how?
>>>> 
>>>> You still are not telling the list enough, since nesting specifications
>>>> are sometimes package/program specific. You could help yourself by
>>>> reading some of R's basic documentation about how to specify a nested
>>>> model. Then "it" will tell you what to do.
>>>> 
>>>> ?formula
>>>> 
>>>> Or read §11.1 of "An Introduction to R" (one of R's manuals).
>>>> 
>>>> Regards, Mark.
>>>> 
>>>> 
>>>> DanielWC wrote:
>>>>> 
>>>>> Hi. 
>>>>> Yes I am trying to model such data, and i need R to know that Site is
>>>>> nested within Habitat. Do I use some kind of command before running
>>>>> the model (like factor() and so on) or do i write it in the model
>>>>> formula. If so, how?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>> 
>>>>> Douglas Bates-2 wrote:
>>>>>> 
>>>>>> On Fri, May 29, 2009 at 7:50 AM, DanielWC
>>>>>> <daniel.carstensen at gmail.com> wrote:
>>>>>> 
>>>>>>> Hello
>>>>>>> I am working with a biological data including variables called
>>>>>>> Habitat and
>>>>>>> Site, example:
>>>>>> 
>>>>>>> Habitat     Site
>>>>>> 
>>>>>>> Forest      Low
>>>>>>> Forest      Low
>>>>>>> Forest      High
>>>>>>> Forest      High
>>>>>> 
>>>>>>> I want to tell R that the Site variable is nested within the Forest
>>>>>>> variable
>>>>>>> (that it is not a new variable).
>>>>>>> Does anyone know how to do this?
>>>>>> 
>>>>>> If you could explain what you mean by "tell R", it would help.  Are
>>>>>> you trying to model such data and you want to know how to express the
>>>>>> relationship between such factors in a model formula?
>>>>>> 
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Nested-variables-tp23779402p23853046.html
Sent from the R help mailing list archive at Nabble.com.




More information about the R-help mailing list