[R] Predicting hurdle model results on spatial scale
valentina.lauria at nuigalway.ie
Mon Oct 21 18:28:47 CEST 2013
I apologise in advance for all my questions.
I would like to predict the habitat selection of fish species using a hurdle model. I know that I can do this in R with the function predict.hurdle() on newdata; however, how this works is not entirely clear to me.
Usually, with a two-step approach, a binary model and a Poisson model are fitted to deal with zero-inflated and over-dispersed data, and the predictions of the binary model are then multiplied by those of the Poisson model in order to weight them. Is this already included in the predict.hurdle function?
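For reference, the mean of a hurdle model is the product of the probability of crossing the hurdle and the mean of the zero-truncated count distribution, so the weighting described above is what a response-scale prediction amounts to. The sketch below checks this numerically with the prediction types offered by pscl. It uses the bioChemists data shipped with pscl as a stand-in (an assumption, since the fish data are not available here), and newdata is a hypothetical prediction grid. According to the pscl documentation, type = "zero" is the ratio P(y > 0 under the zero part) / P(y > 0 under the untruncated count part), not a plain presence probability.

```r
library(pscl)

# Stand-in data (assumption): bioChemists ships with pscl.
data("bioChemists", package = "pscl")

# Default hurdle: binomial (logit) zero part, truncated Poisson count part.
fm <- hurdle(art ~ fem + mar + kid5 + phd + ment, data = bioChemists)

# Hypothetical prediction grid; here simply the first rows of the data.
newdata <- head(bioChemists, 10)

mean_full  <- predict(fm, newdata = newdata, type = "response")
mean_count <- predict(fm, newdata = newdata, type = "count")  # untruncated count mean
ratio_zero <- predict(fm, newdata = newdata, type = "zero")   # hurdle probability ratio

# The overall mean should equal the product of the two components.
all.equal(mean_full, mean_count * ratio_zero)

# Recover the presence probability P(y > 0); this step is Poisson-specific.
p_presence <- ratio_zero * (1 - exp(-mean_count))
```

If this holds for a given fit, predict(fm, newdata, type = "response") already contains the combined prediction and no further multiplication is needed.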
Also, I am using the function dredge (from the MuMIn package) to select my best model based on AIC. In this case the best model selected seems to be a combination of the truncated Poisson and the binary model (i.e., the hurdle model as a whole). Is there any way that I could dredge the two model components separately? I did some research, and in the NEWS section I found that a package pscf was created for this, but when I did more digging around I did not have much luck.
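One possible route, sketched under assumptions: the hurdle likelihood is separable (see the countreg vignette in pscl), so the zero part can be fitted on its own as a binomial GLM on presence/absence and selected independently. The example below uses the bioChemists data from pscl as a stand-in, and the names dat, present, zero_glm and zero_sel are hypothetical. The dredge() call is only attempted if MuMIn is installed. The count part would need a zero-truncated count fit on the positive observations, which is not shown here.

```r
library(pscl)
data("bioChemists", package = "pscl")  # stand-in data (assumption)

fm <- hurdle(art ~ fem + mar + kid5 + phd + ment, data = bioChemists)

# Zero part fitted on its own: presence/absence as a binomial GLM.
dat <- bioChemists
dat$present <- as.integer(dat$art > 0)
zero_glm <- glm(present ~ fem + mar + kid5 + phd + ment,
                family = binomial, data = dat, na.action = na.fail)

# Coefficients should agree up to optimizer tolerance.
cbind(hurdle = coef(fm, model = "zero"), glm = coef(zero_glm))

# Optional: all-subsets selection on the zero part only, ranked by AIC.
if (requireNamespace("MuMIn", quietly = TRUE)) {
  zero_sel <- MuMIn::dredge(zero_glm, rank = "AIC")
  print(head(zero_sel))
}
```

dredge() requires the global model to be fitted with na.action = na.fail, which is why it is set explicitly in the glm() call.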
I would be grateful if someone could help me.
Thank you very much once again,
From: Achim Zeileis [mailto:Achim.Zeileis at uibk.ac.at]
Sent: 18 October 2013 18:57
To: Lauria, Valentina
Cc: r-help at r-project.org
Subject: Re: [R] hurdle model error why does need integer values for the dependent variable?
On Fri, 18 Oct 2013, Lauria, Valentina wrote:
> Dear list,
> I am using the hurdle model for modelling the habitat of rare fish
> species. However I do get an error message when I try to model my data:
>> test_new1<-hurdle(GALUMEL~ depth + sal + slope + vrm + lat:long +
>> offset(log(haul_numb)), dist = "negbin", data = datafit_elasmo)
> Error in hurdle(GALUMEL ~ depth + sal + slope + vrm + lat:long + offset(log(haul_numb)), :
> invalid dependent variable, non-integer values
> When I fit the same model with round(my dependent variable), the
> model works. Sorry for the stupid question, but could anyone explain
> to me why? My data are zero-inflated (78% zeros) and positively skewed.
hurdle() fits a count data distribution (poisson, negbin, geometric) by maximum likelihood. Hence, its response needs to be a count variable (i.e., integer). See vignette("countreg", package = "pscl") for the underlying likelihoods employed.
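A quick check along these lines can show whether a response qualifies as a count before fitting. This is a minimal sketch in base R: is_count() is a hypothetical helper, and the catch and hauls vectors are made-up numbers, not the poster's data.

```r
# Hypothetical helper: TRUE if y looks like count data
# (finite, non-negative, whole numbers).
is_count <- function(y) {
  all(is.finite(y)) && all(y >= 0) && isTRUE(all.equal(y, round(y)))
}

# Made-up example: total catch over several hauls per station.
catch <- c(0, 0, 3, 1, 0, 7)   # raw counts
hauls <- c(2, 1, 2, 1, 3, 2)   # sampling effort
rate  <- catch / hauls         # catch per haul: 0, 0, 1.5, 1, 0, 3.5

is_count(rate)   # FALSE: standardized values are not integers
is_count(catch)  # TRUE: raw counts are fine
```

If the non-integer values come from standardizing by effort (an assumption; the thread does not say), modelling the raw counts with offset(log(haul_numb)) may be preferable to rounding the standardized values.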
> Thank you very much in advance.
> Kind Regards,