[R] Outliers and overdispersion

Tue Aug 13 17:41:10 CEST 2013

I do not know exactly what you are estimating, but if this is about count models and the fit improves when you drop the outliers, that does not mean the model is now more correct. It only says that this model would fit well if the data did not contain the outliers.
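To illustrate why a better AIC after dropping observations says little by itself, here is a minimal sketch in plain Python. The counts are made up for illustration, and an intercept-only Poisson model stands in for whatever model is actually being fitted (both are assumptions, not taken from the data in question). In a count model every observation contributes a non-positive log-probability, so removing observations can only raise the maximized log-likelihood, and the AIC values of fits to different data sets are not comparable.

```python
import math

def poisson_loglik(counts, rate):
    """Sum of Poisson log-probabilities of the counts at the given rate."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)

def poisson_aic(counts):
    """AIC of an intercept-only Poisson model fitted by maximum likelihood."""
    rate = sum(counts) / len(counts)  # the MLE of the Poisson rate is the sample mean
    return 2 * 1 - 2 * poisson_loglik(counts, rate)  # one estimated parameter

# Hypothetical bird counts; the two large values play the role of the outliers.
counts = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 25, 40]
trimmed = [c for c in counts if c < 20]  # same data with the outliers dropped

aic_full = poisson_aic(counts)
aic_trimmed = poisson_aic(trimmed)
print(f"AIC, all counts:       {aic_full:.1f}")
print(f"AIC, outliers dropped: {aic_trimmed:.1f}")
```

The second AIC is lower, but it describes a different data set, so the drop is not evidence that the model has become more correct.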

Overdispersion in count data is sometimes a sign that the generating process is a mixture distribution: for example, the counts may aggregate K different (sub)species of birds instead of one. In that case a mixture of K negative binomial distributions could fit the data better.
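As a rough sketch of why aggregation produces overdispersion, the snippet below computes the mean and variance of a mixture of K = 3 Poisson components. Poisson components are used here only to keep the arithmetic simple (the mixture suggested above has negative binomial components), and the weights and rates are invented for illustration. Each component alone has variance equal to its mean, yet the aggregated counts have variance well above the mean as soon as the rates differ.

```python
def mixture_mean_var(weights, rates):
    """Mean and variance of a mixture of Poisson distributions.

    weights: mixing proportions of the components (must sum to 1)
    rates:   Poisson rate of each component
    """
    mean = sum(w * r for w, r in zip(weights, rates))
    # For a Poisson variable with rate r, E[Y^2] = r + r^2.
    second_moment = sum(w * (r + r * r) for w, r in zip(weights, rates))
    return mean, second_moment - mean ** 2

# Hypothetical example: three (sub)species with different typical counts.
weights = [0.5, 0.3, 0.2]
rates = [1.0, 5.0, 20.0]

mean, var = mixture_mean_var(weights, rates)
dispersion = var / mean  # equals 1 for a single Poisson, larger when rates differ
print(f"mean = {mean:.2f}, variance = {var:.2f}, variance/mean = {dispersion:.2f}")
```

If all components share the same rate, the ratio falls back to 1, which is one way to see that the extra variance comes from the differences between the components.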

Best

Simon

On Aug 13, 2013, at 5:28 PM, Marta Lomas <lomasvega at hotmail.com> wrote:

> 
> Hi again,
> 
> I have a question about some outliers in my response variable (which are bird counts). At first I did not drop them
> because they are part of the normal counts and I considered them "ecologically" correct.
> 
> However, I tried some of the same models without the outliers and the AICs are better. I
> also get nice significances this way...
> 
> 
> So would you say that, even though the outliers are valid
> observations, and given that the negative binomial
> distribution I am using already accounts for some of the overdispersion due to the outliers, it is
> better to drop them because the models fit better this way?
> 
> 
> Thanks for your patience!
> 
> :)