# [R] distributions and glm

drbn drbn at yahoo.com
Tue Oct 21 20:22:43 CEST 2008

```
Thanks for your clear response, Ruben.

I'm trying to fit a distribution to each year's data because my data are
truncated at fixed points on the right and on the left (each year). By
fitting a truncated distribution I think I will be able to get an unbiased
estimate of the yearly mean.

But, as you said, each mean comes from a different distribution. As I'm not
able to write my own glm, I'd like to know whether an alternative exists.
Is it possible, for instance, to model the data of each year with
a nonlinear function and estimate the mean and other parameters from this
function? Would these means be more appropriate to use in a glm?

David

Ruben Roa Ureta wrote:
>
> drbn wrote:
>> Hello,
>> I have seen that some papers do this:
>>
>> 1.) Group data by year (e.g. 35 years)
>>
>> 2.) Estimate the mean of the key variable through the distribution that
>> fits best (in some years a normal distribution, in others a more skewed
>> gamma distribution, etc.)
>>
>> 3.) Fit a GLM to these estimated yearly means.
>>
>> I'd like to know whether it is valid to use these means in a GLM or
>> whether it is a wrong idea.
>>
>>
>> David
>>
> David,
> You can model functions of data, such as means, but you must be careful
> to carry over most of the uncertainty in the original data into the
> model. If you don't, for example if you let the model know only the
> values of the means, then you are actually assuming that these means
> were observed with absolute certainty instead of being estimated from
> the data. To carry over the uncertainty in the original data to your
> modeling you can use a Bayesian approach or you can use a marginal
> likelihood approach. A marginal likelihood is a true likelihood function
> not of the data, but of functions of the data, such as of maximum
> likelihood estimates. If your means per year were estimated using
> maximum likelihood (for example with fitdistr in package MASS) and your
> sample size is not too small then you can use a normal marginal
> likelihood model for the means. Note however that each mean may come
> from a different distribution so the full likelihood model for your data
> would be a mixture of normal distributions. You may not be able to use
> the pre-built glm function, so you may face the challenge of writing your
> own code.
> HTH
> Rubén
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help