[R] Constrained Poisson model / Bayesian Poisson model
dwinsemius at comcast.net
Sat Jan 23 18:40:33 CET 2016
> On Jan 23, 2016, at 5:24 AM, mara.pfleiderer at uni-ulm.de wrote:
> Hi David,
> I'm sorry. I'm not familiar with posting problems on helppages.
> As the data I deal with is confidential I can't provide all details,
> but I will try to be as precise as possible about my problem:
> As I said I'm working on a Poisson regression model with a linear
> predictor and the identity as link function.
Then that is not what most people would call a "Poisson regression model".
In particular: I have data which consists of observations of two random variables X and Y over the period of about 3 months. I have clustered this data into hours, so I received 2088 observations of those variables and each observation represents the values of the variables in one hour. Now I assume Y to be Poisson distributed (Y_i~Poi(lambda_i)) and that the values of Y come from two different impacts, so I split Y_i into two Poisson-distributed variables Y_i1 and Y_i2. I assume that the values of X have a long-term effect (of at least one day) on the part Y_i1 and I have estimators for the parameters lambda_i2, but I have no exact values of the variables of Y_i1 and Y_i2 but only of Y_i as a whole.
> So my model looks as follows:
> fml=Y_i1 ~ b_1*X_i+....+b_n*X_(i-n+1) - lambda_i2 -1 with n>=24
> That means that the last 24 (or more) values of X influence the value of Y_i1 in an additive way and I have no intercept(b_0=0).
> Now I made a matrix whose rows represent Y_i, all of its 24 (or more) regressors for each observation of Y_i and the corresponding estimator lambda_i2.
> Then I used glm(fml, family=poisson(link="identity"), data=matrix) and tried it for different values of n (=24,48,36,...).
I worry that might not construct the model you described above. The use of `poisson` constructs Poisson distributed _errors_ with an additive link, and to my understanding does _not_ assume Poisson distributed Y values.
> But always some of the coefficients received negative values which doesnt't make sense in interpretation. (The values of X represent certain events which can only have a positive or none effect on the value of Y.)
> Now I want to use the constraint b_i >=0 in my model, but I don't know how I can do this.
> I also thought of analyzing this model in a Bayesian way, but yet I haven't found a Bayesian version of glm() for the Poisson distribution such that I can specify the prior for the b_i on my own. (Then I could include the positivity in the prior.) Do you have a hint for me?
The R blogoshere has been heralding the arrival of the `rstanarm` package which provides a glm-like interface to an MCMC (Bayesian) estimation engine. Version 2.90 is on CRAN. The list of authors in the description file is impressive:
Authors at R: c(person("Jonah", "Gabry", email = "jsg2201 at columbia.edu", role = "aut"),
person("Trustees of", "Columbia University", role = "cph"),
person("R Core", "Deveopment Team", role = "cph",
comment = "R/pp_data.R, R/stan_aov.R"),
person("Douglas", "Bates", role = "cph", comment = "R/pp_data.R"),
person("Martin", "Maechler", role = "cph", comment = "R/pp_data.R"),
person("Ben", "Bolker", role = "cph", comment = "R/pp_data.R"),
person("Steve", "Walker", role = "cph", comment = "R/pp_data.R"),
person("Brian", "Ripley", role = "cph",
comment = "R/stan_aov.R, R/stan_polr.R"),
person("William", "Venables", role = "cph", comment = "R/stan_polr.R"),
person("Ben", "Goodrich", email = "benjamin.goodrich at columbia.edu",
role = c("cre", "aut")))
> I'm sorry if I went too much into detail now...
> I hope you understand my point now and have some answers for me!
> Zitat von David Winsemius <dwinsemius at comcast.net>:
>>> On Jan 22, 2016, at 7:01 AM, mara.pfleiderer at uni-ulm.de wrote:
>>> Hi all,
>>> I am dealing with a problem about my linear Poisson regression model (link function=identity).
>>> I am using the glm()-function which results in negative coefficients, but a negative influence of the regressors wouldn't make sense.
>> Negative coefficients merely indicate a lower relative rate. You need to be more specific about the exactly data and model output before you can raise our concern to a level where further comment can be made.
>>> (i) Is there a possibility to set constraints on the regression parameters in glm() such that all coefficients are positive? Or is there another function in R for which this is possible?
>>> (ii) Is there a Bayesian version of the glm()-function where I can specify the prior distribution for my regression parameters? (e.g. a Dirichlet prior s.t. the parameters are positive)
>>> All this with respect to the linear Poisson model...
>> As I implied above, the word "linear" means something different than "additive" when the link is log().
>> David Winsemius
>> Alameda, CA, USA
Alameda, CA, USA
More information about the R-help