[R] Testing for significant differences between groups in multiple linear regression

Bert Gunter gunter.berton at gene.com
Fri Jan 23 18:43:42 CET 2015


Look no further!  The answer is yes.

However,  if you are interested in why your query is probably nonsense
and why overall tests of significance are a **really bad idea** in
most scientific contexts (imho, anyway), then I suggest you post to a
statistical list like stats.stackexchange.com .

... oh, and while you're at it, please read the posting guide for this
list (see link below) and, in particular, DO NOT POST IN HTML, which,
as you can see here, often becomes a mess on this **plain text**
mailing list.

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Fri, Jan 23, 2015 at 1:46 AM, Janka Vanschoenwinkel
<janka.vanschoenwinkel at uhasselt.be> wrote:
> Dear R-colleagues,
>
> I am looking for a way to test whether one regression has significantly
> different coefficients and overall results across 10 groups (the grouping
> variable is "irr").
>
>
>
> *What I have*
>
> The regression is:
>
> Depend = temp + temp² + perc + perc² + conti -> split up for multiple groups
> of irr
>
>
> *Dataset = Alldata (real dataset has over 50000 IDs)*
>
>   ID  irr  temp  perc  conti  Depend    w
>    1    1    10    34     26       8   23
>    2    1    11    36     27       6   58
>    3    1    26    57     45       3   76
>    4    2    23    68     24       2    4
>    5    2     6    26      8       1  323
>    6    2     3    17     56       6   45
>    7    3    17    39     17       5   57
>
>   (irr = grouping variable)
>
>
>
> I can obtain the different regression coefficients for the different groups
> with the following code (other code is possible as well).
>
>
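> datairrigation <- split(Alldata, Alldata$irr)
>
> # Fit the same weighted regression within each irr group
> model.per.irrigation <- lapply(datairrigation, function(x) {
>   # Squared terms must be wrapped in I() inside a formula
>   lm(Depend ~ temp + I(temp^2) + perc + I(perc^2) + conti,
>      weights = w, data = x)
> })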
>
>
> OR I can do it manually by splitting all the data into subsets (and then I
> also receive the R²…)
>
>
>
> *What I don’t have*
>
> However, now I don’t know how to compare those regressions to test whether
> they differ significantly over all the groups.
>
> (Preferably, I would like to test the coefficients individually (temp(group
> 1) = temp(group2)) and the regression as a whole between the groups.)
>
>
>
> *Note*
>
> I know that one way to test for significant differences between groups is
> to include dummy variables for the groups in the regression. Yet this is not
> an option for my model, because it only allows exogenous variables in the
> regression (and irrigation is an endogenous variable because the farmer can
> decide himself whether to irrigate or not).
>
>
>
> Thank you very much in advance! I really appreciate your help!
>
>
> Janka
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
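
For readers of the archive: a minimal sketch of one common way to run the
comparison asked about above, assuming a pooled model with group interactions
is acceptable. The original poster states it is not, because irr is
endogenous, so this only illustrates the mechanics and does not address that
concern or the reservations about overall significance tests raised in the
reply. The data are simulated; only the column names are taken from the
question. The names fit_common and fit_by_group are made up for this example.

```r
# Simulated stand-in for Alldata: same column names as in the question,
# but the values (and the choice of 3 groups) are invented for illustration.
set.seed(1)
n <- 300
Alldata <- data.frame(
  ID    = seq_len(n),
  irr   = factor(rep(1:3, each = n / 3)),
  temp  = runif(n, 0, 30),
  perc  = runif(n, 10, 70),
  conti = runif(n, 0, 60),
  w     = runif(n, 1, 100)
)
# Assumed data-generating process: the temp slope differs between groups.
slope <- c(1, 2, 3)[as.integer(Alldata$irr)]
Alldata$Depend <- 5 + slope * Alldata$temp + 0.1 * Alldata$perc + rnorm(n)

# Restricted model: one set of coefficients shared by all groups.
fit_common <- lm(Depend ~ temp + I(temp^2) + perc + I(perc^2) + conti,
                 weights = w, data = Alldata)

# Full model: intercept and every slope are allowed to differ by irr.
fit_by_group <- lm(Depend ~ irr * (temp + I(temp^2) + perc + I(perc^2) + conti),
                   weights = w, data = Alldata)

# Overall F test: do the groups share a single regression?
overall <- anova(fit_common, fit_by_group)
print(overall)

# Per-coefficient view: each irrK:temp row tests whether the temp slope
# in group K differs from the temp slope in the reference group (irr = 1).
coefs <- summary(fit_by_group)$coefficients
print(coefs[grep("^irr[0-9]+:temp$", rownames(coefs)), , drop = FALSE])
```

The full model reproduces the per-group coefficient estimates obtained by
splitting the data, but it pools the residual variance across groups, which
the separate fits do not. With 10 groups the interaction rows give only
comparisons against the reference level; other pairs would need a relevelled
factor or explicit contrasts.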


