# [R] Same regression per sub-group: apply?

Romain Francois rfrancois at mango-solutions.com
Fri Dec 7 10:16:59 CET 2007

What about ?by, something like this (still untested):

```
model.per.country <- by(data, data$COUNTRY, function(x) {
  glm(dependent.var ~ FEMALE + AGE + EDUCLIN, family = binomial, data = x)
})
```

Or

```
# Name the formula so that each per-country subset is matched to glm's `data` argument
model.per.country <- by(data, data$COUNTRY, glm,
  formula = dependent.var ~ FEMALE + AGE + EDUCLIN, family = binomial)
```

Dimitris Rizopoulos wrote:
> try something like this (untested):
>
> dataCountry <- split(data, data$COUNTRY)
> model.per.country <- lapply(dataCountry, function(x) {
>     glm(dependent.var ~ FEMALE + AGE + EDUCLIN, family = binomial, data = x)
> })
>> Dear helpers,
>>
>> I've come up with what is probably a simple problem, but I cannot
>> find the solution. I have a data-set containing survey-data from
>> several countries. What I want to do is to perform some regression
>> analyses, for each country separately. The question is, how to do
>> this nicely (thus without repeating the same syntax with another
>> `subset' argument).
>>
>> I thought of the following:
>>
>> model.per.country <- tapply(data, data$COUNTRY, function(x) glm
>> (dependent.var ~ FEMALE + AGE + EDUCLIN + (), family=binomial,
>> data=capital))
>>
>> But this does not work. What goes wrong, I think, is that the
>> dependent variable is clustered according to `Country', but not so
>> for the predictors. The error message I received:
>>
>> Error in tapply(dat, dat$COUNTRY, function(x) glm(participate ~
>> FEMALE +  :
>> arguments must have same length
>>
>>
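
For anyone reading this thread later, here is a small self-contained sketch of the split/lapply approach on simulated data. The data frame `survey`, its simulated values, and the helper `fit_one` are made up for illustration; only the column names come from the thread. It also shows the likely reason for the tapply error: tapply expects a vector as long as the index, while length() of a data frame is its number of columns.

```r
# Simulated stand-in for the survey data; all values are made up.
set.seed(1)
n <- 300
survey <- data.frame(
  COUNTRY = rep(c("A", "B", "C"), each = n / 3),
  FEMALE  = rbinom(n, 1, 0.5),
  AGE     = round(runif(n, 18, 80)),
  EDUCLIN = sample(1:5, n, replace = TRUE)
)
survey$participate <- rbinom(
  n, 1, plogis(-1 + 0.3 * survey$FEMALE + 0.01 * survey$AGE)
)

# length() of a data frame counts columns, not rows, so it does not
# match the length of the grouping vector that tapply() is given.
length(survey)          # 5 (columns)
length(survey$COUNTRY)  # 300 (rows)

# Hypothetical helper: fit the same logistic regression to one subset.
fit_one <- function(d) {
  glm(participate ~ FEMALE + AGE + EDUCLIN, family = binomial, data = d)
}

# One fitted model per country, returned as a named list.
models <- lapply(split(survey, survey$COUNTRY), fit_one)

# Collect the coefficients: one column per country.
coefs <- sapply(models, coef)
print(round(coefs, 3))
```

Individual fits are then available by name, for example `summary(models[["A"]])`. Replacing the `lapply(split(...))` line with `by(survey, survey$COUNTRY, fit_one)` should give the same fits in a `by` object.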
