[R] Using 'sapply' and 'by' in one function

Gabor Grothendieck ggrothendieck at gmail.com
Sun Feb 10 14:43:37 CET 2008


Because new (all 10 rows) is passed to fxa as its second argument, it
is never subsetted by group, so Pred has 10 values while each outcome
has only 5; hence the error.  Try this:

by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))

Actually, you can do the above without sapply as lm can take a matrix
for the dependent variable:

by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))
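As a rough check, the two calls above can be run side by side on data shaped like the example in the original post. The sketch below is only illustrative: the seed and the names res_sapply, res_matrix and res_fxa are assumptions added here, and the last variant (passing the subset x to fxa instead of new) is a small variation that is not part of the reply itself.

```r
# Illustrative data in the same shape as the original post; the seed is an
# assumption added only so that the run is reproducible.
set.seed(1)
new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10),
                  sex = rep(0:1, 5), Pred = rnorm(10))

# Approach 1: sapply over the two outcome columns inside each by() group.
# Both y and Pred come from the per-group subset x, so their lengths agree.
res_sapply <- by(new, new$sex, function(x)
  sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))

# Approach 2: a matrix response lets a single lm() call fit both outcomes.
res_matrix <- by(new, new$sex, function(x)
  coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

# Variation (not in the reply): keep the poster's fxa, but hand it the
# per-group subset x rather than the full data frame new.
fxa <- function(x, data) coef(lm(x ~ Pred, data = data))
res_fxa <- by(new, new$sex, function(x) sapply(x[1:2], fxa, x))

# Each group holds a 2 x 2 matrix: rows (Intercept) and Pred,
# one column per outcome.
print(res_sapply)

# The approaches should agree up to numerical tolerance.
for (i in seq_along(res_sapply)) {
  print(all.equal(unname(res_sapply[[i]]), unname(res_matrix[[i]])))
  print(all.equal(unname(res_sapply[[i]]), unname(res_fxa[[i]])))
}
```

The matrix-response form is the shortest, while the sapply form generalizes more easily if each outcome needs a different model.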

On Feb 10, 2008 8:19 AM, David & Natalia <3.14david at gmail.com> wrote:
> Greetings,
>
> I'm having a problem with something that I think is very simple - I'd
> like to be able to use the 'sapply' and 'by' functions in 1 function
> to be able (for example) to get regression coefficients from multiple
> models by a grouping variable.  I think that I'm missing something
> that is probably obvious to experienced users.
>
> Here's a simple (trivial) example of what I'd like to do:
>
> new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
> fxa <- function(x,data)   { lm(x~Pred,data=data)$coef }
> sapply(new[,1:2],fxa,new)  # this yields coefficients for the predictor in separate models
>
> fxb <- function(x)   {lm(Outcome.1~Pred,data=x)$coef}
> by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for each sex
>
> ## I'd like to be able to combine 'sapply' and 'by' to be able to get
> the regression coefficients for Outcome.1 and Outcome.2 by each sex,
> rather than running fxb a second time predicting 'Outcome.2' or by
> subsetting the data - by sex - before I run the function, but the
> following doesn't work -
>
> by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
> 'Error in model.frame.default(formula = x ~ Pred, data = data,
> drop.unused.levels = TRUE) :
>  variable lengths differ (found for 'Pred')'
>
> ##I understand the error message - the length of 'Pred' is 10 while
> the length of each sex group is 5, but I'm not sure how to correctly
> write the 'by' function to use 'sapply' inside it.   Could someone
> please point me in the right direction?  Thanks very much in advance
>
> David S Freedman, CDC (Atlanta USA) [definitely not the well-known
> statistician, David A Freedman, in Berkeley]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


