[R] two newbie questions

Donald Braman dbraman at law.gwu.edu
Mon Jun 23 00:26:33 CEST 2008


# I've tried to make this easy to paste into R, though it's probably
# so simple you won't need to.
# I have some data (there are many more variables, but this is a
# reasonable approximation of it)

# here's a fabricated data frame that is similar in form to mine:
my.df <- data.frame(replicate(10, round(rnorm(100, mean=3.5, sd=1))))
var.list <- c("dv1", "dv2", "dv3", "iv1", "iv2", "iv3", "iv4", "iv5",
"intv1", "intv2")
names(my.df) <- var.list

# Some are DVs:
dvs <- c("dv1", "dv2", "dv3")

# some IVs:
ivs <- c("iv1", "iv2", "iv3", "iv4", "iv5")

# and some binary interaction variables:
intvs <- c("intv1", "intv2")
library(car)
my.df[intvs] <- lapply(my.df[intvs], function(x)
 recode(x, recodes = "lo:3.5=0; 3.5:hi=1; ",as.factor.result = FALSE))

# now I loop through a series of interactions using the vector numbers:
for(dv in 1:3) {
 for(iv in 4:8) {
  for (intv in 9:10) {
   jpeg(paste(names(my.df[iv]), names(my.df[dv]), names(my.df[intv]),
".jpg", sep="_"))
   with(data.frame(my.df), {
    my.fit <- lm( my.df[[dv]] ~ my.df[[iv]] + my.df[[intv]] +
my.df[[iv]]:my.df[[intv]])
    colors <- ifelse (my.df[[intv]] == 1, "black", "grey")
    plot(my.df[[iv]], my.df[[dv]], xlab=names(my.df[iv]),
ylab=names(my.df[dv]), col=colors, pch=".")
    # coef(my.fit) is ordered (Intercept), iv, intv, iv:intv, so the columns must match
    curve (cbind (1, x, 1, 1*x) %*% coef(my.fit), add=TRUE, col="black")
    curve (cbind (1, x, 0, 0*x) %*% coef(my.fit), add=TRUE, col="gray")
   })
   dev.off()
  }
 }
}


# Question 1: This works fine, but using the column numbers feels kludgy --
# especially if the variables in question aren't consecutive.
# Is there a more elegant way of doing this with my vectors of variable
# names? Something like this, for example:
for(dv in dvs) {
 for(iv in ivs) {
  for (intv in intvs) {
   jpeg(paste(dv, iv, intv, ".jpg", sep="_"))
   with(data.frame(my.df), {
    my.fit <- lm(my.df[dv] ~ my.df[iv] + my.df[intv] + my.df[iv]:my.df[intv])
    colors <- ifelse (my.df[[intv]] == 1, "black", "grey")
    plot(my.df[iv], my.df[dv], xlab=iv, ylab=names(dv), col=colors, pch=".")
    curve (cbind (1, x, 1, 1*x) %*% coef(my.fit), add=TRUE, col="black")
    curve (cbind (1, x, 0, 0*x) %*% coef(my.fit), add=TRUE, col="gray")
   })
   dev.off()
  }
 }
}

# Clearly that's wrong -- why it's wrong is obscure to me, though!
# Please educate me!
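
# For reference, a minimal sketch of one likely cause, not a definitive
# diagnosis: single brackets (my.df[dv]) return a one-column data frame,
# while double brackets (my.df[[dv]]) return the column itself, and lm()
# expects the latter. One way around indexing altogether is to build the
# formula from the names with reformulate() and pass data= to lm().
# The names demo.df, fit.formula and demo.fit are made up for this sketch.

```r
# Fabricated data in roughly the same shape as the example above (an assumption)
set.seed(1)
demo.df <- data.frame(replicate(4, rnorm(50)))
names(demo.df) <- c("dv1", "iv1", "iv2", "intv1")
demo.df$intv1 <- as.numeric(demo.df$intv1 > 0)  # binary interaction variable

dv <- "dv1"; iv <- "iv1"; intv <- "intv1"

# Single brackets keep the data-frame wrapper; double brackets extract the column
class(demo.df[dv])
class(demo.df[[dv]])

# Build dv ~ iv + intv + iv:intv from the character names
fit.formula <- reformulate(c(iv, intv, paste(iv, intv, sep = ":")), response = dv)

# lm() looks the variables up in demo.df, so no indexing is needed in the formula
demo.fit <- lm(fit.formula, data = demo.df)
names(coef(demo.fit))
```

# A side effect of this approach is that the coefficients are labelled with
# the real variable names rather than with the indexing expressions.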

# Question 2: Could this be done by using "apply" rather than a loop?
# Or is looping better here because there are several actions performed at
# each iteration?
# I'm still trying to get my head around all the ways to ditch looping in R.
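
# For reference, a rough sketch of a loop-free variant, assuming only the
# model fits are wanted: expand.grid() lists every name combination and
# Map() calls a helper once per row. The helper fit.one and the data frame
# demo.df are hypothetical names for this sketch. Plotting to a file is a
# side effect, so a plain for loop may well remain the clearer choice there.

```r
# Fabricated data (an assumption), smaller than the example above
set.seed(2)
demo.df <- data.frame(replicate(5, rnorm(60)))
names(demo.df) <- c("dv1", "dv2", "iv1", "iv2", "intv1")
demo.df$intv1 <- as.numeric(demo.df$intv1 > 0)

# One row per combination of DV, IV and interaction variable
combos <- expand.grid(dv = c("dv1", "dv2"), iv = c("iv1", "iv2"), intv = "intv1",
                      stringsAsFactors = FALSE)

# Hypothetical helper: fit dv ~ iv + intv + iv:intv for one combination of names
fit.one <- function(dv, iv, intv) {
  f <- reformulate(c(iv, intv, paste(iv, intv, sep = ":")), response = dv)
  lm(f, data = demo.df)
}

# Map() walks the three name columns in parallel and returns a list of fits
fits <- Map(fit.one, combos$dv, combos$iv, combos$intv)
names(fits) <- paste(combos$dv, combos$iv, combos$intv, sep = "_")
length(fits)
```

# Each fit can then be retrieved by name, e.g. fits[["dv1_iv1_intv1"]].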


Donald Braman
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
http://research.yale.edu/culturalcognition
http://ssrn.com/author=286206


