[R] Loops (run the same function per different columns)

Thu Apr 24 03:45:59 CEST 2014

Hi,
I guess you got an output like this using my script:
##Please use ?dput() to show the example data.

FA <- structure(list(Sample = c("L1 Control", "L1 Control", "L1 Control", 
"BBM Control", "BBM Control", "BBM Control", "L1 Ash", "L1 Ash", 
"L1 Ash", "BBM Ash", "BBM Ash", "BBM Ash"), C14.0 = c(0.456509192, 
0.513989684, 0.555894496, 0.418392781, 0.405826292, 0.398633968, 
0.504528078, 0.548667997, 0.499237645, 0.380582244, 0.395617943, 
0.389027115), C15.0 = c(0.469562687, 0.527958026, 0.502389699, 
0.385119329, 0.368564514, 0.391851493, 0.479125577, 0.517533922, 
0.490619858, 0.380051535, 0.384498216, 0.370815474), C15.1 = c(0.774909216, 
0.732083085, 0.706407924, 1.318261983, 1.114889958, 1.238411437, 
0.793236101, 0.632962545, 0.74858627, 0.996870831, 0.963780759, 
0.923329859)), .Names = c("Sample", "C14.0", "C15.0", "C15.1"
), class = "data.frame", row.names = c(NA, -12L))

library(gvlma)
y <- names(FA)[-1]
y
#[1] "C14.0" "C15.0" "C15.1"

lst1 <- setNames(vector("list", length(y)), y)

for (i in y) {
  lst1[[i]] <- gvlma(lm(get(i) ~ Sample, data = FA))
}

lst1[[1]]
#
#Call:
#lm(formula = get(y[i]) ~ Sample, data = FA)
#---------------------------------------

But you wanted each element of the list to print the way gvlmaFA does:

gvlmaFA <- gvlma(lm(C14.0~Sample,data=FA))

In my previous script, I didn't name the list.  Naming its elements after the columns in "y" makes each result easier to look up.  I guess you also wanted the column name to appear in the model formula.

lst2 <- setNames(vector("list", length(y)), y)
for (nm in y) {
  # substitute the column name (as a symbol) into the call, so the stored
  # formula reads e.g. C14.0 ~ Sample
  lst2[[nm]] <- eval(bquote(gvlma(lm(.(resp) ~ Sample, data = FA)), list(resp = as.name(nm))))
}

identical(gvlmaFA, lst2[[1]])
#[1] TRUE
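
The same idea can also be written with lapply(). What follows is only a sketch, not the script used above: it rebuilds FA from the dput() output in this thread, and the helper name fit_one is made up for illustration. It builds each formula with reformulate() and inlines it with bquote() so that the stored call shows the column name. The gvlma() step is guarded with requireNamespace() so the block still runs where gvlma is not installed, and it assumes FA lives in the global workspace.

```r
# FA as posted via dput() in this thread
FA <- data.frame(
  Sample = rep(c("L1 Control", "BBM Control", "L1 Ash", "BBM Ash"), each = 3),
  C14.0 = c(0.456509192, 0.513989684, 0.555894496, 0.418392781, 0.405826292,
            0.398633968, 0.504528078, 0.548667997, 0.499237645, 0.380582244,
            0.395617943, 0.389027115),
  C15.0 = c(0.469562687, 0.527958026, 0.502389699, 0.385119329, 0.368564514,
            0.391851493, 0.479125577, 0.517533922, 0.490619858, 0.380051535,
            0.384498216, 0.370815474),
  C15.1 = c(0.774909216, 0.732083085, 0.706407924, 1.318261983, 1.114889958,
            1.238411437, 0.793236101, 0.632962545, 0.74858627, 0.996870831,
            0.963780759, 0.923329859)
)

y <- names(FA)[-1]   # every column except Sample

# fit_one() is a hypothetical helper: it builds "<column> ~ Sample" and
# inlines that formula into the lm() call, so the stored call reads
# lm(formula = C14.0 ~ Sample, data = FA) instead of mentioning a loop variable.
fit_one <- function(resp) {
  f <- reformulate("Sample", response = resp)
  eval(bquote(lm(.(f), data = FA)))
}

# Named list of fitted models, one per fatty-acid column
models <- lapply(setNames(y, y), fit_one)

# Run the assumption checks only when gvlma is available
if (requireNamespace("gvlma", quietly = TRUE)) {
  checks <- lapply(models, gvlma::gvlma)
  print(checks[["C15.1"]])
}
```

With all 34 columns in FA, y picks them up automatically, and an individual result can be pulled out by name, e.g. models[["C15.0"]].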

A.K.

Hi Arun,
Your script works, but it does not do what I was after. To be a bit more specific, this is the table FA that I'm working on (the original one has 34 fatty acids instead of 3: C14.0, C15.0, and C15.1).

Sample        C14:0        C15:0        C15:1
L1 Control    0.456509192  0.469562687  0.774909216
L1 Control    0.513989684  0.527958026  0.732083085
L1 Control    0.555894496  0.502389699  0.706407924
BBM Control   0.418392781  0.385119329  1.318261983
BBM Control   0.405826292  0.368564514  1.114889958
BBM Control   0.398633968  0.391851493  1.238411437
L1 Ash        0.504528078  0.479125577  0.793236101
L1 Ash        0.548667997  0.517533922  0.632962545
L1 Ash        0.499237645  0.490619858  0.74858627
BBM Ash       0.380582244  0.380051535  0.996870831
BBM Ash       0.395617943  0.384498216  0.963780759
BBM Ash       0.389027115  0.370815474  0.923329859

I just want to run the following script for C15.0, C15.1 and the other 32 fatty acids as well, so I can quickly scroll up and down to see which ones do not meet the assumptions.

FA.ml=lm(C14.0~Sample,data=FA)
gvlmaFA<-gvlma(FA.ml)
gvlmaFA

This is the result when I run the script:

Call:
lm(formula = C14.0 ~ Sample, data = FA)

Coefficients:
      (Intercept)  SampleBBM Control       SampleL1 Ash   SampleL1 Control  
          0.38841            0.01921            0.12907            0.12039  

ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
Level of Significance =  0.05 

Call:
 gvlma(x = FA.ml) 

                       Value p-value                   Decision
Global Stat        4.757e+00 0.31312    Assumptions acceptable.
Skewness           1.944e-02 0.88911    Assumptions acceptable.
Kurtosis           1.462e-01 0.70219    Assumptions acceptable.
Link Function      3.682e-16 1.00000    Assumptions acceptable.
Heteroscedasticity 4.592e+00 0.03213 Assumptions NOT satisfied!

I would really appreciate it if you could help me with this issue. It would be very useful for me since I have large tables of data.
Cheers

On Monday, April 21, 2014 9:19 AM, arun <smartpink111 at yahoo.com> wrote:
Hi,

Using the example data from library(gvlma)

library(gvlma)
data(CarMileageData)
CarMileageNew <- CarMileageData[,c(5,6,3)]
lst1 <- list()
y <- c("NumGallons", "NumDaysBetw")
for (i in seq_along(y)) {
  lst1[[i]] <- gvlma(lm(get(y[i]) ~ MilesLastFill, data = CarMileageNew))
}
pdf("gvlmaplot.pdf")
lapply(lst1, plot)
dev.off()

You could also use ?lapply().

A.K.

Hi,
I have a spreadsheet with a column Samples (column 1) and then 34 more columns with the concentrations of different fatty acids per sample. I'm trying to run the same function 34 times. In this case (the first of 34), I have a fatty acid called C14.0 (column 2). I'm a newbie with R, so I spent the last 4 days looking for a way of doing it (without running the same function 34 times with a different fatty acid each time). I saw that people do similar things with loops, but I cannot get them to work.
I have tried the script below, but it does not work.

y<-c("C14.0","C15.0","C16.0")
for (i in y) {
FA.ml=lm(i~Sample,data=FA)
gvlmaFA<-gvlma(FA.ml)
gvlmaFA
}

I would really appreciate it if someone could give me a hand with this. I know I would have finished by now if I had typed out the 34 fatty acids, but I want to learn how to do it with loops.
Cheers
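
A note on why the loop in the original question fails, as a minimal sketch: it uses only lm() and only the L1 Control and BBM Control rows of the posted data, so it is an illustration rather than the full analysis. Two things seem to go wrong. First, i holds a character string, so lm(i ~ Sample) looks for a variable called i instead of the column C14.0. Second, a bare object name inside a for loop is not auto-printed, so print() has to be called explicitly.

```r
# First six rows of the posted data (L1 Control and BBM Control only)
FA <- data.frame(
  Sample = rep(c("L1 Control", "BBM Control"), each = 3),
  C14.0 = c(0.456509192, 0.513989684, 0.555894496,
            0.418392781, 0.405826292, 0.398633968),
  C15.0 = c(0.469562687, 0.527958026, 0.502389699,
            0.385119329, 0.368564514, 0.391851493)
)
y <- c("C14.0", "C15.0")

# 1. i is the string "C14.0", not the column, so lm() cannot use it as a
#    response; the error message is captured here instead of stopping.
i <- y[1]
err <- tryCatch(lm(i ~ Sample, data = FA),
                error = function(e) conditionMessage(e))
print(err)

# 2. Build the formula from the string, and print explicitly, because
#    values are not auto-printed inside a for loop.
fits <- list()
for (i in y) {
  f <- as.formula(paste(i, "~ Sample"))
  fits[[i]] <- lm(f, data = FA)
  print(coef(fits[[i]]))
}
```

The same print() call would be needed around a gvlma() result inside a loop.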