[R] Automatic saving of many regression's output

arun smartpink111 at yahoo.com
Wed Nov 27 23:37:51 CET 2013


Hi,

lst1[[1]][,2] <- NA
lst2 <- lapply(lst1,function(x) summary(lm(rate~.,data=x)))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases



lst2 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) summary(lm(rate~.,data=x)) )
A.K.
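
The filter above can also be written with complete.cases(). This is only a minimal sketch on made-up data (the `toy` data frame, its column names and the `grp` variable are assumptions for illustration, not the poster's data); it keeps a group only if at least one of its rows has no NA, which is the same condition as in the one-liner above.

```r
# Toy data: three groups of 5 rows; in group 1 one predictor is entirely NA
set.seed(1)
toy <- data.frame(rate = rnorm(15), x1 = rnorm(15), x2 = rnorm(15),
                  grp = rep(1:3, each = 5))
toy$x1[toy$grp == 1] <- NA

lst <- split(toy[, c("rate", "x1", "x2")], toy$grp)

# TRUE for groups that contain at least one row without any NA
has_complete <- sapply(lst, function(x) any(complete.cases(x)))

# Fit only the groups that lm() can actually use
fits <- lapply(lst[has_complete], function(x) summary(lm(rate ~ ., data = x)))

names(fits)   # names of the groups that were fitted
```

Groups that are dropped simply do not appear in `fits`, so the names of the list show which regressions were run.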



Hi,

Thank you for your help. :-)

I applied your script to the data, but I got this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  0 (non-NA) cases

I forgot to mention that some of the data are NA.

I executed this code:

lst1 <- split(data[,-16],data[,16])
any(sapply(lst1,nrow)!=123)
#[1] FALSE
lst2 <- lapply(lst1,function(x) summary(lm(rate~cap.log+liqamih.log+pbv,data=x)))
# Here I can set the explanatory variables if I want to test different
# versions of the model (e.g. with only 3 explanatory variables), right?
length(lst2)
#[1] 334




On Wednesday, November 27, 2013 5:27 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Try:
set.seed(49)
dat1 <- as.data.frame(matrix(sample(c(NA,1:50),41082*15,replace=TRUE),ncol=15))
dat1$indx <- as.numeric(gl(334*123,123,334*123))
names(dat1)[1] <- "rate"
lst1 <- split(dat1[,-16],dat1[,16])
any(sapply(lst1,nrow)!=123)
#[1] FALSE
lst2 <- lapply(lst1,function(x) summary(lm(rate~.,data=x)))
length(lst2)
#[1] 334

A.K.

Hi all! 

I am a complete beginner in R, so please excuse some naive questions. I am learning.
Here is a description of my problem:

I have a database (in a single csv file):
                   characteristic_1    characteristic_2               ...          characteristic_49 
subject_1     |      c1_1_t=1             |   c2_1_t=1             ... |     c49_1_t=1 
subject_2     |      c1_2_t=1             |   c2_2_t=1             ... |     c49_2_t=1 
subject_3     |      c1_3_t=1             |   c2_3_t=1             ... |     c49_3_t=1 
... 
subject_334  |      c1_334_t=1         |   c2_334_t=1          ... |     c49_334_t=1 
subject_1     |      c1_1_t=2            |   c2_1_t=2              ... |     c49_1_t=2 
subject_2     |      c1_2_t=2            |   c2_2_t=2              ... |     c49_2_t=2 
subject_3     |      c1_3_t=2            |   c2_3_t=2              ... |     c49_3_t=2 
... 
subject_334  |      c1_334_t=2         |   c2_334_t=2          ... |     c49_334_t=2 

and so on ... till t (time) = 123 

So I have 334 subjects with 49 characteristics, measured at 123 points in time.

I would like to run 123 regressions (of three kinds: lm, rlm and
lmrob, for comparison), each one on the 334 subjects and their 49
characteristics. After each regression (that is, after running each of
the three: lm, rlm and lmrob) I would like to save a txt (or csv) file
with the results (summary) and some tests* for that regression (the
regressions can be named reg_1, reg_2 ... reg_123).

To make things clearer, the regressions would look like this:

summary(lm(rate ~ cap.log + liqamih.log + liqwol.log + pbv.log + mom.log +
             beta.wig + beta.wig.eq +
             beta.sp +
             beta.wig.macro +
             beta.sp.macro +
             beta.sentim.pl + beta.sentim.pl.ort +
             beta.sentim.usa + beta.sentim.usa.ort, data = data))

The problem is how to run the lm() above over a "rolling window",
that is, on the first 334 observations (total observations: 123*334 = 41082),
then on the next 334, and so on.
I need to run regression_1 on the first 334 observations, regression_2
on the next 334 (rows 335 to 668), and so on up to regression_123 (rows
40749 to 41082).
Each time I run such a regression I would like to save the results (summary and the tests mentioned).
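
One way to get these consecutive windows is to build a block index and split on it. The sketch below assumes the rows are already sorted by time with exactly 334 rows per period, as in the layout described above; the simulated data and the names `dat`, `x1`, `x2`, `block`, `by_time` and `fits_lm` are made up for illustration.

```r
n_subj <- 334   # rows (subjects) per time period
n_time <- 123   # number of time periods

# Simulated stand-in for the real data, ordered by time period
set.seed(2)
dat <- data.frame(rate = rnorm(n_subj * n_time),
                  x1   = rnorm(n_subj * n_time),
                  x2   = rnorm(n_subj * n_time))

# Rows 1..334 get block 1, rows 335..668 get block 2, and so on
block <- rep(seq_len(n_time), each = n_subj)

# One data frame per time period, then one regression per data frame
by_time <- split(dat, block)
fits_lm <- lapply(by_time, function(d) lm(rate ~ x1 + x2, data = d))

length(fits_lm)            # 123
range(which(block == 2))   # 335 668
```

The same lapply() call could use MASS::rlm or robustbase::lmrob in place of lm, assuming those packages are installed.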

Then I would like to repeat the same procedure but for rlm() and lmrob() if possible. 

I think I can write the "tests" part of the script on my own (could you
add some comments showing exactly where I should put it in the script so
that the tests are repeated automatically and the results saved?), but the
'saving' and 'repeating 123 times' parts are quite complicated for me, at
least for now. So I am asking for help with those.

In the end I would like to have three txt (or csv) files: 
one containing 123 "summaries" and test results of lm, 
one containing 123 "summaries" and test results of rlm 
and one containing 123 "summaries" and test results of lmrob. 
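
Collecting all summaries of one kind in a single text file could be done with capture.output() in append mode. This is a rough sketch only: the toy data, the three small blocks and the file name lm_summaries.txt are assumptions, and the file is written to a temporary directory so nothing is overwritten.

```r
# Toy data: 3 blocks of 10 rows standing in for the 123 time periods
set.seed(3)
dat <- data.frame(rate = rnorm(30), x1 = rnorm(30),
                  block = rep(1:3, each = 10))
fits <- lapply(split(dat, dat$block), function(d) lm(rate ~ x1, data = d))

# Hypothetical output file; start from an empty file on each run
out_file <- file.path(tempdir(), "lm_summaries.txt")
if (file.exists(out_file)) invisible(file.remove(out_file))

for (i in seq_along(fits)) {
  # Header line so each regression can be found in the file
  cat("==== reg_", i, " ====\n", sep = "", file = out_file, append = TRUE)
  # Printed summary, exactly as it would appear on the console
  capture.output(print(summary(fits[[i]])), file = out_file, append = TRUE)
  # Test results for reg_i could be appended here in the same way
}
```

Running the same loop over a list of rlm or lmrob fits with a different file name would give the other two files.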

Could someone help me with this task? 
I am grateful for your help and support. 

________________ 
*like: 
jarque.bera.test() 
vif() 
ncvTest() 
durbinWatsonTest() 

---Some of them are not applicable to rlm and lmrob, so in
that case I would like to have "test NA" in the three output txt (or
csv) files.
Some of them are also not applicable to cross-sectional regressions,
but I would still like to keep them in the script for later
modifications.
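
A possible way to get NA for a test that does not apply is to wrap each test in tryCatch(). The sketch below uses base R only: `safe_test`, `resid_normality` and `not_applicable` are hypothetical helper names, and shapiro.test() is just a stand-in for the tests listed above, which to my knowledge come from add-on packages (tseries and car).

```r
# Run a test function on a fitted model; return NA if the test fails
safe_test <- function(test_fun, fit) {
  tryCatch(test_fun(fit), error = function(e) NA)
}

set.seed(4)
d   <- data.frame(rate = rnorm(50), x1 = rnorm(50))
fit <- lm(rate ~ x1, data = d)

# Stand-in test: p-value of a Shapiro-Wilk test on the residuals
resid_normality <- function(m) shapiro.test(residuals(m))$p.value

# A test that deliberately fails, to show the NA fallback
not_applicable <- function(m) stop("test not defined for this model class")

results <- c(normality_p = safe_test(resid_normality, fit),
             dummy       = safe_test(not_applicable, fit))
results
```

The named vector `results` can then be written out next to the summary of the corresponding regression.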


