[R] regression analysis in R

arun smartpink111 at yahoo.com
Fri Oct 26 23:47:00 CEST 2012


Hi,
Maybe this helps.
set.seed(8)
mat1<-matrix(sample(150,90,replace=FALSE),ncol=9,nrow=10)
dat1<-data.frame(mat1)
set.seed(10)
B<-sample(150:190,10,replace=FALSE)

res1<-lapply(dat1,function(x) lm(B~as.matrix(x)))
#or, equivalently (the slope row is then labelled "x" instead of "as.matrix(x)")
#res1<-lapply(dat1,function(x) lm(B~x))

res1Summary<-lapply(res1,summary)
#to get the coefficients
res1SummaryCoef<-lapply(res1,function(x) summary(x)$coefficients)
res1SummaryCoef[1:3]
#$X1
#                Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  150.1303702 8.45536736 17.755630 1.035959e-07
#as.matrix(x)   0.2126583 0.09304937  2.285436 5.163141e-02
#
#$X2
#                  Estimate Std. Error     t value     Pr(>|t|)
#(Intercept)  168.219302287  6.9904434 24.06418202 9.479720e-09
#as.matrix(x)  -0.002386046  0.1146838 -0.02080544 9.839104e-01
#
#$X3
#               Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  180.303999  8.6675156 20.802270 2.990115e-08
#as.matrix(x)  -0.157268  0.1021179 -1.540064 1.621101e-01


#to get pvalue of Fstatistic
res1pvalueF<-lapply(res1,function(x) {f<-summary(x)$fstatistic; pf(f[1],f[2],f[3],lower.tail=FALSE)})
#to get r.squared value
res1rSquare<-lapply(res1,function(x) summary(x)$r.squared)
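
The per-descriptor results can also be collected into one table and ranked. This is only a minimal sketch and not part of the code above: the names `fitOne` and `res1Table` are made up for illustration, and the example data are rebuilt so the block runs on its own.

```r
# Rebuild the example data so the sketch is self-contained
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)

# fitOne (hypothetical helper): regress B on one descriptor and
# return the slope, R-squared and the p-value of the F statistic
fitOne <- function(x) {
  s <- summary(lm(B ~ x))
  f <- s$fstatistic
  c(slope     = unname(s$coefficients["x", "Estimate"]),
    r.squared = s$r.squared,
    pvalueF   = unname(pf(f[1], f[2], f[3], lower.tail = FALSE)))
}

# One row per descriptor, largest R-squared first
res1Table <- as.data.frame(t(sapply(dat1, fitOne)))
res1Table <- res1Table[order(res1Table$r.squared, decreasing = TRUE), ]
res1Table
```

With a single predictor the F-test p-value is the same as the p-value of the slope's t-test, so either column can be used for ranking.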
 
#2nd part 
#Create some new datasets using random combination of columns from dat1
dat2<-dat1[,sample(names(dat1),4)]
dat3<-dat1[,sample(names(dat1),4)]
dat4<-dat1[,sample(names(dat1),4)]
dat5<-dat1[,sample(names(dat1),4)]
dat6<-dat1[,sample(names(dat1),4)]
head(dat2)
#  X7  X3  X8  X5
#1 85  30 113 100
#2 89  53 115  32
#3 74  79  63  54
#4 57  28  52  94
#5  6  84 135 132
#6  5 123 146 127
head(dat3)
#   X8  X2  X6  X3
#1 113  64  14  30
#2 115  13   7  53
#3  63  60  15  79
#4  52  75  34  28
#5 135  19 107  84
#6 146 126  27 123

#create a list of dataframes
list1<-list(dat2,dat3,dat4,dat5,dat6)
res2<-lapply(list1,function(x) lm(B~as.matrix(x)))
res2rSquare<-lapply(res2,function(x) summary(x)$r.squared)
unlist(res2rSquare)
#[1] 0.8444332 0.6316695 0.6971695 0.7322519 0.4328805
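
To get at the "top 10 combinations" part of the question, the same idea can be repeated over many random draws of 4 descriptors, with the fits then ranked. The following is a rough sketch under assumptions: `nDraw`, `combos`, `fitCombo`, `resCombo` and `top10` are hypothetical names, the number of draws is arbitrary, and R-squared is just one possible ranking criterion.

```r
# Rebuild the example data so the sketch is self-contained
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)

# Draw random combinations of 4 descriptor names (nDraw is an arbitrary choice)
set.seed(1)
nDraw <- 50
combos <- replicate(nDraw, sort(sample(names(dat1), 4)), simplify = FALSE)
combos <- unique(combos)  # drop combinations that were drawn more than once

# fitCombo (hypothetical helper): regress B on one combination of descriptors
fitCombo <- function(vars) {
  s <- summary(lm(reformulate(vars, response = "B"), data = dat1))
  data.frame(descriptors   = paste(vars, collapse = "+"),
             r.squared     = s$r.squared,
             adj.r.squared = s$adj.r.squared)
}
resCombo <- do.call(rbind, lapply(combos, fitCombo))

# Keep the 10 combinations with the largest R-squared
top10 <- head(resCombo[order(resCombo$r.squared, decreasing = TRUE), ], 10)
top10
```

With 27 descriptors there are choose(27,4) = 17550 possible sets of 4, so instead of random draws it should also be feasible to enumerate all of them with combn(names(dat1),4,simplify=FALSE) and rank the complete list.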

To select the best model from combinations of descriptors, you can also look at stepwise elimination, or compare candidate models by their AIC or BIC values.
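
As an illustration of the stepwise/AIC/BIC suggestion, here is a minimal sketch using step() from the stats package. It is an assumption-laden example, not a recommendation: `datB`, `full`, `stepAIC` and `stepBIC` are hypothetical names, and only the first four descriptors are used so the model stays small relative to the 10 rows of the toy data.

```r
# Rebuild the example data so the sketch is self-contained
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)

# Response plus the first four descriptors in one data frame (arbitrary subset)
datB <- data.frame(B = B, dat1[, 1:4])

# Full model with all four descriptors
full <- lm(B ~ ., data = datB)

# Backward elimination using AIC (the default penalty, k = 2)
stepAIC <- step(full, direction = "backward", trace = 0)

# Same search with the BIC penalty, k = log(number of observations)
stepBIC <- step(full, direction = "backward", k = log(nrow(datB)), trace = 0)

formula(stepAIC)
formula(stepBIC)
AIC(full, stepAIC)
BIC(full, stepBIC)
```

Lower AIC or BIC is better; BIC penalizes extra descriptors more heavily, so it tends to keep fewer of them.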

A.K.

----- Original Message -----
From: eliza botto <eliza_botto at hotmail.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc: 
Sent: Friday, October 26, 2012 4:00 PM
Subject: [R] regression analysis in R


Dear useRs,
I have vectors of about 27 descriptors, each having 703 elements. What I want to do is the following:
1. I want to do a regression analysis of each of these 27 vectors individually against a dependent vector, say B, having the same number of elements.
2. I would like to know the best 10 regression results if I regress the dependent vector against random combinations of any 4 descriptors. More precisely, in the first step we regressed the dependent vector against each descriptor individually, but now we want R to randomly combine descriptors in sets of 4 and regress B on each set, to see which are the top 10 combinations of descriptors giving good regression results with B.
I hope I am clear. I know the 2nd part is more tricky, but I will be extremely happy if you can answer any one of the above questions.
Thanks in advance,
eliza
                          
    [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.