[R] Automating R for Hypothesis Testing

meredith mmballar at mtu.edu
Thu May 10 16:40:36 CEST 2012


Rui-
  Thanks this definitely helps, just one quick question. How would you code
the values of chi-fm and chi-fms to change based on the degrees of freedom
of each model H(i)?

Meredith


Rui Barradas wrote
> 
> Hello,
> 
> Yes, it does help. Now we can see your data and what you're doing.
> What follows is a suggestion on what you could do, not full solution.
> (You forgot to say what X1 is, but I don't think it's important to
> understand the suggestion.)
> (If I'm wrong, say something.)
> 
> 
> milwaukeephos <- read.csv("milwaukeephos.csv", header=TRUE,
> stringsAsFactors=FALSE)
> # list of data.frames, one per month
> ls1 <- split(milwaukeephos, milwaukeephos$month)
> 
> #--------- if you want to keep the models, not needed if you don't.
> #          (yoy probably don't)
> modelH <- vector("list", 12)
> modelHa <- vector("list", 12)
> modelH2 <- vector("list", 12)
> modelH2a <- vector("list", 12)
> #--------- values to record, these are needed, create them beforehand.
> chi_fm <- numeric(12)
> chi_fms <- numeric(12)
> #
> seq_months <- c(1:12, 1) # wrap months around.
> for(i in 1:12){
> 	month_this <- seq_months[i]
> 	month_next <- seq_months[i + 1]
> 
> 	lload <- c(ls1[[month_this]]$load_kg, ls1[[month_next]]$load_kg)
> 	lflow <- c(ls1[[month_this]]$flow, ls1[[month_next]]$flow)
> 	modelH[[i]] <- lm(lload ~ lflow)
> 	# If you don't want to keep the models, use modelH only
> 	# ( without [[i]] )
> 	# and do the same with X1
> 
> 	# rest of your code for first test goes here
> 	chi_fm[i] <- bfm %*% var_fm %*% (bunres_fm - bres_fm)
> 
> 	# and the same for the second test
> 	chi_fms[i] <- ...etc...
> }
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> meredith wrote
>> 
>> dput:  http://r.789695.n4.nabble.com/file/n4620188/milwaukeephos.csv
>> milwaukeephos.csv 
>> 
>> # Feb-march
>>> modelH_febmarch<-lm(llfeb_march~lffeb_march)
>>>modelHa_febmarch<-lm(llfeb_march~X1feb_mar+lffeb_march)
>>> anova(modelHa_febmarch)
>>> coefficients(modelH_febmarch)
>> (Intercept) lffeb_march 
>>   -2.429890    1.172821 
>>> coefficients(modelHa_febmarch)
>> (Intercept)   X1feb_mar lffeb_march 
>>  -2.8957776  -0.5272793   1.3016303 
>>> bres_fm<-matrix(c(-2.429890,0,1.172821),nrow=3)
>>> bunres_fm<-matrix(c(-2.8957776,-0.5272793,1.3016303),nrow=3)
>>>bfm<-t(bunres_fm-bres_fm)
>>> fmvect<-seq(1,1,length=34)
>>> X1a_febmar<-seq(0,0,length=9) # dummy variable step 1
>>> X1b_febmar<-seq(1,1,length=25) # dummy variable step 2
>>> X1feb_mar<-c(X1a_febmar,X1b_febmar) #dummy variable creation
>> # Test Stat Equation for Chisq
>>> fmxx<-cbind(fmvect,X1feb_mar,lffeb_march)
>>> tfmx<-t(fmxx)
>>> xcom_fm<-(tfmx %*% fmxx)
>>> xinv_fm<-ginv(xcom_fm)
>>> var_fm<-xinv_fm*0.307
>>> chi_fm<-bfm %*% var_fm %*% (bunres_fm-bres_fm)
>>> chi_fm # chisq value for recording
>> if less than CV move onto to slope modification
>>> modelH2_febmarch<-lm(llfeb_march~X3feb_march)
>>> modelH2a_febmarch<-lm(llfeb_march~X3feb_march+X4feb_march)
>>> anova(modelH2a_febmarch)
>>> coefficients(modelH2_febmarch) # get coefficients to make beta vectors
>>> for test
>> (Intercept) X3feb_march 
>>    5.342130    1.172821 
>>> coefficients(modelH2a_febmarch)
>> (Intercept) X3feb_march X4feb_march 
>>   5.2936263   1.0353752   0.2407557 
>> # Test Stat
>>> bsres_fm<-matrix(c(5.342130,1.172821,0),nrow=3)
>>> bsunres_fm<-matrix(c(5.2936263,1.0353752,0.2407557),nrow=3)
>>> bsfm<-t(bsunres_fm-bsres_fm)
>>> #X matrix
>>> fmxs<-cbind(fmvect,X3feb_march,X4feb_march)
>>> tfmxs<-t(fmxs)
>>> xcoms_fm<-(tfmxs %*% fmxs)
>>> xinvs_fm<-ginv(xcoms_fm)
>>> var_fms<-xinvs_fm*0.341
>>> chi_fms<-bsfm %*% var_fms %*% (bsunres_fm-bsres_fm)
>>> chi_fms
>> # Record Chisq value
>> 
>> Does this help?
>> Here lffeb_march is the combination of Feb and March log flows
>> and llfeb_march is the combination of Feb and March log loads
>> X3: lffeb_march-mean(feb_march)
>> X4: X1*X3
>> 
>> Thanks
>> 
>> Rui Barradas wrote
>>> 
>>> Hello,
>>> 
>>> I'm not at all sure if I understand your problem. Does this describe it?
>>> 
>>> 
>>> test first model for months 1 and 2
>>> if test statistic less than critical value{
>>> 	test second model for months 1 and 2
>>> 	print results of the first and second tests? just one of them?
>>> }
>>> move on to months 2 and 3
>>> etc, until months 12 and 1
>>> 
>>> 
>>> Please post example data using dput(dataset).
>>> Just copy it's output and paste it in your post.
>>> 
>>> And example code, what you're already doing.
>>> (Possibly simplified)
>>> 
>>> Rui Barradas
>>> 
>>> 
>>> meredith wrote
>>>> 
>>>> R Users-
>>>>   I have been trying to automate a manual code that I have developed
>>>> for calling in a .csv file, isolating certain rows and columns that
>>>> correspond to specified months:
>>>> something to the effect
>>>> i=name.csv
>>>> N=length(i$month)
>>>> iphos1=0
>>>> iphos2=0
>>>> isphos3=0
>>>> for i=1,N
>>>>  if month=1
>>>>     iphos1=iphos+1
>>>>     iphos1(iphos1)=i
>>>> 
>>>> an so on to call out the months into there own arrays (unless there is
>>>> a way I can wrap it into the next automation)
>>>> 
>>>> Next: I would like to run a simple linear regression combining each of
>>>> the months 1 by 1:
>>>> for instance I want to run a regression on a combined model from months
>>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>>> go on to test another set of models with both 1 and 2. If it rejects,
>>>> then we proceed to months 2 and 3.  If we move on to the second set on
>>>> months 1 and 2, and the critical value is accepted, I want to print an
>>>> accept or reject and move on to months 2 and 3, until finally comparing
>>>> months 12-1 at the end.
>>>> Is there a way to loop or automate this in R?
>>>> 
>>>> Thanks
>>>> Meredith
>>>> 
>>> 
>> 
>> Rui Barradas wrote
>>> 
>>> Hello,
>>> 
>>> I'm not at all sure if I understand your problem. Does this describe it?
>>> 
>>> 
>>> test first model for months 1 and 2
>>> if test statistic less than critical value{
>>> 	test second model for months 1 and 2
>>> 	print results of the first and second tests? just one of them?
>>> }
>>> move on to months 2 and 3
>>> etc, until months 12 and 1
>>> 
>>> 
>>> Please post example data using dput(dataset).
>>> Just copy it's output and paste it in your post.
>>> 
>>> And example code, what you're already doing.
>>> (Possibly simplified)
>>> 
>>> Rui Barradas
>>> 
>>> 
>>> meredith wrote
>>>> 
>>>> R Users-
>>>>   I have been trying to automate a manual code that I have developed
>>>> for calling in a .csv file, isolating certain rows and columns that
>>>> correspond to specified months:
>>>> something to the effect
>>>> i=name.csv
>>>> N=length(i$month)
>>>> iphos1=0
>>>> iphos2=0
>>>> isphos3=0
>>>> for i=1,N
>>>>  if month=1
>>>>     iphos1=iphos+1
>>>>     iphos1(iphos1)=i
>>>> 
>>>> an so on to call out the months into there own arrays (unless there is
>>>> a way I can wrap it into the next automation)
>>>> 
>>>> Next: I would like to run a simple linear regression combining each of
>>>> the months 1 by 1:
>>>> for instance I want to run a regression on a combined model from months
>>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>>> go on to test another set of models with both 1 and 2. If it rejects,
>>>> then we proceed to months 2 and 3.  If we move on to the second set on
>>>> months 1 and 2, and the critical value is accepted, I want to print an
>>>> accept or reject and move on to months 2 and 3, until finally comparing
>>>> months 12-1 at the end.
>>>> Is there a way to loop or automate this in R?
>>>> 
>>>> Thanks
>>>> Meredith
>>>> 
>>> 
>> 
> 


--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4623689.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list