[R] statistics - hypothesis testing question

Thu Sep 13 23:19:09 CEST 2007

On Thu, 13 Sep 2007, Leeds, Mark (IED) wrote:

> You're right, Duncan. My bad. That was kind of dopey because R squared is
> a statistic in itself.  They aren't nested models because
> the two predictors are different and there are no other predictors.  I'm
> trying to see whether the Model B predictor is "better" than the
> Model A predictor. I guess how one defines "better" is the real question,
> so I apologize for that. Still, any comments or suggestions are welcome.

See

Efron, B. (1984). Comparing non-nested linear models. Journal of the
American Statistical Association, 79, 791-803. (Search Google Scholar
for newer references that cite it.)

and

Williams, E. (1959). Regression Analysis. Wiley, New York.

Williams' test is remarkably simple.

IIRC, take x1 and x2 as candidate predictors of y:

> sx1 <- scale(x1)
> sx2 <- scale(x2)
> anova( lm( y ~ I(sx1+sx2) + I(sx1-sx2) ) )

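The second test (the row for I(sx1-sx2)) is the relevant one: its 
coefficient is zero exactly when the standardized coefficients of x1 and 
x2 are equal.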
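
For anyone who wants to try this, here is a minimal sketch on simulated 
data. The sample size, the coefficients and the seed are made up for 
illustration, and as.numeric() is only there to drop the matrix 
attributes that scale() returns.

```r
# Simulated data: x1 is (by construction) the stronger predictor of y.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 0.8 * x1 + 0.2 * x2 + rnorm(n)

# Standardize both candidate predictors to mean 0, sd 1.
sx1 <- as.numeric(scale(x1))
sx2 <- as.numeric(scale(x2))

# Regress y on the sum and the difference of the standardized predictors.
fit <- lm(y ~ I(sx1 + sx2) + I(sx1 - sx2))
tab <- anova(fit)
print(tab)

# Second row of the table: the test on the difference term.
p_diff <- tab[["Pr(>F)"]][2]
p_diff
```

Because sx1 and sx2 both have unit variance, the sum and the difference 
are uncorrelated, so the order of the two terms does not change the sums 
of squares in the anova table.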

How you roll 24 of these into one omnibus test will depend on the joint 
error structure. Are the errors independent? Are x1 and x2 fixed over all 
24 weeks or do you get fresh observations each time?
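
To make the question concrete, here is one possible sketch for the 
easiest case only. It ASSUMES independent errors across weeks and fresh 
observations of x1 and x2 each week, and it combines the 24 weekly 
p-values with Fisher's method, which is my choice for illustration and 
not part of Williams' test. The helper williams_p() and the simulated 
data are hypothetical. If those assumptions fail, this is not the right 
tool.

```r
# Hypothetical helper: p-value of the difference term for one week's data.
williams_p <- function(y, x1, x2) {
  sx1 <- as.numeric(scale(x1))
  sx2 <- as.numeric(scale(x2))
  anova(lm(y ~ I(sx1 + sx2) + I(sx1 - sx2)))[["Pr(>F)"]][2]
}

# Simulated stand-in for 24 independent weeks (made-up sizes and effects).
set.seed(2)
weeks      <- 24
n_per_week <- 50
p <- replicate(weeks, {
  x1 <- rnorm(n_per_week)
  x2 <- rnorm(n_per_week)
  y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n_per_week)
  williams_p(y, x1, x2)
})

# Fisher's method: under H0 and independence, -2 * sum(log(p)) follows a
# chi-squared distribution with 2 * weeks degrees of freedom.
fisher_stat <- -2 * sum(log(p))
fisher_p    <- pchisq(fisher_stat, df = 2 * weeks, lower.tail = FALSE)
fisher_p
```

Note that the F tests being combined are two-sided, so this ignores which 
predictor came out ahead in each week; a one-sided alternative would need 
the sign of the difference coefficient as well.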

HTH,

Chuck

>
>
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch at stats.uwo.ca]
> Sent: Thursday, September 13, 2007 2:32 PM
> To: Leeds, Mark (IED)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] statistics - hypothesis testing question
>
> On 9/13/2007 2:18 PM, Leeds, Mark (IED) wrote:
>> I estimate two competing simple regression models, A and B where the
>> LHS is the same in both cases but the predictor is different ( I
>> handle the intercept issue based on other postings I have seen ). I
>> estimate the two models on a weekly basis over 24 weeks.
>> So, I end up with 24 RSquaredAs and 24 RsquaredBs, so essentially 2
>> time series of Rsquareds. This doesn't have to be necessarily thought
>> of as a time series problem but, is there a usual way, given the
>> Rsquared data, to test
>>
>> H0 : Rsquared B = Rsquared A versus H1 : Rsquared B > Rsquared A
>>
>> so that I can map the 24 R squared numbers into 1 statistic. Maybe
>> that's somehow equivalent to just running 2 big regressions over the
>> whole 24 weeks and then calculating a statistic from those based on
>> those regressions ?
>
> The question doesn't make sense, if you're using standard notation.  R^2
> is a statistic, not a parameter, so one wouldn't test copies of it for
> equality.
>
> You can probably reframe the question in terms of E(R^2) so the
> statement parses, but then it doesn't really make sense from a subject
> matter point of view:  unless model A is nested within model B, why
> would you ever expect the two fits to explain exactly the same amount of
> variation?
>
> If model A is really a special case of model B, then you're back to the
> standard hypothesis testing situation, but repeated 24 times.  There's a
> lot of literature on how to handle such multiple testing problems,
> depending on what sort of alternatives you want to detect.  (E.g. do you
> think all 24 cases will be identical, or is it possible that 23 will
> match but one doesn't?)
>
> Duncan Murdoch
>
>>
>> I broke things up into 24 weeks because I was thinking that the
>> stability of the performance difference of the two models could be
>> examined over time. Essentially these are simple time series
>> regressions X_t = B*X_t-1 + epsilon so I always need to consider
>> whether any type of behavior is stable.  But now I am thinking that,
>> if I just want one overall number,  then maybe I should be considering
>> all the data simultaneously ?
>>
>> In a nutshell,  I am looking for any suggestions on the best way to
>> test whether Model B is better than Model A where
>>
>> Model A :  X_t = Beta*X_t-1 + epsilon
>>
>> Model B :  X_t = Betastar*Xstar_t-1 + epsilonstar
>>
>>
>> Thanks for your help.
>> --------------------------------------------------------
>>
>> This is not an offer (or solicitation of an offer) to
>> buy/se...{{dropped}}
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901