[R] p-values

John Maindonald john.maindonald at anu.edu.au
Wed Apr 28 14:31:03 CEST 2004

The Bayesian framework is surely a good framework
for thinking about inference, and for exploring common
misinterpretations of p-values.  P-values are unhelpful,
and best avoided, in cases where there is `strong' prior
evidence.  I will couch what follows in terms of confidence
intervals rather than p-values, as that makes the
discussion simpler.

The prior evidence is, in my sense, strong if it leads to a
Bayesian credible interval that differs very substantially
from the frequentist confidence interval
(though I prefer the term `coverage interval').
Typically the two intervals will be similar if a "diffuse"
prior is used, i.e., if all values over a wide enough range
are, on some suitable scale, a priori equally likely.  This
is, in my view, the message that you should take from your
reading.
Examples of non-diffuse priors are what Berger focuses on.
Consider for example his discussion of one of Jeffreys'
analyses, where Jeffreys puts 50% of the probability on
a point value of a continuous parameter, i.e., there is
a large spike in the prior at that point.
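To see what such a spike does, here is a sketch with my own
numbers, assuming (as in the usual treatment) that the
remaining 50% of the probability is spread as N(theta0,
tau^2): the posterior probability of the point null can stay
above one half even when the two-sided p-value falls below
0.05.

```r
## 50% spike at theta0, 50% spread as N(theta0, tau^2)
theta0 <- 0; sigma <- 1; tau <- 1; n <- 100
ybar <- 0.2                             # sample mean
z <- sqrt(n) * (ybar - theta0) / sigma  # z = 2
p_value <- 2 * pnorm(-abs(z))           # about 0.046
## Marginal densities of ybar under H0 and under H1
m0 <- dnorm(ybar, theta0, sigma / sqrt(n))
m1 <- dnorm(ybar, theta0, sqrt(sigma^2 / n + tau^2))
post_H0 <- 0.5 * m0 / (0.5 * m0 + 0.5 * m1)
c(p_value = p_value, post_H0 = post_H0) # post_H0 near 0.58
```

It is exactly this divergence between the p-value and the
posterior probability of the null that drives Berger's
argument.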

Berger commonly offers scant commentary on the specific
features of his priors that make the Bayesian results seem
very different (at least to the extent of having a
different "feel") from the frequentist results.  His paper
in vol. 18, no. 1 of Statistical Science (pp. 1-32; pp.
12-27 are comments from others) seems more judicious in
this respect than some of his earlier papers.

It is interesting to speculate how R's model fitting routines
might be tuned to allow a Bayesian interpretation.  What
family or families of priors would be on offer, and/or used by
default? What default mechanisms would be suitable &
useful for indicating the sensitivity of results to the choice
of prior?
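One crude sketch of what such a sensitivity display might
look like (my own invention, not an existing R facility):
report how the posterior mean of a normal mean moves as the
tightness of a conjugate normal prior is varied.

```r
## Posterior mean of a normal mean (known sd = 1) under
## N(0, tau^2) priors, for a range of prior sds tau
y <- c(0.8, 1.3, 0.5, 1.9, 1.1)
n <- length(y); sigma <- 1
post_mean <- function(tau, prior_mean = 0) {
  w <- (n / sigma^2) / (n / sigma^2 + 1 / tau^2)  # weight on data
  w * mean(y) + (1 - w) * prior_mean
}
taus <- c(0.1, 0.5, 1, 10, 100)
round(sapply(taus, post_mean), 3)
## A tight prior shrinks the estimate hard toward 0; a
## diffuse prior leaves it close to mean(y)
```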

John Maindonald.

> From: Greg Tarpinian <sasprog474 at yahoo.com>
> Date: 28 April 2004 6:32:06 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] p-values
> I apologize if this question is not completely
> appropriate for this list.
> I have been using SAS for a while and am now in the
> process of learning some C and R as a part of my
> graduate studies.  All of the statistical packages I
> have used generally yield p-values as a default output
> to standard procedures.
> This week I have been reading "Testing Precise
> Hypotheses" by J.O. Berger & Mohan Delampady,
> Statistical Science, Vol. 2, No. 3, 317-355 and
> "Bayesian Analysis: A Look at Today and Thoughts of
> Tomorrow" by J.O. Berger, JASA, Vol. 95, No. 452, p.
> 1269 - 1276, both as supplements to my Math Stat.
> course.
> It appears, based on these articles, that p-values are
> more or less useless.  If this is indeed the case,
> then why is a p-value typically given as a default
> output?  For example, I know that PROC MIXED and
> lme( ) both yield p-values for fixed effects terms.
> The theory I am learning does not seem to match what
> is commonly available in the software, and I am just
> wondering why.
> Thanks,
>     Greg

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
