[R] P values
Charles Annis, P.E.
charles.annis at statisticalengineering.com
Fri May 7 19:08:23 CEST 2010
Please let me quote an eminently sensible person, who observed that ...
"p-values are dangerous, especially large, small, and in-between ones."
- Frank E Harrell Jr., Prof. of Biostatistics and Department Chair,
Charles Annis, P.E.
Charles.Annis at StatisticalEngineering.com
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Robert A LaBudde
Sent: Friday, May 07, 2010 12:29 PM
To: Duncan Murdoch
Cc: r-help at r-project.org; level
Subject: Re: [R] P values
At 07:10 AM 5/7/2010, Duncan Murdoch wrote:
>Robert A LaBudde wrote:
>>At 01:40 PM 5/6/2010, Joris Meys wrote:
>>>On Thu, May 6, 2010 at 6:09 PM, Greg Snow <Greg.Snow at imail.org> wrote:
>>>>Because if you use the sample standard deviation then it is a t test, not
>>>>a z test.
>>>I'm doubting that seriously...
>>>You calculate normalized Z-values by subtracting the sample mean and
>>>dividing by the sample sd. So Thomas is correct. It becomes a Z-test when
>>>you compare these normalized Z-values with the Z distribution, instead of
>>>the (more appropriate) T-distribution. The T-distribution is essentially a
>>>Z-distribution that is corrected for the finite sample size. In
>>>the Z and T distribution are identical.
>>And it is only in Utopia that any P-value less than 0.01 actually
>>corresponds to reality.
>I'm not sure what you mean by this. P-values are simply statistics
>calculated from the data; why wouldn't they be real if they are small?
Do you truly believe that an actual real-life distribution is accurately
fit by a normal distribution at quantiles of 0.001, 0.0001 or beyond?
"The map is not the territory", and just because you can calculate
something from a model doesn't mean it's true.
The real world is composed of mixture distributions, not pure ones.
The P-value may be real, but its reality is subordinate to the
distributional assumption involved, which always fails at some level.
I'm simply asserting that level is in the tails at probabilities of
0.01 or less.
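To make the mixture point concrete, here is a minimal sketch in Python (standard library only). It assumes a hypothetical contaminated normal: 95% of observations from N(0, 1) and 5% from N(0, 3^2). The weights, the scale factor and the function names are illustrative assumptions, not anything taken from this thread. The sketch compares the exact upper-tail probability of that mixture with the nominal tail probability of a single normal having the same mean and variance.

```python
import math
from statistics import NormalDist

# Hypothetical contaminated-normal model (illustrative assumption only):
# 95% of the data from N(0, 1), 5% from N(0, 3^2).
WEIGHT_WIDE = 0.05
CLEAN = NormalDist(0.0, 1.0)
WIDE = NormalDist(0.0, 3.0)

# A single normal with the same mean and variance as the mixture,
# i.e. what a moment-matched "normality" assumption would use.
FITTED = NormalDist(
    0.0,
    math.sqrt((1.0 - WEIGHT_WIDE) * CLEAN.variance + WEIGHT_WIDE * WIDE.variance),
)


def mixture_upper_tail(x):
    """Exact P(X > x) under the two-component mixture."""
    return ((1.0 - WEIGHT_WIDE) * (1.0 - CLEAN.cdf(x))
            + WEIGHT_WIDE * (1.0 - WIDE.cdf(x)))


def tail_ratio(nominal_p):
    """Actual mixture tail probability divided by the nominal normal one."""
    cutoff = FITTED.inv_cdf(1.0 - nominal_p)  # cutoff the fitted normal implies
    return mixture_upper_tail(cutoff) / nominal_p


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001, 0.0001):
        print(f"nominal p = {p:g}: actual / nominal = {tail_ratio(p):.2f}")
```

Under these assumed parameters the ratio stays within a modest factor of 1 at nominal levels of 0.05 and 0.01, but the fitted normal understates the tail several-fold at 0.001 and by more than an order of magnitude at 0.0001. A different contamination model would move the numbers, but not the direction of the effect.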
Statisticians, even eminent ones such as yourself and lesser lights
such as myself, frequently fail to keep this in mind. We accept such
assumptions as "normality", "equal variances", etc., on an
"eyeballometric" basis, without any quantitative understanding of
what this means about limitations on inference, including P-values.
Inference in statistics is much cruder and more judgmental than we
like to portray. We should at least be honest among ourselves about
the degree to which our hand-waving assumptions work.
I remember at the O. J. Simpson trial, the DNA expert asserted that a
match would occur only once in 7 billion people. I wondered at the
time how you could evaluate such an assertion, given there were fewer
than 7 billion people on earth at the time.
At a conference on optical disk memories, held while they were still
being developed, I heard a talk about validating disk specifications
against production. One statement was that the company would also
validate the "undetectable error rate" specification of 1 in 10^16
bits. Amused, I asked how they planned to validate the
"undetectable" error rate. The response was handwaving and "Just as
we do everything else". The audience laughed, and the speaker didn't
seem to know what the joke was.
In both these cases the values were calculable, but that didn't mean
that they applied to reality.
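A rough sense of scale for the disk example, as a sketch in Python: it sets aside the question of how an "undetectable" error could be counted at all, and assumes instead that errors are independent and could somehow be observed. If n bits are checked and no errors are found, the one-sided upper confidence bound p on the error rate solves (1 - p)^n = 1 - confidence. The function name and the 95% confidence level are illustrative choices.

```python
import math


def error_free_bits_needed(target_rate, confidence=0.95):
    """Bits that must be checked with zero errors observed so that the
    one-sided upper confidence bound on the error rate is target_rate.

    Assumes independent, identically distributed bit errors.
    Solves (1 - target_rate) ** n = 1 - confidence for n.
    """
    # log1p keeps precision when target_rate is as small as 1e-16.
    return math.log(1.0 - confidence) / math.log1p(-target_rate)


if __name__ == "__main__":
    n = error_free_bits_needed(1e-16)
    print(f"error-free bits needed: {n:.3g}")
```

For a target of 1 in 10^16 this comes to about 3 x 10^16 error-free bits, the familiar "rule of three" (n is roughly 3 / p at 95% confidence), and that is under the most favorable assumptions.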
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com
Least Cost Formulations, Ltd. URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239 Fax: 757-467-2947
"Vere scire est per causas scire"