[R] Inference for R Spam

Rolf Turner r.turner at auckland.ac.nz
Thu Mar 5 00:43:45 CET 2009


On 5/03/2009, at 12:13 PM, Bert Gunter wrote:

>
> "The purpose of the subject or discipline ``statistics'' is in essence
> to answer the question ``could the phenomenon we observed have arisen
> simply by chance?'', or to quantify the *uncertainty* in any estimate
> that we make of a quantity."
>
>
> May I take strong issue with this characterization? It is far too
> narrow and constraining. We are scientists first and foremost. The
> most important and useful thing I do is to collaborate with other
> scientists to frame good questions, design good experiments and
> studies, and gain insight into the results of those experiments and
> studies (usually via graphical displays, for which R is superbly
> suited). Blessing data with P-values is rarely of much importance,
> and is often frankly irrelevant and even misleading (but that's
> another rant).
>
> George Box said this much better than I: "The business of the
> statistician is to catalyze the scientific learning process."
>
> This is much much more than you intimate.

I must respectfully disagree.  Far be it from me to argue with George
Box, but nevertheless ... it may be the statistician's *business* to
catalyze the scientific learning process, but that is the business of
*any* scientist.  What we bring to the process is our understanding of
the essentials of statistics, just as the chemist brings her
understanding of the essentials of chemistry and the biologist her
understanding of the essentials of biology.

The essentials of statistics consist in answering the question
``could this phenomenon have arisen by chance?''  This is where we
contribute in a way that other scientists do not.  They don't
understand variability, the poor dears.  (Unless they have been well
taught and thereby have become in part statisticians themselves.)
They have a devastating tendency to treat an estimated regression line
as *the* regression line, the truth.  And so on.

The *way* we address the question of ``could it have happened by
chance'' and the way we address the problem of quantifying variability
is indeed open to a broad range of techniques, including graphics.

Note that I did not say word one about p-values.  The example I gave
was a scientific question: is there a difference in the home field
advantage between the English Premier Division and the equivalent
division or league in Italy?  How much of a difference?  You may wish
to throw in a p-value, or you may not.  You will probably wish to look
at a confidence interval.  You may wish to look at the question from
the point of view of the distribution of (home) - (away) differences,
in which case graphics will most certainly help.  But it comes down to
answering the basic question.  If you have no ability to answer such
questions you are not, or might as well not be, a statistician.
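
To make the confidence-interval idea concrete, here is a minimal sketch
in Python of one way such an interval might be obtained: a percentile
bootstrap for the difference in mean (home) - (away) goal difference
between two leagues.  Everything in it is an assumption made for
illustration.  The function name bootstrap_diff_ci, the variable names,
and above all the numbers are invented and are not real match results;
the percentile bootstrap is just one of several reasonable techniques.

```python
import random
import statistics


def bootstrap_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for mean(a) - mean(b).

    Hypothetical helper written for this illustration only.
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = []
    for _ in range(n_boot):
        # Resample each league's matches with replacement.
        resample_a = [rng.choice(a) for _ in a]
        resample_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    point = statistics.fmean(a) - statistics.fmean(b)
    return point, lower, upper


# INVENTED per-match (home goals) - (away goals) values, illustration only.
premier_diffs = [1, 0, 2, -1, 1, 0, 3, -2, 1, 0, 1, 2]
serie_a_diffs = [0, 1, 1, -1, 0, 2, 0, -1, 1, 0, 0, 1]

point, lower, upper = bootstrap_diff_ci(premier_diffs, serie_a_diffs)
print(f"estimated difference: {point:.2f}")
print(f"95% bootstrap interval: ({lower:.2f}, {upper:.2f})")
```

An interval that comfortably straddles zero would suggest that the
observed difference could well have arisen by chance; a plot of the two
sets of differences would be the natural graphical companion.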

	cheers,

		Rolf Turner

