[R] Off Topic: Statistical "philosophy" rant

Mulholland, Tom Tom.Mulholland at dpi.wa.gov.au
Thu Jan 13 03:45:42 CET 2005


I have often noted that "statistics can't prove a damn thing, but they can be really useful in disproving something." Having spent most of the '80s and half of the '90s at the Australian Bureau of Statistics finding out how these numbers are actually collected, I am disconcerted at the apparent disregard for measurement issues such as bias, input error, questionnaire design, etc. ... Science wars ... the real world ... and the not-so-real world. Having only recently discovered what our esteemed J Baron does, I should say that a lot of his work requires us to ask how we use (abuse?) the tools we have.

Having said that, some of my most influential work has come from data exploration within fields where I would describe myself as a complete novice. Using only the phrase "the data seem to indicate" relationship x with y, or some variant, and asking whether this is an accepted norm has produced some unexpected paradigm shifts.

Someone on the list has a signature line along the lines of "All models are wrong, but some are useful," usually attributed to George Box. As most of us know, some of the advice on this list is more sage than the rest.

All of which is to say that the manner in which we deal with non-model uncertainty affects the degree to which we do a disservice to science, and to ourselves. I think you are being unduly pessimistic, but then again I might just be a cynic masquerading as a realist.
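
Bert's point about exploration (quoted below) is easy to demonstrate. Here is a minimal R sketch, entirely of my own construction with arbitrary numbers: generate a response that is unrelated to ten candidate predictors, "explore" by keeping whichever predictor fits best, and then report its nominal p-value as though it were the only test ever run.

## Post-selection inference sketch: explore to pick the "best" of
## several pure-noise predictors, then report its nominal p-value.
set.seed(42)
n    <- 50      # observations
p    <- 10      # candidate predictors, all unrelated to y
reps <- 2000    # simulation replicates

selected.p <- replicate(reps, {
  X <- matrix(rnorm(n * p), n, p)
  y <- rnorm(n)                     # y is independent of every column of X
  pvals <- apply(X, 2, function(x)
    summary(lm(y ~ x))$coefficients[2, 4])
  min(pvals)                        # "exploration": keep the best-looking fit
})

## The nominal 5% test fires far more often than 5% after selection:
mean(selected.p < 0.05)

With ten independent looks, the chance of seeing a nominal p < 0.05 is about 1 - 0.95^10, i.e. roughly 40%, so the reported "model uncertainty" is a considerable underestimate, exactly as Bert says.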

Tom

> -----Original Message-----
...
> That's a perceptive remark, but I would go further... You mentioned
> **model** uncertainty. In fact, in any data analysis in which we
> explore the data first to choose a model, fit the model (parametric or
> non..), and then use whatever (pivots from parametric analysis;
> bootstrapping; ...) to say something about "model uncertainty," we're
> always kidding ourselves and our colleagues, because we fail to take
> into account the considerable variability introduced by our initial
> subjective exploration and subsequent choice of modelling strategy. One
> can only say (at best) that the stated model uncertainty is an
> underestimate of the true uncertainty -- and very likely a considerable
> underestimate, because of the subjectivity of the model choice.
> 
> Now I in no way wish to discourage or abridge data exploration; only to
> point out that we statisticians have promulgated a self-serving and
> unrealistic view of the value of formal inference in quantifying true
> scientific uncertainty when we do such exploration -- and that there is
> therefore something fundamentally contradictory in our own rhetoric and
> methods. Taking a larger view, I think this remark is part of the
> deeper epistemological issue of characterizing what can be
> scientifically "known" or, indeed, defining the difference between
> science and art, say. My own view is that scientific certainty is a
> fruitless concept: we build models that we benchmark against our
> subjective measurements of "reality" (as the measurements themselves
> depend on earlier scientific models). Insofar as data can limit or
> support our flights of modeling fancy, they do; but in the end, it is
> neither an objective process nor one whose "uncertainty" can be
> strictly quantified. In creating the illusion that "statistical
> methods" can overcome these limitations, I think we have both done
> science a disservice and relegated ourselves to an isolated, fringe
> role in scientific inquiry.
> 
> Needless to say, opposing viewpoints to such iconoclastic remarks are
> cheerfully welcomed.
> 
> Best regards,
> 
> Bert Gunter
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



