[R] removing outlier

Bert Gunter bgunter.4567 at gmail.com
Sun Sep 13 16:33:53 CEST 2015


... and this, of course, is a nice example of how statistics
contributes to the "irreproducibility crisis" now roiling Science.

Cheers,
Bert

(Quote from a long ago engineering colleague: "Whenever I see an
outlier, I never know whether to throw it away or patent it.")


Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll


On Sat, Sep 12, 2015 at 9:52 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Sep 12, 2015, at 2:32 AM, Juli wrote:
>
>> Hi Jim,
>>
>> thank you for your help. :)
>>
>> My point is that there are outliers and I don't really know how to deal with
>> them.
>>
>> I need the data frame for a regression and have often read that only a few outliers
>> can change your results very much. In addition, the regression diagnostics
>> didn't indicate the best results.
>> Yes, and I know it's not the point of statistics to work in a way that gets you
>> the results you would like to have ;).
>>
>> So what is your suggestion?
>>
>> And if I remove the outliers, my problem is that, as you said, they differ
>> in length. I need the data frame for a regression, so can I remove the whole
>> column or is there a call to exclude the data?
>
> Most regression methods have a 'subset' parameter which would allow you to distort the data to your desired specification. But why not think about examining a different statistical model or using robust methods? That way you can keep all your data. (Sounds like you don't really have a lot.)
>
> --
> David.
>>
>> JULI
>>
>>
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/removing-outlier-tp4712137p4712170.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
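
A minimal sketch of the two options David mentions above: the 'subset' argument of lm(), and a robust fit that keeps every row. Everything specific here is an assumption for illustration and not from the thread: the simulated data frame `dat`, the single planted outlier, the flagging rule (absolute standardized residual of 2 or more), and the choice of MASS::rlm() as the robust method, since David did not name one.

```r
library(MASS)  # recommended package shipped with R; provides rlm()

# Hypothetical data: a straight line with small noise and one planted outlier
set.seed(1)
dat <- data.frame(x = 1:20)
dat$y <- 2 + 0.5 * dat$x + rnorm(20, sd = 0.3)
dat$y[20] <- 30

# Ordinary least squares on all rows
fit_all <- lm(y ~ x, data = dat)

# Option 1: exclude rows via 'subset'. The data frame itself is untouched,
# so all columns keep the same length. The cutoff of 2 is an arbitrary,
# illustrative choice, not a recommendation.
keep <- abs(rstandard(fit_all)) < 2
fit_subset <- lm(y ~ x, data = dat, subset = keep)

# Option 2: keep all rows and downweight large residuals instead
# (Huber M-estimation, the default in rlm())
fit_robust <- rlm(y ~ x, data = dat)

# Compare intercept and slope across the three fits
round(rbind(all    = coef(fit_all),
            subset = coef(fit_subset),
            robust = coef(fit_robust)), 3)
```

Because 'subset' only selects rows at fitting time, nothing has to be deleted from the data frame, and the excluded rows stay available for reporting. Whichever route is taken, the rule for excluding or downweighting points should be stated alongside the results.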
