[R] Removing Outliers Function

Carl Witthoft carl at witthoft.com
Wed Feb 9 23:31:49 CET 2011


To answer part 2: you should read up on statistical distributions and
on when a sample size is (or isn't) large enough to produce reliable
estimates of statistical parameters such as the mean or variance. I
suspect David was implying that your yardstick, based on studentized
residuals, removes valid samples.

I once wrote a simple bit of code (back when I had to do things in C
rather than R :-(  ) that removed data points lying more than N*sigma
from the current fit, where N was 3 or 4.  Even that is sloppy, as it
doesn't take the sample size or other fit parameters into account, but
it's a lot easier than your setup.
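
For illustration, here is a minimal sketch of that kind of N*sigma
filter in R. It is not the original C code: the function name
sigma_clip, its arguments, the straight-line lm() fit, and the
refit-until-stable loop are all assumptions made for the example.

```r
# Hypothetical helper: repeatedly fit y ~ x on the points still kept and
# drop any kept point whose residual exceeds n_sigma standard deviations
# of the kept residuals. Returns a logical vector (TRUE = keep).
sigma_clip <- function(x, y, n_sigma = 3, max_iter = 10) {
  d <- data.frame(x = x, y = y)
  keep <- rep(TRUE, nrow(d))
  for (i in seq_len(max_iter)) {
    fit <- lm(y ~ x, data = d[keep, ])          # fit on surviving points only
    res <- d$y - as.numeric(predict(fit, newdata = d))
    sigma <- sd(res[keep])                      # spread of surviving residuals
    new_keep <- keep & (abs(res) <= n_sigma * sigma)
    if (identical(new_keep, keep)) break        # nothing more was removed
    keep <- new_keep
  }
  keep
}

# Made-up data: a straight line plus noise, with one gross outlier planted.
set.seed(1)
x <- 1:50
y <- 2 * x + rnorm(50)
y[25] <- y[25] + 100
keep <- sigma_clip(x, y)
which(!keep)   # indices of the points that were dropped
```

Note that the threshold is the same whatever the sample size, which is
exactly the sloppiness mentioned above: with few points, a single bad
value inflates sigma enough that it may never be flagged.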


Carl


<quote>
From: kirtau <kirtau_at_live.com>
Date: Wed, 09 Feb 2011 10:06:07 -0800 (PST)

I have two questions,

    1. If the solution is only three or four lines of code, is there
any way you can share those lines, without disrespecting me further?
    2. Can you explain why you feel that this is "statistical malpractice"
</quote>


