[R] how to remove outliers

S Ellison S.Ellison at LGCGroup.com
Tue Jul 15 18:04:56 CEST 2014


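The outlier is the maximum value, so which.max() is likely to be useful:

time <- time[-which.max(time$TimeDiff), ]

This should work reliably, because it locates the row by position rather than by comparing floating-point values for equality.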
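
For illustration, here is a minimal sketch on made-up data. The data frame below is hypothetical and only stands in for the poster's `time` object; the outlier is stored with a tiny offset so that it prints as 14478.4 at the default 7 digits but is not exactly equal to that value.

```r
# Hypothetical data standing in for the poster's `time` data frame
time <- data.frame(id = 1:5,
                   TimeDiff = c(12.5, 30.25, 14478.4 + 1e-9, 7.75, 101))

# Exact comparison against the printed value matches no rows here
value <- 14478.4
n_exact <- sum(time$TimeDiff == value)

# which.max() gives the position of the first maximum;
# a negative index drops that row
cleaned <- time[-which.max(time$TimeDiff), ]

nrow(cleaned)          # 4 rows remain
max(cleaned$TimeDiff)  # largest remaining value is 101
```

Two caveats: which.max() returns only the first maximum, so tied maxima are not all removed, and if TimeDiff is empty or entirely NA it returns integer(0), in which case the negative index selects no rows at all. Checking the length of the result before subsetting guards against the latter.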


S Ellison

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Bert Gunter
> Sent: 15 July 2014 11:38
> To: Hasan Diwan
> Cc: R Project Help
> Subject: Re: [R] how to remove outliers
> 
> No! Do not do this.
> 
> First, the syntax is wrong. Second, this will fail in general due to floating point
> arithmetic. Use inequality with sufficient fuzz instead.
> 
> e.g.
> time <- time[time$TimeDiff < 14478,]
> 
> Moral: Caveat Emptor. Free advice may be worth exactly that.
> 
> Cheers,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge is
> certainly not wisdom."
> Clifford Stoll
> 
> 
> 
> 
> On Mon, Jul 14, 2014 at 9:43 PM, Hasan Diwan <hasan.diwan at gmail.com> wrote:
> > Marta,
> > To remove a row from your data frame, use:
> >
> > value <- 14478.4
> > time <- time[-time[$TimeDiff] == value,]
> >
> > I hope that helps... If not, do push back. -- H
> >
> >
> > On 14 July 2014 09:17, Marta valdes lopez <martavaldes85 at gmail.com> wrote:
> >
> >> Hi!
> >>
> >> I ran this test and found an outlier. I would like to remove the
> >> whole row containing it from my database. Does anyone know how I can do that?
> >>
> >> chisq.out.test(time$TimeDiff)
> >>
> >>         chi-squared test for outlier
> >>
> >> data:  time$TimeDiff
> >> X-squared = 73260.07, p-value < 2.2e-16
> >> alternative hypothesis: highest value 14478.4 is an outlier
> >>
> >> Thank you!!
> >>
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>

