[R] mild and extreme outliers in boxplot

Rnewbie xuancj at yahoo.com
Thu Aug 20 01:27:01 CEST 2009


I read the boxplot() help file and googled before posting, but with my
limited knowledge of R I was not able to produce the plot I wanted. That’s
why I posted. Whether or not I eventually solve the problem, I appreciate
any help very much.

I’m very much a beginner with R, and found the R help forum a couple of
weeks ago. I assumed that I’m not among the major players of the forum and
that, as in any other public online forum, the post rather than the poster
is what matters, so I just registered with an arbitrarily chosen ID and kept
using it. I hope I haven't violated any rules by doing so. I’m not using
R-help for any commercial purpose whatsoever. I’m a master’s student working
on my thesis.

Thanks all for your help.

Jimmy



Gavin Simpson wrote:
> 
> On Wed, 2009-08-19 at 13:49 -0700, Bert Gunter wrote:
>> Rolf:
>> 
>> Not sure what "reasonably thorough" means but:
>> 
>> ?boxplot says:
> 
> Exactly Bert, the info is there if you want to look and do so hard
> enough. However, it is perhaps expecting quite a lot of a new useR to
> put this together from ?boxplot, ?bxp and ?boxplot.stats.
> 
> Criticising correct, if cryptic or high-level, responses on a list where
> people give their time for free, *and* not providing a more complete
> solution, is unfair, Rolf. The OP is free to respond and ask for
> additional help once they've given it a go if they are still having
> trouble.
> 
> One solution, if you are prepared to bastardise the standard
> interpretation of the boxplot, is to compute the relevant boxplot
> statistics using boxplot.stats and alter argument 'coef' to some larger
> multiple of the box height to represent "extreme" outliers, whatever
> those might be. So here's the rope, try not to hang yourself 'Rnewbie'!
> 
> set.seed(1234)
> dat <- rt(100, df = 2)
> bxp1 <- boxplot.stats(dat)
> bxp2 <- boxplot.stats(dat, coef = 2)
> 
> ##Then you'd need to plot the boxplot without outliers
> 
> boxplot(dat, outpch = NA)
> 
> ##Then plot the points 1.5-2 x box height
> 
> want <- bxp1$out %in% bxp2$out
> out <- bxp1$out
> out[want] <- NA
> 
> points(rep(1, length(out)), out, pch = 1, col = "blue")
> 
> ##Then the further outliers
> 
> outout <- bxp2$out
> points(rep(1, length(outout)), outout, pch = 2, col = "red")
> 
> How one decides what is an outlier or an extreme outlier is another
> matter...? By chance the dummy data here shows one problem; there isn't
> much difference between 'outliers' and 'extreme outliers' towards the
> bottom of the resulting plot so why should we distinguish them?
> 
> (By the way 'Rnewbie', this isn't something I recommend you do, but you
> might know more about your real world use case than I.)
> 
> HTH
> 
> G
> 
> PS: is there a reason why you post anonymously, 'Rnewbie'? Do you not
> want us to know who you are, but want our help?
> 
>> 
>> ...
>> pars    a list of (potentially many) more graphical parameters, e.g.,
>>         boxwex or outpch; these are passed to bxp (if plot is true);
>>         for details, see there.
>> 
>> 
>> Well, that seems pretty clear to me, so I went to ?bxp to find in the
>> pars listing:
>> 
>> outlty, outlwd, outpch, outcex, outcol, outbg:
>>         outlier line type, line width, point character, point size
>>         expansion, color, and background color. The default
>>         outlty = "blank" suppresses the lines and outpch = NA
>>         suppresses points.
>> 
>> 
>> It seems to me that this (and other omitted excerpts + examples) is at
>> least a reasonable answer to the query (allowing the reader to at least
>> infer that bxp does not distinguish degrees of outlyingness), so I don't
>> understand your criticism. Feel free to respond privately if you prefer.
>> 
>> -- Bert
>> 
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Rolf Turner
>> Sent: Wednesday, August 19, 2009 1:27 PM
>> To: ottorino-luca.pantani at unifi.it
>> Cc: Rnewbie; ERRE
>> Subject: Re: [R] mild and extreme outliers in boxplot
>> 
>> 
>> On 20/08/2009, at 3:13 AM, Ottorino-Luca Pantani wrote:
>> 
>> > Rnewbie wrote:
>> >> dear all,
>> >>
>> >> could somebody tell me how I can plot mild outliers as a circle (°)
>> >> and extreme outliers as an asterisk (*) in a box-whisker plot?
>> >>
>> >> Thanks very much in advance
>> >>
>> > ?boxplot
>> >
>> > or
>> >
>> > help(bxp)
>> 
>> This is the sort of response that gives R-help a bad name.
>> 
>> I had a reasonably thorough look at these help files and saw
>> ***nothing*** that would answer the OP's question.  The information may
>> be there --- I'm not sure about this --- but it is far from obvious.
>> Explicit reference to the appropriate lines of the help file(s) would
>> be useful.
>> 
>> 	cheers,
>> 
>> 		Rolf Turner
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> 
> 
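A hedged follow-up sketch of the approach quoted above, using the circle and
asterisk symbols asked for in the original question. It assumes the common
convention of 1.5 and 3 times the box height as the cut-offs for mild and
extreme outliers; those two multipliers and the variable names (mild_coef,
extreme_coef, out_mild, out_extreme) are illustrative assumptions, not
something boxplot() provides. boxplot.stats(), boxplot(), points() and the
outpch parameter are the same ones used in the quoted code.

```r
# Sketch only: the cut-offs and names below are assumptions for illustration.
set.seed(1234)
dat <- rt(100, df = 2)

mild_coef    <- 1.5  # default whisker multiplier used by boxplot()
extreme_coef <- 3    # assumed cut-off for "extreme" outliers

# Points beyond each set of fences
out_all     <- boxplot.stats(dat, coef = mild_coef)$out
out_extreme <- boxplot.stats(dat, coef = extreme_coef)$out

# Mild outliers are those beyond the 1.5 fences but not beyond the 3 fences
out_mild <- out_all[!(out_all %in% out_extreme)]

# Draw box and whiskers only, suppressing the default outlier symbols
boxplot(dat, outpch = NA)

# Mild outliers as open circles (pch = 1), extreme ones as asterisks (pch = 8)
points(rep(1, length(out_mild)), out_mild, pch = 1)
points(rep(1, length(out_extreme)), out_extreme, pch = 8)
```

Changing extreme_coef changes which points get the asterisk; as noted in the
quoted reply, whether such a distinction is meaningful depends on the data.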

-- 
View this message in context: http://www.nabble.com/mild-and-extreme-outliers-in-boxplot-tp25040545p25053573.html
Sent from the R help mailing list archive at Nabble.com.



