[R] plot - central limit theorem

Greg Snow Greg.Snow at imail.org
Thu Oct 16 17:43:47 CEST 2008


I wonder if including the p-values for the normality test is the best approach in your animation?  The CLT does not say that the distribution of the means will be normal, just that it approaches normality as the sample size grows (and therefore the normal may be a decent approximation).  The normality test can only reject the null that the data (the simulated means) come from a normal distribution; it can never confirm it.  Since the true distribution of the means is not normal (unless the population itself is normal or you use a sample size of Inf, and I for one have better things to do than wait for a computer to simulate several samples of size Inf), the null for the normality test is always false, and therefore the test will always either say the data are not normal or commit a type II error.  The real goal is not to show normality, but to show that using the normal gives a "good enough" approximation.  I would prefer the bottom plot to show either the proportion of p-values from a normal-based test on the simulated data that are less than alpha, or the proportion of confidence intervals from the normal-based test that include the true parameter.  Then the user can see at what sample size those proportions get close enough to their nominal values for the approximation to be acceptable.
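
The second suggestion above (coverage of normal-based confidence intervals) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from clt.ani(): the exponential population (true mean 1), the number of replicates, the grid of sample sizes, and the helper name coverage() are all choices made for this example.

```r
# Sketch: how often does a normal-based 95% CI for the mean cover the
# true mean, as the sample size n grows?  Assumed population: rexp(),
# which is skewed and has true mean 1.
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# coverage() is a hypothetical helper, not part of any package.
coverage <- function(n, m = 2000, rpop = rexp, true_mean = 1, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)            # normal critical value
  hits <- replicate(m, {
    x <- rpop(n)                       # one sample of size n
    half <- z * sd(x) / sqrt(n)        # half-width of the normal-based CI
    abs(mean(x) - true_mean) <= half   # TRUE if the CI covers the true mean
  })
  mean(hits)                           # proportion of the m CIs that cover
}

ns <- c(2, 5, 10, 30, 100, 500)        # sample sizes to compare
cover <- sapply(ns, coverage)

# Coverage against sample size, with the nominal level as a reference line.
plot(ns, cover, log = "x", type = "b", ylim = c(0, 1),
     xlab = "sample size n", ylab = "proportion of CIs covering the true mean")
abline(h = 0.95, lty = 2)
```

Reading the plot, the user can judge at which n the observed coverage is close enough to the nominal 95% for their purposes; swapping in a different rpop (with the matching true_mean) shows how that depends on the population.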

What is your target audience for this demo?  In my opinion, anyone who could understand the bottom plot should already understand the CLT well enough not to need the demo, while those I would aim the demo at would just be confused by the current bottom plot.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Yihui Xie
> Sent: Wednesday, October 15, 2008 10:51 PM
> To: roger koenker
> Cc: r-help
> Subject: Re: [R] plot - central limit theorem
>
> Thanks, Roger, your demo is interesting. I'm thinking about improving
> it later.
>
> I've also made a demo for the CLT in my package 'animation', in which
> there's also normality testing for the sample means, because I don't
> think "bell-shaped" alone means normality - so I performed the
> Shapiro-Wilk test and plotted the P-values under the demo. See the
> function clt.ani() in the package 'animation', or
> http://animation.yihui.name/prob:central_limit_theorem
>
> You can use any function to denote the population (specify the
> argument 'FUN') in clt.ani().
>
> Regards,
> Yihui
> --
> Yihui Xie <xieyihui at gmail.com>
> Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
> Mobile: +86-15810805877
> Homepage: http://www.yihui.name
> School of Statistics, Room 1037, Mingde Main Building,
> Renmin University of China, Beijing, 100872, China
>
>
>
> On Thu, Oct 16, 2008 at 4:22 AM, roger koenker <rkoenker at uiuc.edu> wrote:
> > Galton's 19th century mechanical version of this is the quincunx.  I have a
> > (very primitive) version of this for R at:
> >
> >
> > http://www.econ.uiuc.edu/~roger/courses/476/routines/quincunx.R
> >
> >
> > url:    www.econ.uiuc.edu/~roger            Roger Koenker
> > email    rkoenker at uiuc.edu            Department of Economics
> > vox:     217-333-4558                University of Illinois
> > fax:       217-244-6678                Champaign, IL 61820
> >
> >
> >
> >> Jörg Groß wrote:
> >>>
> >>> Hi,
> >>>
> >>>
> >>> Is there a way to simulate a population with R and pull out m samples,
> >>> each with n values, for calculating m means?
> >>>
> >>> I need that kind of data to plot a graphic, demonstrating the central
> >>> limit theorem, and I don't know how to begin.
> >>>
> >>> So, perhaps someone can give me some tips and hints how to start and
> >>> which functions to use.
> >>>
> >>>
> >>>
> >>> thanks for any help,
> >>> joerg
> >>>
> >
