[R] Physical or Statistical Explanation for the "Funnel" Plot?

Mike Miller mbmiller+l at gmail.com
Fri Mar 27 05:34:55 CET 2009


On Thu, 26 Mar 2009, Jason Rupert wrote:

> The R code below produces (after running for a few minutes on a decent 
> computer) the plot shown at the following location:
>
> http://n2.nabble.com/Is-there-a-physical-and-quantitative-explanation-for-this-plot--td2542321.html
>
> I'm just taking the mean of a given set of random variables, where the 
> set size is increased.  There appears to be a quick convergence and then 
> a pretty steady variance out to a set size of 10,0000.


I don't have time to study your code, but it sounds like you are drawing 
random normal variables with mean 0 and variance 1 and then taking the 
mean of sets of them.  We know the distribution of the mean of the "set" 
(a.k.a. "sample") exactly:  it is normal with mean 0 and variance 1/N, 
where N is the size of the sample.  When you allow N to vary, you produce 
a mixture of normal random variables all having mean 0 but with different 
variances.  The plot you show looks correct -- the distributions in the 
mixture that have small variance pile up in the middle, while those with 
greater variance form the long tails.  You could get a lot of different 
shapes depending on the distribution of N.  But save yourself some time.  
Instead of generating N normal variables and taking their mean, just 
generate one and divide it by sqrt(N) -- that will give you a value with 
*exactly* the same distribution.
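
Here is a rough sketch of that shortcut, in case it helps.  It is not 
based on your code; the seed, the number of replicates, the sample size 
N = 50 and the uniform distribution for N in the mixture are all 
arbitrary choices of mine, made only for illustration.

```r
set.seed(1)       # arbitrary seed, only for reproducibility

n_rep <- 20000    # replicates per method (illustrative choice)
N     <- 50       # sample size (illustrative choice)

# Slow way: draw N standard normals and average them, n_rep times.
slow <- replicate(n_rep, mean(rnorm(N)))

# Fast way: draw one standard normal per replicate, divide by sqrt(N).
fast <- rnorm(n_rep) / sqrt(N)

# Both sample variances should be close to the theoretical 1/N.
print(c(slow = var(slow), fast = var(fast), theory = 1 / N))

# Mixture: let N vary across replicates.  Uniform on 1..1000 is an
# assumption; a different distribution of N gives a different shape.
N_mix <- sample(1:1000, n_rep, replace = TRUE)
mix   <- rnorm(n_rep) / sqrt(N_mix)

# The spread of the mean shrinks like 1/sqrt(N), which gives the funnel.
plot(N_mix, mix, pch = ".", xlab = "N", ylab = "sample mean")
curve( 2 / sqrt(x), from = 1, to = 1000, add = TRUE, col = "red")
curve(-2 / sqrt(x), from = 1, to = 1000, add = TRUE, col = "red")
```

The red curves are at plus and minus two standard deviations of the 
mean, so roughly 95% of the points should fall between them at any 
given N.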

Your graph looks a little weird - first, why turn it sideways?  We 
normally plot density on the ordinate, not on the abscissa.  Second, there 
is a thick black bar on the left, but that seems to be an artifact because 
at least half of it is below zero -- how can that happen?

Mike



