[R] QQ plotting of various distributions...

Eric Thompson ericthompso at gmail.com
Sun Sep 27 23:00:12 CEST 2009


It seems I misunderstood Sunil's response and somewhat freaked out:
it appeared that he was giving the wrong method for making a QQ plot,
but he was actually demonstrating the sampling variability. My
apologies to Sunil.



2009/9/27 Duncan Murdoch <murdoch at stats.uwo.ca>:
> Eric Thompson wrote:
>>
>> The supposed example of a Q-Q plot is most certainly not how to make a
>> Q-Q plot. I don't even know where to start....
>>
>> First off, the two "Q"s in the title of the plot stand for "quantile",
>> not "random". The "answer" supplied simply plots two sorted samples of
>> a distribution against each other. While this may resemble the general
>> shape of a QQ plot, that is where the similarities end.
>>
>
> The empirical quantiles of a sample are simply the sorted values.  You can
> plot empirical quantiles of one sample versus some version of quantiles from
> a distribution (what qqnorm does) or versus empirical quantiles of another
> sample (what Sunil did).  The randomness in his demonstration did two
> things: it generated some data, and it showed the variability of the plot
> under repeated sampling.
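
To make Duncan's point concrete, here is a minimal sketch, assuming two
samples of equal size (the names a and b and the sample size are
illustrative, not from the thread). With equal sample sizes, qqplot()
should return just the two sorted samples, while qqnorm() pairs the data
with qnorm(ppoints(n)):

```r
set.seed(1)
a <- rnorm(30)
b <- rnorm(30)

# Two-sample case: empirical quantiles vs. empirical quantiles.
# plot.it = FALSE returns the plotted coordinates without drawing.
qq2 <- qqplot(a, b, plot.it = FALSE)
same_two_sample <- isTRUE(all.equal(qq2$x, sort(a))) &&
  isTRUE(all.equal(qq2$y, sort(b)))

# One-sample case: empirical quantiles vs. theoretical normal quantiles.
# qqnorm() returns the points in the original data order, so sort first.
qq1 <- qqnorm(a, plot.it = FALSE)
same_one_sample <-
  isTRUE(all.equal(sort(qq1$x), qnorm(ppoints(length(a))))) &&
  isTRUE(all.equal(sort(qq1$y), sort(a)))

print(c(two_sample = same_two_sample, one_sample = same_one_sample))
```

If the samples had different sizes, qqplot() would interpolate the
quantiles of the larger sample, so the first comparison holds only in the
equal-size case assumed here.
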
>>
>> Some general advice: be careful who you take advice from on the
>> internet.
>
> That's good advice.
>
> Duncan Murdoch
>
>> The Wikipedia entry for Q-Q plot may be a good start if you
>> don't know what a Q-Q plot is, although you should also use it with
>> caution.
>>
>> Let's say you have some samples that may be normally distributed:
>>
>> set.seed(1)
>> x <- rnorm(30)
>>
>> # now try with R's built-in function
>> qqnorm(x, xlim = c(-3, 3), ylim = c(-3, 3))
>>
>> # Now try Sunil's "Q-Q plot" method, but for rnorm
>> # rather than rgamma
>> some_data <- x
>> test_data <- rnorm(30)
>> points(sort(some_data),sort(test_data), col = "blue")
>>
>> # Note that the points are NOT the same!
>>
>> This should have been obvious for the simple reason that the QQ plot
>> should not be influenced by the random number generator that you are
>> using! A QQ plot is uniquely reproducible. The more general (and
>> correct) way to get the QQ plot involves choosing a plotting position
>> and the quantile function (e.g. qnorm or qgamma functions in R) of the
>> pertinent distribution:
>>
>> # Sort the data:
>> x.s <- sort(x)
>> n <- length(x)
>>
>> # Plotting position (must be careful here in general!)
>> p <- ppoints(n)
>>
>> # Compute the quantile
>> x.q <- qnorm(p)
>>
>> points(x.q, x.s, col = "red")
>>
>> # and they fall exactly on the points generated by qqnorm().
>>
>> Now, you should be able to generalize this for any distribution. Hope
>> this helps.
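
As a sketch of that generalization, assuming gamma-distributed data with
known parameters (the shape and scale values and the names g, g.s and g.q
are made up for illustration; in practice the parameters would have to be
estimated from the data):

```r
set.seed(1)
shape <- 6   # assumed known here, purely for illustration
scale <- 2
g <- rgamma(30, shape = shape, scale = scale)

g.s <- sort(g)                # empirical quantiles
p   <- ppoints(length(g))     # plotting positions
g.q <- qgamma(p, shape = shape, scale = scale)  # theoretical quantiles

plot(g.q, g.s,
     xlab = "Theoretical gamma quantiles",
     ylab = "Sample quantiles")
abline(0, 1)  # points close to this line are consistent with the model
```

Note that scale is passed by name: the third positional argument of
qgamma() is rate, not scale.
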
>>
>>
>> Eric Thompson
>>
>>
>>
>>
>> 2009/9/27 Petar Milin <pmilin at ff.uns.ac.rs>:
>>
>>>
>>> Thanks for the answer. Now the only problem is to get the parameter(s) of a
>>> given distribution. For gamma, I shall try gammafit() from the mhsmm
>>> package.
>>> I shall also look for other appropriate parameter estimators, and will use
>>> SuppDists too.
>>>
>>> Best,
>>> PM
>>>
>>> Sunil Suchindran wrote:
>>>
>>>>
>>>> #same shape
>>>>
>>>> some_data <- rgamma(500,shape=6,scale=2)
>>>> test_data <- rgamma(500,shape=6,scale=2)
>>>> plot(sort(some_data),sort(test_data))
>>>> # You can also use qqplot(some_data,test_data)
>>>> abline(0,1)
>>>>
>>>> # different shape
>>>>
>>>> some_data <- rgamma(500,shape=6,scale=2)
>>>> test_data <- rgamma(500,shape=4,scale=2)
>>>> plot(sort(some_data),sort(test_data))
>>>> abline(0,1)
>>>>
>>>> It is helpful to assess the sampling variability, by
>>>> creating repeated sets of test_data, and plotting
>>>> all of these along with your observations to create
>>>> a confidence "envelope".
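
The envelope idea described above can be sketched roughly as follows. This
is only an illustration under stated assumptions: the reference
distribution is taken to be gamma with shape 6 and scale 2 (as in Sunil's
example), the number of replicates (n_rep) and the 95% pointwise band are
arbitrary choices, and the names sims, lower, upper and theo are
hypothetical.

```r
set.seed(42)
n     <- 500
n_rep <- 200   # number of simulated test samples (arbitrary choice)

some_data <- rgamma(n, shape = 6, scale = 2)

# Each column holds the sorted values of one simulated sample,
# so row i contains n_rep draws of the i-th order statistic.
sims <- replicate(n_rep, sort(rgamma(n, shape = 6, scale = 2)))

# Pointwise 95% band for each order statistic
lower <- apply(sims, 1, quantile, probs = 0.025)
upper <- apply(sims, 1, quantile, probs = 0.975)

# Theoretical quantiles on the x-axis
theo <- qgamma(ppoints(n), shape = 6, scale = 2)

plot(theo, sort(some_data),
     xlab = "Theoretical gamma quantiles",
     ylab = "Sample quantiles")
lines(theo, lower, lty = 2)
lines(theo, upper, lty = 2)
abline(0, 1)
```

The band is pointwise, not simultaneous, so a few observed points falling
just outside it are not by themselves strong evidence against the model.
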
>>>>
>>>> The SuppDists package provides the inverse Gaussian distribution.
>>>>
>>>>
>>>> On Thu, Sep 17, 2009 at 11:46 AM, Petar Milin <pmilin at ff.uns.ac.rs>
>>>> wrote:
>>>>
>>>>   Hello!
>>>>   I am trying this question again:
>>>>   I would like to test a few distributional assumptions for some
>>>>   behavioral response data. There are a few theories about the true
>>>>   distribution of those data: normal, lognormal, gamma,
>>>>   ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian), etc. The
>>>>   best way to show students the differences would be via Q-Q plots.
>>>>   The first two are trivial:
>>>>   qqnorm(dat$X)
>>>>   qqnorm(log(dat$X))
>>>>   Then things get more "hairy". I am not sure how to make
>>>>   plots for the rest. I tried gamma with:
>>>>   library(lattice)  # provides qqmath()
>>>>   # scale must be named: qgamma's third positional argument is rate
>>>>   qqmath(~ X, data=dat, distribution=function(p)
>>>>     qgamma(p, shape = shape, scale = scale))
>>>>   Which should be the same as:
>>>>   plot(qgamma(ppoints(dat$X), shape = shape, scale = scale), sort(dat$X))
>>>>   I got the shape and scale parameters via the mhsmm package, which has
>>>>   gammafit() for estimating them.
>>>>   Am I on the right track? Does anyone know how to plot the rest:
>>>>   ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian)?
>>>>
>>>>   Thanks,
>>>>   PM
>>>>
>>>>   ______________________________________________
>>>>   R-help at r-project.org mailing list
>>>>   https://stat.ethz.ch/mailman/listinfo/r-help
>>>>   PLEASE do read the posting guide
>>>>   http://www.R-project.org/posting-guide.html
>>>>   and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>



