[R] QQ plot

Michael Dewey
Wed Nov 13 12:46:58 CET 2019


Dear Ana

As others have commented, this is getting a bit off-topic, but here are 
some hints.

It is helpful to distinguish two sorts of plot: archival plots and 
impact plots. If you want an impact plot, which gives you a picture 
but possibly at the cost of completeness and accuracy, then why not:

1 - plot a sample of your 5 million drawn at random
2 - bin the data and plot the median p-value against the median expected 
p-value in each bin
3 - deal with overlap by choosing a graphical device which supports 
transparency and plot points in very light grey so the overlap is more 
visible.
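
The three hints above could be sketched in base R roughly as follows. 
This is only an illustration under stated assumptions: the p-values are 
simulated from a uniform distribution as a stand-in for dd$P, the helper 
qq_coords() is a hypothetical name introduced here, and the sample size 
and number of bins are arbitrary choices.

```r
set.seed(42)                         # reproducible illustration only
p <- runif(1e5)                      # simulated null p-values standing in for dd$P

# Q-Q coordinates on the -log10 scale, most significant point first
# (qq_coords is a hypothetical helper, not part of qqman)
qq_coords <- function(p) {
  n <- length(p)
  data.frame(expected = -log10(ppoints(n)),
             observed = -log10(sort(p)))
}

# 1 - a random sample of the p-values
samp <- qq_coords(sample(p, 5000))

# 2 - bin the full data along the expected axis, keep per-bin medians
full <- qq_coords(p)
bin  <- cut(full$expected, breaks = 100)
binned <- data.frame(expected = as.vector(tapply(full$expected, bin, median)),
                     observed = as.vector(tapply(full$observed, bin, median)))
binned <- binned[!is.na(binned$expected), ]   # drop empty bins

# 3 - semi-transparent grey points so that overlap shows up as darker regions
#     (needs a graphics device that supports transparency, e.g. pdf or png)
plot(samp$expected, samp$observed, pch = 16,
     col = rgb(0.5, 0.5, 0.5, alpha = 0.2),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
points(binned$expected, binned$observed, pch = 16, cex = 0.6)
abline(0, 1, col = "red")            # reference line y = x
```

Note that a simple random sample will usually miss the few most extreme 
p-values, so hint 1 on its own trades away the tail of the plot; the 
binned medians are computed from the full data.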

Michael

On 12/11/2019 22:04, Ana Marija wrote:
> The reason I selected only those with P<0.001 to put on the QQ plot is
> that the original data set contains 5556249 points, and when I extract
> only P<0.001 I get 3713 points. Is there a way to plot the whole
> data set, or to choose only representative points?
> 
> On Tue, Nov 12, 2019 at 3:42 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>
>> The smallest p-value in my dataset is 9.89e-08. How do I show that
>> on the new QQ plot, where the values are multiplied by 1000?
>>
>> On Tue, Nov 12, 2019 at 3:37 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>
>>> Just one question: do I need to change the axes when I multiply by
>>> 1000, and what should I put on my axes?
>>>
>>> On Tue, Nov 12, 2019 at 3:07 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>>
>>>> Hi Duncan,
>>>>
>>>> Yes, I chose only P<1e-3 for the QQ plot, and multiplying everything
>>>> by 1000 works great!
>>>> In my understanding this should not influence the interpretation of
>>>> the plot; it only changes the scale of the axes.
>>>>
>>>> Thank you so much,
>>>> Ana
>>>>
>>>> On Tue, Nov 12, 2019 at 2:51 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>>>
>>>>> On 12/11/2019 2:56 p.m., Jim Lemon wrote:
>>>>>> I thought about this and did a little study of GWAS and the use of
>>>>>> p-values to assess significant associations. As Ana's plot begins at
>>>>>> values of about 0.001, this seems to imply that almost everything in
>>>>>> the genome is associated to some degree. One expects that most SNPs
>>>>>> will not be associated with a particular condition (p~1), so perhaps
>>>>>> something is going wrong in the calculations that produce the
>>>>>> p-values.
>>>>>
>>>>> I may be misunderstanding your last sentence, but if there is no
>>>>> association, the p-value would usually have a uniform distribution from
>>>>> 0 to 1, it wouldn't be near 1.
>>>>>
>>>>> I'd guess we're not seeing the p values from every test, only those that
>>>>> are less than 0.001.  If that's true, and there are no effects, it makes
>>>>> sense to multiply all of them by 1000 to get U(0,1) values.  On the
>>>>> plot, that would correspond to subtracting 3 from -log10(p), or adding 3
>>>>> to the reference line, as Ana requested.
>>>>>
>>>>> Or just multiply them by 1000 and pass them to qq():
>>>>>
>>>>>       qq(dd$P*1000, main = "Q-Q plot of small GWAS p-values")
>>>>>
>>>>> As far as I can see, there's no way to tell qqman::qq to move the
>>>>> reference line.
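
A minimal base-R sketch of the rescaling Duncan describes, drawn without 
qqman so that the axes can be put back on the original scale. It assumes 
simulated null p-values in place of dd$P and that only values below a 
cutoff of 0.001 are kept; the variable names are illustrative.

```r
set.seed(1)                          # reproducible illustration only
p_all   <- runif(1e6)                # simulated null p-values (assumption)
cutoff  <- 1e-3
p_small <- p_all[p_all < cutoff]     # what survives the P < 0.001 filter

# Under the null the survivors are U(0, cutoff), so dividing by the cutoff
# (i.e. multiplying by 1000) gives U(0, 1) values
p_scaled <- p_small / cutoff

n <- length(p_small)
expected_scaled <- -log10(ppoints(n))
observed_scaled <- -log10(sort(p_scaled))

# Shifting both axes by -log10(cutoff) = 3 restores the original scale,
# so the smallest real p-values can be read off the plot directly
shift <- -log10(cutoff)
plot(expected_scaled + shift, observed_scaled + shift, pch = 16,
     xlab = "Expected -log10(p), given p < 0.001",
     ylab = "Observed -log10(p)")
abline(0, 1, col = "red")            # reference line starts at (3, 3)
```

With this layout a p-value of 9.89e-08 would appear at about 7 on the 
vertical axis, and the reference line begins where the points begin.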
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
>>>>>> <malone using malonequantitative.com> wrote:
>>>>>>>
>>>>>>> I agree with Abby. That would defeat the purpose of a QQ plot.
>>>>>>>
>>>>>>> On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle <spurdle.a using gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> I'm not familiar with the qqman package, or GWAS studies.
>>>>>>>> However, my guess would be that you're *not* supposed to change the
>>>>>>>> position of the line.
>>>>>>>>
>>>>>>>> On Tue, Nov 12, 2019 at 11:48 AM Ana Marija <sokovic.anamarija using gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I was using this library, qqman
>>>>>>>>> https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
>>>>>>>>>
>>>>>>>>> to create the attached QQ plot. How would I change the default
>>>>>>>>> abline so that it starts from the beginning of my QQ line?
>>>>>>>>>
>>>>>>>>> This is my code:
>>>>>>>>> qq(dd$P, main = "Q-Q plot of GWAS p-values")
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Ana
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.

-- 
Michael
http://www.dewey.myzen.co.uk/home.html


