[BioC] plotpfm has been removed in PWMEnrich?

Robert Stojnic rainmansr at gmail.com
Wed Dec 11 23:22:00 CET 2013


Dear Fabrice,

pm <- PFMtoPWM(pfm)
plot(pm)

will plot the logo with correct heights (it will use the calculation 
I've described below). More advanced options, such as changing units 
(e.g. from bits to nats), are not supported. If you want to create a 
custom plot you will have to do it from scratch. Look at the seqLogo 
package source, or at a similar function, PWMEnrich:::seqLogoGrid, in PWMEnrich.
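
For the "from scratch" route, here is a minimal sketch in base R, assuming a
DNA count matrix with rows A, C, G, T. The helper name logo_heights and its
base argument are made up for illustration and are not part of PWMEnrich or
seqLogo. Unlike the snippet quoted below, it treats 0 * log(0) as 0 so that
zero counts do not give NaN. It draws stacked bars rather than letters, but
the y-axis and the units are under your control:

```r
# Hypothetical helper (not part of PWMEnrich or seqLogo): letter heights
# for a DNA count matrix with rows A, C, G, T and one column per position.
logo_heights <- function(pfm, base = 2) {
  # convert counts to per-position probabilities
  p <- t(t(pfm) / colSums(pfm))
  # treat 0 * log(0) as 0 so that zero counts do not produce NaN
  plogp <- ifelse(p > 0, p * log(p, base = base), 0)
  # Shannon entropy of each position, in units set by 'base'
  h <- -colSums(plogp)
  # information content: maximum entropy for 4 letters minus observed entropy
  ic <- log(4, base = base) - h
  # scale each letter's probability by the information content of its column
  t(t(p) * ic)
}

# toy count matrix (made-up numbers), filled column by column
toy <- matrix(c(10, 0, 0, 0,
                 5, 5, 0, 0,
                 3, 3, 2, 2),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

heights_bits <- logo_heights(toy)                 # base 2: bits
heights_nats <- logo_heights(toy, base = exp(1))  # natural log: nats

# stacked bars, one per position, with an explicit y-axis range
if (interactive()) {
  barplot(heights_bits, ylim = c(0, 2), ylab = "bits",
          names.arg = seq_len(ncol(heights_bits)),
          col = c("green", "blue", "orange", "red"), legend.text = TRUE)
}
```

The nats version is simply the bits version multiplied by log(2), so only the
scale of the y-axis changes, not the relative letter heights.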

Cheers, Robert

On 11/12/13 21:01, Fabrice Tourre wrote:
> pm <- PFMtoPWM(pfm)
> pm@pwm <- heights
> plot(pm)
>
>
> This plot is not good, because we cannot control the y-axis.
>
> On Wed, Dec 11, 2013 at 3:44 PM, Fabrice Tourre <fabrice.ciup at gmail.com> wrote:
>> Dear Robert,
>>
>> Thank you. Once I have the heights matrix, how can I get a weblogo plot?
>>
>> On Wed, Dec 11, 2013 at 5:44 AM, Robert Stojnic <rainmansr at gmail.com> wrote:
>>> Hi Fabrice,
>>>
>>> Negative values are expected, because the PWM is defined as
>>> log2(probability_base / probability_background). So if any of the
>>> probabilities in your input matrix is smaller than the background (0.25
>>> here), this will give a negative value. Maybe you wanted to get the
>>> heights of the individual letters in the logo?
>>>
>>> # convert the count matrix to probabilities
>>> p = t(t(pfm) / colSums(pfm))
>>>
>>> # this is without the small-sample correction; for more information see
>>> # https://en.wikipedia.org/wiki/Sequence_logo or the original paper
>>> h = - colSums(p * log2(p))
>>> heights = t(t(p) * (2-h))
>>>
>>> All the transpositions (calls to t()) are needed to make sure the matrix and
>>> vector operations are carried out along the correct dimensions.
>>>
>>> Cheers, Robert
>>>
>>>
>>> On 10/12/13 16:41, Fabrice Tourre wrote:
>>>> The PWM always gives me some negative values. This is strange to me.
>>>>
>>>> PFMtoPWM(pfm)
>>>> An object of class 'PWM'
>>>> ID:
>>>> Target name:
>>>> Frequency matrix:
>>>> $pfm
>>>>        [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]
>>>> A 3289112 3286601 3324808 3295086 3297953 3371158 3350851 3380593 3406893
>>>> C 2770671 2771901 2749093 2791942 2791869 2756225 2779305 2780189 2727296
>>>> G 2836085 2853221 2826205 2830413 2845211 2809652 2778112 2742323 2542202
>>>> T 3286234 3270379 3281996 3264661 3247069 3245067 3273834 3278997 3505711
>>>>       [,10]   [,11]   [,12]   [,13]   [,14]   [,15]   [,16]   [,17]   [,18]
>>>> A 3293358 2178674 4352405 3227324 3461232 3446487 3259436 3276060 3271729
>>>> C 2663331 2932504 2139363 3052255 2674074 2550187 3005584 2805702 2844711
>>>> G 2467603 2413635 1644787 2028507 2906055 2827872 2948758 3050624 2971989
>>>> T 3757810 4657289 4045547 3874016 3140741 3357556 2968324 3049716 3093673
>>>>       [,19]   [,20]   [,21]
>>>> A 3297076 3293868 3334010
>>>> C 2853311 2824444 2768038
>>>> G 2891960 3017416 2974502
>>>> T 3139755 3046374 3105552
>>>> Position weight matrix (PWM):
>>>> $pwm
>>>>           [,1]        [,2]       [,3]       [,4]        [,5]        [,6]
>>>> A  0.1110069  0.10990513  0.1265798  0.1136249  0.11487965  0.14655305
>>>> C -0.1364558 -0.13581544 -0.1477355 -0.1254222 -0.12545992 -0.14399751
>>>> G -0.1027904 -0.09409968 -0.1078251 -0.1056786 -0.09815553 -0.11629972
>>>> T  0.1097440  0.10276665  0.1078823  0.1002420  0.09244685  0.09155707
>>>>           [,7]       [,8]       [,9]      [,10]      [,11]      [,12]
>>>> A  0.1378363  0.1505851  0.1617654  0.1128682 -0.4832408  0.5151216
>>>> C -0.1319670 -0.1315082 -0.1592199 -0.1934594 -0.0545581 -0.5095098
>>>> G -0.1325864 -0.1512927 -0.2606125 -0.3035809 -0.3354836 -0.8887903
>>>> T  0.1042900  0.1065634  0.2030159  0.3032009  0.6127992  0.4096436
>>>>            [,13]       [,14]      [,15]       [,16]        [,17]       [,18]
>>>> A  0.083647231  0.18459445  0.1784354  0.09793116  0.105270587  0.10336206
>>>> C  0.003184313 -0.18765178 -0.2560881 -0.01904584 -0.118329389 -0.09840908
>>>> G -0.586272844 -0.06762917 -0.1069744 -0.04658375  0.002413189 -0.03526240
>>>> T  0.347138703  0.04441379  0.1407203 -0.03704261  0.001983716  0.02262953
>>>>           [,19]         [,20]       [,21]
>>>> A  0.11449595  0.1130915480  0.13056724
>>>> C -0.09405417 -0.1087242780 -0.13782742
>>>> G -0.07464358 -0.0133775737 -0.03404303
>>>> T  0.04396080  0.0004018867  0.02815854
>>>> With background nucleotide frequencies which also serve as pseudo-count:
>>>> $prior.params
>>>>      A    C    G    T
>>>> 0.25 0.25 0.25 0.25
>>>>
>>>> On Tue, Dec 10, 2013 at 7:29 AM, Robert Stojnic <rainmansr at gmail.com>
>>>> wrote:
>>>>> Dear Fabrice,
>>>>>
>>>>> You can plot the PFM in PWMEnrich by converting it to PWM:
>>>>>
>>>>> plot(PFMtoPWM(pfm_matrix))
>>>>>
>>>>> Unfortunately the old function was deprecated. Sorry if it broke your
>>>>> code!
>>>>>
>>>>> Cheers, Robert
>>>>>
>>>>>
>>>>> On 07/12/13 04:41, Fabrice Tourre wrote:
>>>>>> Dear expert,
>>>>>>
>>>>>> I just want to plot a PFM using the PWMEnrich package. It seems seqLogo
>>>>>> cannot do this, and I found that plotpfm no longer exists in PWMEnrich
>>>>>> 2.6.2. How can I plot a PFM using the new PWMEnrich?
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>


