[BioC] question about how to understand exon usage coefficient value

Wed Jun 11 09:35:58 CEST 2014

Dear List,

For the record, I am forwarding the conversation below between Jianhong
and me; I forgot to cc the BioC mailing list in my e-mails.

Best regards,
Alejandro

-------- Original Message --------
Subject: 	Re: question about how to understand exon usage coefficient value
Date: 	Fri, 30 May 2014 20:21:18 +0200
From: 	Alejandro Reyes <alejandro.reyes at embl.de>
To: 	Ou, Jianhong <Jianhong.Ou at umassmed.edu>

Hi again Jianhong Ou,

There was a mistake in the code that sometimes caused the labels to
change. It should be fixed in the most recent versions of the package
(either the release 1.10.5 or the devel 1.11.7; they should be
available via biocLite in the next couple of days).
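
As a quick way to check whether an installed copy already carries the fix, the version numbers above can be compared in R. This is only a minimal sketch: `has_label_fix` is a hypothetical helper name, and it assumes that 1.10.x is the release branch and 1.11.x the devel branch, as the version numbers above suggest.

```r
# Hypothetical helper (not part of DEXSeq): TRUE if a version string is at
# least the fixed version named above for its branch.
has_label_fix <- function(v) {
  v <- package_version(v)
  if (v >= "1.11.0") {
    v >= "1.11.7"   # devel branch (and anything later)
  } else {
    v >= "1.10.5"   # release branch
  }
}

has_label_fix("1.10.3")  # version listed in the sessionInfo quoted below
```

With DEXSeq installed, `has_label_fix(as.character(packageVersion("DEXSeq")))` would run the same check on the local installation.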

Thanks again for reporting this!
Best regards,
Alejandro

> Hi Jianhong Ou,
>
> Thanks a lot for sending me your data. There is an error in the DEXSeq
> code that is changing the labels in the plots. I will fix it during the
> weekend and let you know!
>
> Best regards,
> Alejandro
>
>> Hi Alejandro,
>>
>> Thank you for your quick reply. I just renamed FOX2 to KD in my last
>> email to make the sample names clear. I don't think it is a legend
>> mislabeling error, because some of the expression values keep the same
>> direction. Maybe I misunderstand the meaning of the log2 fold change.
>>
>> You can download the dxr object from
>> http://pgfe.umassmed.edu/tmpfiles/DEXSeq/dxr.fox.rds and run code:
>>
>> > load("dxr.fox.rds")
>> > library(DEXSeq)
>> > plotDEXSeq(dxr.fox, "ENSG00000171603", displayTranscripts=TRUE,
>> +            legend=TRUE, cex.axis=1.2, cex=1.3, lwd=1)
>> > plotDEXSeq(dxr.fox, "ENSG00000171603", displayTranscripts=TRUE,
>> +            expression=FALSE, norCounts=TRUE, legend=TRUE,
>> +            cex.axis=1.2, cex=1.3, lwd=1, FDR=0.05)
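
A note on reading the file: the `.rds` extension usually indicates a file written with `saveRDS()`, which `load()` cannot read, whereas a file written with `save()` needs `load()`. How this particular file was written is not stated in the thread, so the following is only a sketch that tries both; `read_r_object` is a hypothetical helper name.

```r
# Hypothetical helper: read a single R object from a file that was written
# with either saveRDS() or save(). The format of dxr.fox.rds is an assumption.
read_r_object <- function(path) {
  tryCatch(
    readRDS(path),                       # works for saveRDS() files
    error = function(e) {
      env <- new.env()
      nms <- load(path, envir = env)     # works for save() files
      get(nms[1], envir = env)           # return the first stored object
    }
  )
}

# dxr.fox <- read_r_object("dxr.fox.rds")
```
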
>> My SessionInfo is
>>
>> R version 3.1.0 (2014-04-10)
>> Platform: x86_64-apple-darwin12.5.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] DEXSeq_1.10.3           BiocParallel_0.6.0      DESeq2_1.4.5            RcppArmadillo_0.4.300.0 Rcpp_0.11.1             GenomicRanges_1.16.3
>>  [7] GenomeInfoDb_1.0.2      IRanges_1.22.6          Biobase_2.24.0          BiocGenerics_0.10.0
>>
>> loaded via a namespace (and not attached):
>>  [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2        BBmisc_1.6           biomaRt_2.20.0       Biostrings_2.32.0    bitops_1.0-6
>>  [8] brew_1.0-6           codetools_0.2-8      DBI_0.2-7            digest_0.6.4         fail_1.2             foreach_1.4.2        genefilter_1.46.1
>> [15] geneplotter_1.42.0   grid_3.1.0           hwriter_1.3          iterators_1.0.7      lattice_0.20-29      locfit_1.5-9.1       plyr_1.8.1
>> [22] RColorBrewer_1.0-5   RCurl_1.95-4.1       Rsamtools_1.16.0     RSQLite_0.11.4       sendmailR_1.1-2      splines_3.1.0        statmod_1.4.19
>> [29] stats4_3.1.0         stringr_0.6.2        survival_2.37-7      tools_3.1.0          XML_3.98-1.1         xtable_1.7-3         XVector_0.4.0
>> [36] zlibbioc_1.10.0
>>
>> Thanks a lot for your help. I deeply appreciate it and look forward to
>> your reply.
>>
>>
>> Yours Sincerely,
>>
>> Jianhong Ou
>>
>> LRB 670A
>> Program in Gene Function and Expression
>> 364 Plantation Street Worcester,
>> MA 01605
>>
>>
>>
>>
>> On 5/30/14 3:43 AM, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:
>>
>>> Hi Jianhong Ou,
>>>
>>> Thanks for your interest in DEXSeq!
>>> It seems to be an error of legend mislabeling in the plots that I
>>> thought I had fixed before... I don't know if it's from the code or
>>> from your side.
>>> The plots that you attached don't seem to correspond to the code that
>>> you sent, since the plots have the FOX2 label instead of the KD
>>> label. Could you update the plots and verify whether this is the
>>> problem?
>>> If not, could you send me your dxr object and a reproducible example
>>> of how you are generating those plots, so I can have a closer look at
>>> what is happening?
>>>
>>> Best regards,
>>> Alejandro
>>>
>>>
>>>
>>>> Hi Alejandro,
>>>>
>>>> I am using DEXSeq to analyze alternative splicing events in my
>>>> knockdown samples. DEXSeq is very easy to use. However, I have some
>>>> trouble understanding the results. The question is why, for some
>>>> exons, the log2 fold change of KD vs WT is opposite to the raw
>>>> counts. For example, E012 in the following sample (please find
>>>> attached the figures of expression and normalized counts):
>>>>
>>>> ENSG00000171603:E012
>>>>   groupID               ENSG00000171603
>>>>   featureID             E012
>>>>   exonBaseMean          255.50
>>>>   dispersion            0.0009438170
>>>>   stat                  2.347679e+02
>>>>   pvalue                5.439942e-53
>>>>   padj                  7.353639e-50
>>>>   KD                    36.9088539
>>>>   NS                    20.20013887
>>>>   log2fold_KD_NS        0.869601727
>>>>   genomicData.seqnames  chr1
>>>>   genomicData.start     9797556
>>>>   genomicData.end       9797612
>>>>   genomicData.width     57
>>>>   genomicData.strand    -
>>>>   countData.NSrep1      377
>>>>   countData.NSrep2      372
>>>>   countData.KDrep1      147
>>>>   countData.KDrep2      126
>>>>   transcripts           ENST0000....
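
To make the discrepancy in the E012 row above concrete, the reported `log2fold_KD_NS` can be reproduced from the `KD` and `NS` columns, while the raw counts point the other way. This is only arithmetic on the numbers shown, with variable names chosen for illustration. Note that the raw counts are not normalized for sequencing depth, so on their own they are not conclusive about the direction of the change.

```r
# Fitted values copied from the KD and NS columns of the E012 row above.
kd_fit <- 36.9088539
ns_fit <- 20.20013887
fit_lfc <- log2(kd_fit / ns_fit)   # approximately the reported 0.8696

# Raw counts per replicate (not normalized for sequencing depth).
ns_counts <- c(377, 372)
kd_counts <- c(147, 126)
raw_lfc <- log2(mean(kd_counts) / mean(ns_counts))   # negative

c(fitted = fit_lfc, raw = raw_lfc)
```
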
>>>>
>>>>
>>>> Thank you for your help.
>>>>
>>>> The code I used is:
>>>> > countFiles
>>>> [1] "NS-1.nodenovo.counts" "NS-2.nodenovo.counts" "KD-1.nodenovo.counts" "KD-2.nodenovo.counts"
>>>> > sampleTable <- data.frame(row.names=c("NSrep1", "NSrep2", "KDrep1", "KDrep2"),
>>>> +                           condition=c("NS", "NS", "KD", "KD"))
>>>> > sampleTable
>>>>        condition
>>>> NSrep1        NS
>>>> NSrep2        NS
>>>> KDrep1        KD
>>>> KDrep2        KD
>>>> > dxd <- DEXSeqDataSetFromHTSeq(countFiles, sampleData=sampleTable,
>>>> +                               design=~sample+exon+condition:exon,
>>>> +                               flattenedfile=gffFile)
>>>> > dxr <- DEXSeq(dxd)
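
One detail worth checking in a setup like this is the order of the condition levels: R sorts factor levels alphabetically by default, which puts "KD" before "NS". Whether this had anything to do with the label problem in this thread is not stated; the sketch below only shows how to set the order explicitly, using the same sample names as above.

```r
# Same sample table as above, but with the control condition ("NS")
# declared explicitly as the first factor level.
sampleTable <- data.frame(
  row.names = c("NSrep1", "NSrep2", "KDrep1", "KDrep2"),
  condition = factor(c("NS", "NS", "KD", "KD"), levels = c("NS", "KD"))
)

levels(sampleTable$condition)                 # "NS" "KD"
levels(factor(c("NS", "NS", "KD", "KD")))     # default order: "KD" "NS"
```
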
>>>> The sessionInfo is,
>>>> R version 3.1.0 (2014-04-10)
>>>> Platform: x86_64-apple-darwin12.5.0 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> other attached packages:
>>>>  [1] biomaRt_2.20.0          Vennerable_3.0          xtable_1.7-3            gtools_3.4.0            reshape_0.8.5           RColorBrewer_1.0-5
>>>>  [7] lattice_0.20-29         RBGL_1.40.0             graph_1.42.0            DEXSeq_1.10.3           BiocParallel_0.6.0      DESeq2_1.4.5
>>>> [13] RcppArmadillo_0.4.300.0 Rcpp_0.11.1             GenomicRanges_1.16.3    GenomeInfoDb_1.0.2      IRanges_1.22.6          Biobase_2.24.0
>>>> [19] BiocGenerics_0.10.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>>  [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2        BBmisc_1.6           Biostrings_2.32.0    bitops_1.0-6         brew_1.0-6
>>>>  [8] codetools_0.2-8      DBI_0.2-7            digest_0.6.4         fail_1.2             foreach_1.4.2        genefilter_1.46.1    geneplotter_1.42.0
>>>> [15] hwriter_1.3          iterators_1.0.7      locfit_1.5-9.1       plyr_1.8.1           RCurl_1.95-4.1       Rsamtools_1.16.0     RSQLite_0.11.4
>>>> [22] sendmailR_1.1-2      splines_3.1.0        statmod_1.4.19       stats4_3.1.0         stringr_0.6.2        survival_2.37-7      tools_3.1.0
>>>> [29] XML_3.98-1.1         XVector_0.4.0        zlibbioc_1.10.0
>>>>
>>>>
>>>> Yours Sincerely,
>>>>
>>>> Jianhong Ou
>>>>
>>>> LRB 670A
>>>> Program in Gene Function and Expression
>>>> 364 Plantation Street Worcester,
>>>> MA 01605
>