[BioC] question about how to understand exon usage coefficient value

Ou, Jianhong Jianhong.Ou at umassmed.edu
Thu May 29 21:27:19 CEST 2014

Hi Alejandro,

I am using DEXSeq to analyze alternative splicing events in my knockdown samples. DEXSeq is very easy to use, but I have some trouble understanding the results. Why is the log2 fold change of KD vs. control (NS) for some exons opposite in direction to the raw counts? For example, see exon E012 in the following output (figures of the expression and normalized counts are attached):

                             groupID featureID exonBaseMean   dispersion         stat       pvalue         padj        KD           NS log2fold_KD_NS
ENSG00000171603:E012 ENSG00000171603      E012       255.50 0.0009438170 2.347679e+02 5.439942e-53 7.353639e-50  36.9088539  20.20013887      0.869601727
                     genomicData.seqnames genomicData.start genomicData.end genomicData.width genomicData.strand countData.NSrep1 countData.NSrep2
ENSG00000171603:E012                 chr1           9797556         9797612                57                  -              377              372
                     countData.KDrep1 countData.KDrep2  transcripts
ENSG00000171603:E012                147                126 ENST0000....
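
To make the discrepancy concrete, here is a minimal sketch that uses only the numbers in the output above. It assumes a naive fold change can be taken from the mean raw counts without any size-factor normalization, and the variable names are mine, not DEXSeq's. The raw counts give a negative log2 fold change, while the reported log2fold_KD_NS agrees with the log2 ratio of the KD and NS columns, which is positive:

```r
# Raw counts for ENSG00000171603:E012, copied from the output above
ns_counts <- c(NSrep1 = 377, NSrep2 = 372)
kd_counts <- c(KDrep1 = 147, KDrep2 = 126)

# Naive log2 fold change from the mean raw counts
# (assumption: no size-factor normalization, no adjustment for
# the overall expression of the gene)
raw_log2fc <- log2(mean(kd_counts) / mean(ns_counts))

# Values reported in the KD, NS and log2fold_KD_NS columns
fitted_kd <- 36.9088539
fitted_ns <- 20.20013887
reported_log2fc <- 0.869601727

# log2 ratio of the KD and NS columns
fitted_log2fc <- log2(fitted_kd / fitted_ns)

print(c(raw = raw_log2fc, fitted = fitted_log2fc, reported = reported_log2fc))
```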

Thank you for your help.

The code I used is:
> countFiles
[1] "NS-1.nodenovo.counts"   "NS-2.nodenovo.counts"   "KD-1.nodenovo.counts" "KD-2.nodenovo.counts"
> sampleTable <- data.frame(row.names=c("NSrep1", "NSrep2", "KDrep1", "KDrep2"), condition=c("NS", "NS", "KD", "KD"))
> sampleTable
         condition
NSrep1          NS
NSrep2          NS
KDrep1          KD
KDrep2          KD
> dxd <- DEXSeqDataSetFromHTSeq(countFiles, sampleData=sampleTable, design=~sample+exon+condition:exon, flattenedfile=gffFile)
> dxr <- DEXSeq(dxd)

The sessionInfo is:
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin12.5.0 (64-bit)

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] biomaRt_2.20.0          Vennerable_3.0          xtable_1.7-3            gtools_3.4.0            reshape_0.8.5           RColorBrewer_1.0-5
 [7] lattice_0.20-29         RBGL_1.40.0             graph_1.42.0            DEXSeq_1.10.3           BiocParallel_0.6.0      DESeq2_1.4.5
[13] RcppArmadillo_0.4.300.0 Rcpp_0.11.1             GenomicRanges_1.16.3    GenomeInfoDb_1.0.2      IRanges_1.22.6          Biobase_2.24.0
[19] BiocGenerics_0.10.0

loaded via a namespace (and not attached):
 [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2        BBmisc_1.6           Biostrings_2.32.0    bitops_1.0-6         brew_1.0-6
 [8] codetools_0.2-8      DBI_0.2-7            digest_0.6.4         fail_1.2             foreach_1.4.2        genefilter_1.46.1    geneplotter_1.42.0
[15] hwriter_1.3          iterators_1.0.7      locfit_1.5-9.1       plyr_1.8.1           RCurl_1.95-4.1       Rsamtools_1.16.0     RSQLite_0.11.4
[22] sendmailR_1.1-2      splines_3.1.0        statmod_1.4.19       stats4_3.1.0         stringr_0.6.2        survival_2.37-7      tools_3.1.0
[29] XML_3.98-1.1         XVector_0.4.0        zlibbioc_1.10.0

Yours Sincerely,

Jianhong Ou

LRB 670A
Program in Gene Function and Expression
364 Plantation Street Worcester,
MA 01605
