[BioC] EdgeR estimateTagwiseDisp()

James W. MacDonald jmacdon at uw.edu
Fri Dec 14 16:12:03 CET 2012


Hi Jetse,

On 12/14/2012 8:21 AM, Jetse [guest] wrote:
> I want to use edgeR to detect differential expression. For this I first read the bam file with this function:
>
> getCounts<- function(alignmentName, tx){
>    fileName<- paste("/data/WntData/tophat/",alignmentName,".sorted.bam", sep="")
>    alignment<- readBamGappedAlignments(fileName)
>    newReadNames<- gsub("([0-9(MT|X|Y)])","chr\\1",rname(alignment))
>    alignment<- GRanges(seqnames=newReadNames,ranges=IRanges(start=start(alignment),end=end(alignment)), strand=strand(alignment))
>    alignmentCounts<- suppressWarnings(countOverlaps(tx,alignment))
> }

You might want to re-think this function, as I don't think 
countOverlaps() is really what you want here. For more background, see

http://www-huber.embl.de/users/anders/HTSeq/doc/count.html

and note that countOverlaps() ignores these issues, whereas 
summarizeOverlaps() does not. I think you would be better off doing 
something like

library(GenomicFeatures)
library(Rsamtools)

fls <- paste("/data/WntData/tophat/", dir("/data/WntData/tophat/", 
"sorted.bam$"), sep = "")
bfl <- BamFileList(fls)
olaps <- summarizeOverlaps(tx, bfl) ## here I assume your tx object is a 
Transcript.db

Then you can go on from here using

y <- DGEList(counts = assays(olaps)$counts, groups = groups)

And I would also recommend

library(edgeR)
edgeRUsersGuide()

as a more reliable source than some possibly dated tutorial that you 
found on the web.

Best,

Jim


>
> Then I create a table of raw counts by using this command:
> rawCountTable<- data.frame(polyPlus=polyPlusCounts, polyMin=polyMinCounts)
>
> Then I follow the tutorial from: http://cgrlucb.wikispaces.com/edgeR+Tutorial
> So to build the edgeR object, I have this code:
> y<- DGEList(counts=rawCountTable, group=groups)
> y<- calcNormFactors(y)
> y<- estimateCommonDisp(y)
> y<- estimateTagwiseDisp(y)
>
> When executing this last function, I get this error:
> Error in t.default(object$counts) : argument is not a matrix
>
> When I use check the object$counts with class(y$counts), this is a matrix!
> What am I doing wrong now?
>
> On google I only found people with old versions, who didn't use the estimateCommonDisp function...
>
> I hope someone can help me with this question.
>
>
>   -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-suse-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] edgeR_3.0.6            limma_3.14.3           VennDiagram_1.5.1
>   [4] RMySQL_0.9-3           Rsamtools_1.10.2       Biostrings_2.26.2
>   [7] GenomicFeatures_1.10.1 AnnotationDbi_1.20.3   pasilla_0.2.14
> [10] DESeq_1.10.1           lattice_0.20-6         locfit_1.5-8
> [13] DEXSeq_1.4.0           Biobase_2.18.0         BiocInstaller_1.8.3
> [16] cummeRbund_2.0.0       Gviz_1.2.1             rtracklayer_1.18.1
> [19] GenomicRanges_1.10.5   IRanges_1.16.4         fastcluster_1.1.7
> [22] reshape2_1.2.2         ggplot2_0.9.3          RSQLite_0.11.2
> [25] DBI_0.2-5              BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
>   [1] annotate_1.36.0    biomaRt_2.14.0     biovizBase_1.6.0   bitops_1.0-5
>   [5] BSgenome_1.26.1    cluster_1.14.2     colorspace_1.2-0   dichromat_1.2-4
>   [9] digest_0.6.0       genefilter_1.40.0  geneplotter_1.36.0 gtable_0.1.2
> [13] Hmisc_3.10-1       hwriter_1.3        labeling_0.1       MASS_7.3-18
> [17] memoise_0.1        munsell_0.4        parallel_2.15.1    plyr_1.8
> [21] proto_0.3-9.2      RColorBrewer_1.0-5 RCurl_1.95-3       scales_0.2.3
> [25] splines_2.15.1     statmod_1.4.16     stats4_2.15.1      stringr_0.6.2
> [29] survival_2.36-14   tcltk_2.15.1       tools_2.15.1       XML_3.95-0.1
> [33] xtable_1.7-0       zlibbioc_1.4.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list