[BioC] [devteam-bioc] readGappedAlignments not working in Rsamtools

Valerie Obenchain vobencha at fhcrc.org
Fri Jun 13 17:42:42 CEST 2014


Hi,

readGappedAlignments() has been replaced with readGAlignments() and is 
now in the GenomicAlignments package, not Rsamtools.

It looks like you want to count reads in a bam file against an 
annotation. summarizeOverlaps() can do this for you and offers options 
for counting reads that overlap multiple annotation features. (fyi, 
summarizeOverlaps() calls readGAlignments() or readGAlignmentPairs() 
under the hood.)

library(GenomicAlignments)
?summarizeOverlaps ## see examples on counting Bam files


Create a BamFileList of files:
bfl <- BamFileList(bam_files)

If the files are too large for available memory use a 'yieldSize':
bfl <- BamFileList(bam_files, yieldSize=100000)

We have a pre-built package for the dm3 ensembl genes. See this site for 
a list of available annotation packages:
http://www.bioconductor.org/packages/devel/BiocViews.html#___AnnotationData

Install dm3 ensembl:
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
exbygene <- exonsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, "gene")

Call summarizeOverlaps():
se_single <- summarizeOverlaps(exbygene, bfl)

For paired-end reads use 'singleEnd=FALSE':
se_paired <- summarizeOverlaps(exbygene, bfl, singleEnd=FALSE)


The result is a SummarizedExperiment class with results in
the 'assays' slot.
assays(se_single)


Valerie




On 06/13/2014 01:15 AM, Maintainer wrote:
> Hello everybody,
>
> I'm a PhD student and I am doing my first analysis on RNAseq data.
>
> I was trying to use the script to generate a table of counts from bam files (shown here):
>
> library(Rsamtools)
>
> setwd("C:/R_data")
>
> bam_files <- list.files(pattern="*.bam")
> gr_list <- lapply(bam_files,
>                    function(bam_file)
>                      as(readGappedAlignments(bam_file), "GRanges"))
> names(gr_list) <- bam_files
>
> library(GenomicFeatures)
> txdb=makeTranscriptDbFromUCSC(genome='dm3',tablename='ensGene')
>
> tx_by_gene=transcriptsBy(txdb,'gene')
> ex_by_gene=exonsBy(txdb,'gene')
> toc=data.frame(rep(NA,length(tx_by_gene)))
> for(i in 1:length(bam_files))
> {
>    toc[,i]=countOverlaps(tx_by_gene,bam_files[[i]])
> }
> rownames(toc)=names(tx_by_gene)
> colnames(toc)=names(gr_list)
> dim(toc)
> head(toc)
>
> But it doesn't seem to work, failing in two places:
>
> 1) Error in .class1(object) : could not find function "readGappedAlignments"
>
> 2) Download the ensGene table ... Error in function (type, msg, asError = TRUE)  : couldn't connect to host
>
> For the point 1) I found that the function is deprecated but I still can't make things work changing it for the new function that is supposed to replace it "readGAlignmentsFromBam", Can anybody help me fixing this?
>
> The error in part 2) is completely out of my reach at the moment, I don't know how to fix this issue either, so any help would be appreciated.
>
> Thanks in advance!
>
>
>
>   -- output of sessionInfo():
>
>
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252
> [2] LC_CTYPE=English_United Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] GenomicFeatures_1.16.2 AnnotationDbi_1.26.0   Biobase_2.24.0
>   [4] Rsamtools_1.16.0       Biostrings_2.32.0      XVector_0.4.0
>   [7] GenomicRanges_1.16.3   GenomeInfoDb_1.0.2     IRanges_1.22.8
> [10] BiocGenerics_0.10.0
>
> loaded via a namespace (and not attached):
>   [1] BatchJobs_1.2           BBmisc_1.6              BiocParallel_0.6.1
>   [4] biomaRt_2.20.0          bitops_1.0-6            brew_1.0-6
>   [7] BSgenome_1.32.0         codetools_0.2-8         DBI_0.2-7
> [10] digest_0.6.4            fail_1.2                foreach_1.4.2
> [13] GenomicAlignments_1.0.1 iterators_1.0.7         plyr_1.8.1
> [16] Rcpp_0.11.1             RCurl_1.95-4.1          RSQLite_0.11.4
> [19] rtracklayer_1.24.2      sendmailR_1.1-2         stats4_3.1.0
> [22] stringr_0.6.2           tools_3.1.0             XML_3.98-1.1
> [25] zlibbioc_1.10.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> ________________________________________________________________________
> devteam-bioc mailing list
> To unsubscribe from this mailing list send a blank email to
> devteam-bioc-leave at lists.fhcrc.org
> You can also unsubscribe or change your personal options at
> https://lists.fhcrc.org/mailman/listinfo/devteam-bioc
>


-- 
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: vobencha at fhcrc.org
Phone: (206) 667-3158



More information about the Bioconductor mailing list