[BioC] BAM files to Genomic Ranges object

jluis.lavin at unavarra.es jluis.lavin at unavarra.es
Wed Jan 9 15:08:15 CET 2013


Hello everybody,

I've been told about a very nice R package called Repitools. It has many
interesting features, but the use of some of them is still a little unclear
to me.
I'll have other questions about this tool for this list, but first of all I
need to convert some BAM files into GRanges objects.
To do so, I read about the BAM2GenomicRanges function. Following the package
documentation, I tried to convert 3 BAM files into GRanges objects with the
following call:

(where "/path/to/ANALYSIS/folder/" is the directory containing my BAM files)

bgr <- BAM2GRanges('/path/to/ANALYSIS/folder/',
what = c("rname", "strand", "pos", "qwidth"),
flag = scanBamFlag(isUnmappedQuery = FALSE, isDuplicate = FALSE),
verbose = TRUE)

And I get this error message:

Reading BAM file AW-07.
[bam_header_read] bgzf_check_EOF: Invalid argument
[bam_header_read] invalid BAM binary header (this is not a BAM file).
Error in open.BamFile(BamFile(file, index), "rb") :
  SAM/BAM header missing or empty
  file: '/path/to/ANALYSIS/folder/'

The BAM files I'm using here were produced by a TopHat alignment and were
previously used in R for other analyses with this code:

library(Rsamtools)
library(GenomicFeatures)

bamlist=list()
# pattern is a regular expression, not a glob; this also skips .bam.bai index files
src_files=list.files(pattern="\\.bam$")

for(filename in src_files)
{
	tmp=readBamGappedAlignments(filename)
	bamlist[[length(bamlist)+1]]=GRanges(seqnames=rname(tmp),
	ranges=IRanges(start=start(tmp),end=end(tmp)),
	strand=rep("*",length(tmp)))
}


names(bamlist)=src_files

My question is: why are these files now considered invalid BAM files when
they previously worked correctly? Am I misunderstanding the error message?
Did I miss something in the code?
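
In case it helps to narrow this down: the "file:" line in the error shows the
folder itself rather than a .bam file, so one guess is that the function wants
the path of each BAM file instead of the directory. Below is a minimal sketch
of how those paths could be built with base R. The folder and the sample file
names are placeholders created in a temporary directory, and the commented-out
BAM2GRanges call is only my assumption about the intended usage, which I have
not verified.

```r
# Placeholder folder with empty stand-in files (not real BAM files),
# created only so that this sketch runs on its own.
bam_dir <- file.path(tempdir(), "ANALYSIS_example")
dir.create(bam_dir, showWarnings = FALSE)
file.create(file.path(bam_dir,
                      c("sample1.bam", "sample2.bam", "sample1.bam.bai")))

# Full path of every file whose name ends in ".bam". The pattern is a
# regular expression, so index files such as "sample1.bam.bai" are left out.
bam_paths <- list.files(bam_dir, pattern = "\\.bam$", full.names = TRUE)
print(basename(bam_paths))

# Assumption (untested): pass one file path per call instead of the folder.
# library(Repitools)
# bgr_list <- lapply(bam_paths, BAM2GRanges)
```

If that guess is right, the result would be one GRanges object per BAM file,
similar to the bamlist I built in my earlier code.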

Thanks in advance for your kind help.

JL


