[BioC] BAM files to Genomic Ranges object

jluis.lavin at unavarra.es
Tue Jan 15 11:44:54 CET 2013


First of all I want to thank Herve and Steve for their attention to my
questions this time.

@Steve,

Thanks for the links to the Bioconductor manuals. I'll try to read and
understand them as quickly and accurately as I can, so that I don't
ask very elementary questions on this list  ;)

@Herve,

Thank you once again for your answer. Once my files were converted into a
GRangesList object, as you said, everything worked fine.
I also closed my previous R session and started a new one, which may have
helped if the old session was somehow corrupted and that is what produced
the earlier nonsensical error.
Here's the final code that works on my computer, in case it's of any help
to future users  ;)

library(Mus.musculus)
library(Repitools)
library(GenomicRanges)
library(Rsamtools)

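# 'pattern' is a regular expression, not a shell glob; anchoring it
# keeps index files such as *.bam.bai out of the list
bam_files <- list.files(pattern="\\.bam$")
gr_list <- lapply(bam_files,
                  function(bam_file)
                    as(readGappedAlignments(bam_file), "GRanges"))
names(gr_list) <- bam_files
# cpgDensityCalc() has no method for an ordinary list, so convert it
gr_list <- GRangesList(gr_list)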

require(BSgenome.Mmusculus.UCSC.mm9)

cpdens <- cpgDensityCalc(gr_list, organism=Mmusculus, window=600)
cpgplot <- cpgDensityPlot(gr_list, seq.len=300, organism=Mmusculus, lwd=4,
                          verbose=TRUE)

sessionInfo()

R version 2.15.1 (2012-06-22)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] BSgenome.Mmusculus.UCSC.mm9_1.3.19
 [2] BSgenome_1.26.1
 [3] Rsamtools_1.10.2
 [4] Biostrings_2.26.2
 [5] Repitools_1.4.0
 [6] Mus.musculus_1.0.0
 [7] TxDb.Mmusculus.UCSC.mm10.ensGene_2.8.0
 [8] org.Mm.eg.db_2.8.0
 [9] GO.db_2.8.0
[10] RSQLite_0.11.2
[11] DBI_0.2-5
[12] OrganismDbi_1.0.2
[13] GenomicFeatures_1.10.1
[14] GenomicRanges_1.10.5
[15] IRanges_1.16.4
[16] AnnotationDbi_1.20.3
[17] Biobase_2.18.0
[18] BiocGenerics_0.4.0
[19] BiocInstaller_1.8.3

loaded via a namespace (and not attached):
 [1] biomaRt_2.14.0     bitops_1.0-5       edgeR_3.0.8        graph_1.36.1
 [5] limma_3.14.3       parallel_2.15.1    RBGL_1.34.0        RCurl_1.95-3
 [9] rtracklayer_1.18.2 stats4_2.15.1      tools_2.15.1       XML_3.95-0.1
[13] zlibbioc_1.4.0


With best wishes

JL


On Mon, January 14, 2013, 20:47, Hervé Pagès wrote:
> Hi JL,
>
> On 01/14/2013 08:27 AM, Steve Lianoglou wrote:
>> Hi,
>>
>> On Mon, Jan 14, 2013 at 5:20 AM,  <jluis.lavin at unavarra.es> wrote:
>>> First of all I want to thank Herve Pages and Martin Morgan for their kind
>>> answers. And I want to apologize for this delay answering back, but I was
>>> offline for a few days...
>>>
>>> I used Herve's code to read my BAM files into GRanges objects, but I don't
>>> seem to be able to pass those GRanges objects to the cpgDensityCalc
>>> function.
>>> This is how I try to do it:
>>>
>>> -(My files are in the working directory already)
>>>
>>> library(Mus.musculus)
>>> library(Repitools)
>>> library(GenomicRanges)
>>> library(Rsamtools)
>>>
>>>   bam_files <- list.files(pattern="*.bam")
>>>     gr_list <- lapply(bam_files,
>>>                       function(bam_file)
>>>                         as(readGappedAlignments(bam_file), "GRanges"))
>>>     names(gr_list) <- bam_files
>>>
>>> require(BSgenome.Mmusculus.UCSC.mm9)
>>>
>>> cpdens <- cpgDensityCalc(gr_list, organism=Mmusculus, window =600)
>>>
>>> -And I get the following error:
>>>
>>> Error in function (classes, fdef, mtable)  :
>>> unable to find an inherited method for function "cpgDensityCalc", for
>>> signature "character", "BSgenome"
>>>
>>> What does this error mean?
>>
>> It means that your `gr_list` isn't what you think it is -- it's a
>> character, perhaps "try-error"?
>
> When running your code, I get:
>
>    > cpdens <- cpgDensityCalc(gr_list, organism=Mmusculus, window =600)
>    Error in function (classes, fdef, mtable)  :
>      unable to find an inherited method for function ‘cpgDensityCalc’
>      for signature ‘"list", "BSgenome"’
>
> Not exactly the same error you get: see "list" in the error message
> I get, versus "character" in the error message you get? Not sure why
> your 'gr_list' ends up being a character vector instead of a list.
>
> Anyway, an ordinary list doesn't work either. The call to
> cpgDensityCalc() fails because it is a generic function and no
> "cpgDensityCalc" method accepts a list as its first argument:
>
>    > cpgDensityCalc
>    standardGeneric for "cpgDensityCalc" defined from package "Repitools"
>
>    function (x, organism, ...)
>    standardGeneric("cpgDensityCalc")
>    <environment: 0x717c2a0>
>    Methods may be defined for arguments: x, organism
>    Use  showMethods("cpgDensityCalc")  for currently available ones.
>
>    > showMethods("cpgDensityCalc")
>    Function: cpgDensityCalc (package Repitools)
>    x="data.frame", organism="BSgenome"
>    x="GRanges", organism="BSgenome"
>    x="GRangesList", organism="BSgenome"
>
> Note that the type of input expected by the function is explained in
> its man page (?cpgDensityCalc):
>
>         x: A ‘data.frame’, with columns ‘chr’ and ‘position’, or columns
>            ‘chr’, ‘start’, ‘end’, and ‘strand’.  Also may be a
>            ‘GRangesList’ object, or ‘GRanges’.
>
> So once you've sorted out the reasons why you get a character instead of
> a list of GRanges (I've no clue how this could happen), you'll need to
> turn that ordinary list into a GRangesList object with:
>
>    gr_list <- GRangesList(gr_list)
>
> before calling cpgDensityCalc().
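
For anyone who finds this thread later, the dispatch failure described
above can be reproduced without Repitools. This is only a minimal base-R
sketch: the generic 'densityCalc' and the class 'RangeSet' are
hypothetical stand-ins for cpgDensityCalc and GRangesList, not the real
implementations.

```r
library(methods)

# Hypothetical stand-in for GRangesList: an S4 class wrapping a list.
setClass("RangeSet", representation(items = "list"))

# Hypothetical stand-in for cpgDensityCalc: an S4 generic that has
# methods for only some classes of its first argument.
setGeneric("densityCalc", function(x, ...) standardGeneric("densityCalc"))
setMethod("densityCalc", "data.frame", function(x, ...) nrow(x))
setMethod("densityCalc", "RangeSet", function(x, ...) length(x@items))

plain_list <- list(a = 1:3, b = 4:6)

# No method is defined for an ordinary list, so dispatch fails;
# the error message is captured instead of stopping the script.
res <- tryCatch(densityCalc(plain_list),
                error = function(e) conditionMessage(e))

# Wrapping the list in a class the generic knows about lets dispatch
# succeed, which mirrors gr_list <- GRangesList(gr_list).
wrapped <- new("RangeSet", items = plain_list)
n <- densityCalc(wrapped)

print(res)
print(n)
```

Checking class(gr_list) before the call is a quick way to see which
method, if any, the generic will pick.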
>
> Hope this helps,
> H.
>
>>
>> Look at `head(gr_list)` ... what do you see?
>>
>>> *I'm awfully sorry for my lack of intermediate-to-advanced R scripting
>>> knowledge, I'm trying to fix that...
>>
>> Reading through an Introduction to R might be helpful, if you haven't
>> done so already.
>>
>> Bioconductor uses the S4 class system in R, and it would be helpful to
>> understand some parts of that too. If you look at
>> previous bioc courses online, you will often find that Martin does a
>> quick intro to these things at the beginning of many sessions:
>>
>> http://bioconductor.org/help/course-materials/
>>
>> Look for courses that were run at The Hutch (in Seattle)
>>
>> Perhaps the intro slides here will bear fruit:
>>
>> http://bioconductor.org/help/course-materials/2010/SeattleIntro/
>>
>> -steve
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


