[BioC] MEDIPS.createSet error

Vining, Kelly Kelly.Vining at oregonstate.edu
Tue Apr 1 00:05:43 CEST 2014

Thanks, Steve, for your advice - appreciate you working with a package that isn't familiar to you.
Thanks also for catching that I didn't have Rsamtools installed.

I think this output that you suggested I generate gives an idea about what might be going on:

> si.bsg
Seqinfo of length 19
seqnames seqlengths isCircular genome
Chr01      50495391      FALSE    3.0
Chr02      25263035      FALSE    3.0
Chr03      21816808      FALSE    3.0
Chr04      24267051      FALSE    3.0
Chr05      25890704      FALSE    3.0
...             ...        ...    ...
Chr15      15278577      FALSE    3.0
Chr16      14494361      FALSE    3.0
Chr17      16080358      FALSE    3.0
Chr18      16958300      FALSE    3.0
Chr19      15942145      FALSE    3.0
> si.bam
Seqinfo of length 1446
seqnames      seqlengths isCircular genome
Chr01           50495391       <NA>   <NA>
Chr02           25263035       <NA>   <NA>
Chr03           21816808       <NA>   <NA>
Chr04           24267051       <NA>   <NA>
Chr05           25890704       <NA>   <NA>
...                  ...        ...    ...
scaffold_3584       3654       <NA>   <NA>
scaffold_3595       2008       <NA>   <NA>
scaffold_3648       2796       <NA>   <NA>
scaffold_3664       3053       <NA>   <NA>
scaffold_3681       1022       <NA>   <NA>

When I created the reference genome structure, the scaffolds were all entered in a single, multiple seq object, following instructions. But in my bam files, the scaffold hits are as shown above. So that could be one problem, but I don't know how to solve that other than filtering out all of the alignments to the scaffolds from the bam files. I really would like to retain the scaffold alignments and not lose all of that precious data. 

The other potential issue is the "NA" entries in the "isCircular" and "genome" columns. I'm not sure whether those would be a problem.

Continued thanks,

From: Steve Lianoglou [lianoglou.steve at gene.com]
Sent: Monday, March 31, 2014 1:15 PM
To: Vining, Kelly
Cc: Lukas Chavez; bioconductor at r-project.org
Subject: Re: [BioC] MEDIPS.createSet error


Caveat being that I've never used MEDIPS, and I'm just going along with
the vignette.

So, comments inline:

On 31 Mar 2014, at 13:09, Vining, Kelly wrote:

> Hi,
> Thanks for the advice thus far! To confirm what is in my BSgenome
> variable, I did this:
>> class(BSgenome)
> [1] "BSgenome"
> attr(,"package")
> [1] "BSgenome"
>> names(BSgenome)
> [1] "Chr01" "Chr02" "Chr03" "Chr04" "Chr05" "Chr06" "Chr07" "Chr08"
> "Chr09"
> [10] "Chr10" "Chr11" "Chr12" "Chr13" "Chr14" "Chr15" "Chr16" "Chr17"
> "Chr18"
> [19] "Chr19" "scaf"
> And then:
>> print(BSgenome)
> Black cottonwood genome
> |
> | organism: Populus trichocarpa (Black cottonwood)
> | provider: Phytozome (JGI)
> | provider version: 3.0
> | release date: January 2010
> | release name: Populus trichocarpa v3.0
> |
> | single sequences (see '?seqnames'):
> |   Chr01  Chr02  Chr03  Chr04  Chr05  Chr06  Chr07  Chr08  Chr09
> Chr10  Chr11
> |   Chr12  Chr13  Chr14  Chr15  Chr16  Chr17  Chr18  Chr19
> |
> | multiple sequences (see '?mseqnames'):
> |   scaf
> |
> | (use the '$' or '[[' operator to access a given sequence)
> So that looks ok. Interestingly, when I followed the vignette and did
> the equivalent of
> BSgenome="BSgenome.Hsapiens.UCSC.hg19"

The vignette suggests that BSgenome should be the package name of the
BSgenome package to open, so yours should be something like
"BSgenome.Ptrichocarpa.XXX.YY" or something -- I guess you built this
packge by yourself, or something, so you'd know its name ...

> That didn't work for me. It only worked without quotes. If I included
> quotes, it just assigned a character vector to that variable.

Sorry, what exactly didn't work for you? Can you show me the code that

> Then, following your advice:
>> si.bsg <- seqinfo(BSgenome.Ptrichocarpa.Phytozome.v3)
>> si.bam <- seqinfo(BamFile("FallBud_brep1_aln_sorted.bam"))
> Error in seqinfo(BamFile("FallBud_brep1_aln_sorted.bam")) :
> error in evaluating the argument 'x' in selecting a method for
> function 'seqinfo': Error: could not find function "BamFile"

The BamFile function is defined in the Rsamtools package, you need to
load that first (the first line of code I suggested you run was to load
the Rsamtools package). Load it first, then redo the
seqinfo(BamFile(...)) stuff.


More information about the Bioconductor mailing list