[BioC] QuasR: how to use an indexed reference genome?

Paul Shannon paul.thurmond.shannon at gmail.com
Fri May 17 16:51:17 CEST 2013


Thanks, Michael.

May I cast my one vote, as a new user of QuasR, for an extra measure of transparency?  I have lost a few days to my confusion.

My reasoning -- which might be idiosyncratic, and which you may find unconvincing -- is that if qAlign, in some invocations (but not all), needs to spend a few hours creating an index, and writes it to a place which is not entirely obvious (lib.loc), then I as a user am mightily confused.  As I have been! :}

I would have been much better off if qAlign had told me:

  1) you specified a genome package by name for your reference
  2) in order to use that, qAlign needs for it to be indexed
  3) we cannot find an indexed version of that genome in your R library or any lib.loc directory
  4) please specify an alternative lib.loc, an explicit path, or…
  5) create a new index for your genome using this command:
        buildIndexAsPackage(genomePackageName, destinationDir)  # or some such method call
  6) you can then specify that newly-built index package in subsequent calls to qAlign this way:
        qAlign(samplesFile, genome="~/path/to/genome-index-package.tar.gz")
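
The check behind step 3 could be sketched roughly as below, using only base R. This is a guess at the logic, not QuasR's actual code: indexPackageStatus is a hypothetical helper, and the naming convention (genome package name plus an aligner suffix such as ".Rbowtie") is an assumption that should be confirmed against the qAlign documentation.

```r
# Hypothetical helper, NOT part of QuasR: report whether an index package
# for a BSgenome reference is visible to R before a long qAlign() run.
# ASSUMPTION: the index package is named "<genome>.<aligner>", for example
# "BSgenome.Hsapiens.UCSC.hg19.Rbowtie".
indexPackageStatus <- function(genomeName, aligner = "Rbowtie", lib.loc = NULL) {
    indexName <- paste(genomeName, aligner, sep = ".")

    # With quiet = TRUE, find.package() returns character(0) if nothing is found.
    hit <- find.package(indexName, lib.loc = lib.loc, quiet = TRUE)

    # Directories that were searched: the given lib.loc, or R's default libraries.
    searched <- if (is.null(lib.loc)) .libPaths() else lib.loc

    if (length(hit) > 0) {
        msg <- sprintf("Found index package '%s' in %s",
                       indexName, dirname(hit[1]))
    } else {
        msg <- sprintf("No index package '%s' in: %s\n%s",
                       indexName, paste(searched, collapse = ", "),
                       "qAlign() would have to build the index first.")
    }
    list(indexName = indexName, found = length(hit) > 0, message = msg)
}

# Example: look for an hg19 index in the default R libraries.
status <- indexPackageStatus("BSgenome.Hsapiens.UCSC.hg19")
message(status$message)
```

Running something like this before qAlign would at least tell a user whether the next call is going to take seconds or hours, and which directories were searched.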

 - Paul

On May 17, 2013, at 6:59 AM, Michael Stadler wrote:

> Hi Paul,
> 
> Please find my answers below:
> 
> On 17.05.2013 15:01, Paul Shannon wrote:
>> Hi Michael,
>> 
>> Thanks for your quick and clarifying response. 
>> 
>> Since it is not possible to use pre-built indices from the bowtie developers, I would be glad to have a small recipe (perhaps featured prominently in the vignette?) which
>> 
>>  1) explains the need for custom-built indices
>>  2) provides (perhaps) a standalone QuasR-specific command for creating one
> It is actually much simpler than you expect: qAlign() creates the index
> automatically if it does not yet exist. The index is then saved in a
> default location (as a new R package if your reference is a BSgenome, or
> else in the same directory containing the fasta reference), and will be
> automatically re-used when qAlign is called with the same reference.
> 
>> My somewhat fuzzy grasp of the current approach is that 
>> 
>>  1) QuasR sees the string "BSgenome.Hsapiens.UCSC.hg19" on a call to qAlign
>>  2) QuasR then spends a few hours building a new package with the proper index
> Yes, this is described in section 5.3 of the vignette.
> 
>>  3) and saves this package somewhere (I could not figure out where)
> This is described in the documentation to qAlign. I agree that it would
> be better to have this all described more explicitly in a single place,
> so I added a description to the qAlign documentation (available shortly
> in the development branch).
> 
> I hope this makes it all clear.
> 
> Best,
> Michael
> 


