[Rd] Creating a vignette which depends on a non-distributable file

January Weiner january.weiner at gmail.com
Sat May 16 09:39:48 CEST 2015

Dear Martin,

thank you for the food for thought. My package does not depend on MSigDB
(it implements something better than MSigDB), but being able to work with
MSigDB (for comparative purposes) is important. Also, Bioconductor makes
sense only if you really want to take advantage of the Bioconductor
structures / tools, which I don't.

However, I find your suggestion with eval=FALSE and data subsets very good,
I will implement it, using hidden sections to simulate the output, thanks!

Kind regards,


On 15 May 2015 at 01:50, Martin Morgan <mtmorgan at fredhutch.org> wrote:

> On 05/14/2015 04:33 PM, Henrik Bengtsson wrote:
>> On May 14, 2015 15:04, "January Weiner" <january.weiner at gmail.com> wrote:
>>> Dear all,
>>> I am writing a vignette that requires a file which I am not allowed to
>>> distribute, but which the user can easily download manually. Moreover, it
>>> is not possible to download this file automatically from R: downloading
>>> requires a (free) registration that seems to work only through a browser.
>>> (I'm talking here about the MSigDB from the Broad Institute,
>>> http://www.broadinstitute.org/gsea/msigdb/index.jsp).
>>> In the vignette, I tell the user to download the file and then show how it
>>> can be parsed and used in R. Thus, I can compile the vignette only if this
>>> file is present in the vignettes/ directory of the package. However, it
>>> would then get included in the package -- which I am not allowed to do.
>>> What should I do?
>>> (1) finding an alternative to MSigDB is not a solution -- there simply is
>>> no alternative.
>>> (2) I could enter the code (and the results) in a verbatim environment
>>> instead of using Sweave. This has obvious drawbacks (for one thing, it
>>> would look inconsistent).
> Use the chunk option eval=FALSE instead of placing the code in a
> verbatim environment. See ?RweaveLatex if you're compiling a PDF vignette
> from Rnw, or the knitr documentation for Rmd vignettes processed to HTML
> (much nicer for users of your vignette, in my opinion).
> A common pattern is to process chunks 1, 2, 3, 4, and then there is a
> 'leap of faith' in chunk 5 (with eval=FALSE) and a second chunk (maybe with
> echo=FALSE, eval=TRUE) that reads the _result_ that would have been
> produced by chunk 5 from a serialized instance into the R session for
> processing in chunks 6, 7, 8...
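A minimal sketch of the 'leap of faith' pattern described above, not code from the thread. The parser name parse_msigdb(), the file names, and the simulated gene sets are all hypothetical; it also assumes that the serialized stand-in object is something the author is allowed to ship with the package.

```r
# One-off step, run by the package author. A small simulated list of gene
# sets stands in for the result that parsing the real file would produce.
set.seed(1)
msig_small <- list(
  SET_A = sample(paste0("GENE", 1:100), 10),
  SET_B = sample(paste0("GENE", 1:100), 15)
)
rds_path <- file.path(tempdir(), "msig_small.rds")  # hypothetical file name
saveRDS(msig_small, rds_path)

# Chunk 5 in the vignette (shown to the reader, never run):
#   <<parse-msigdb, eval=FALSE>>=
#   msig <- parse_msigdb("msigdb.xml")  # hypothetical function and file
#   @
#
# Hidden chunk right after it (run, but not shown):
#   <<load-precomputed, echo=FALSE>>=
#   msig <- readRDS("msig_small.rds")
#   @
msig <- readRDS(rds_path)

# Chunks 6, 7, 8... continue with `msig` as if chunk 5 had been evaluated.
set_sizes <- sapply(msig, length)
print(set_sizes)
```

In a real package the .rds file would sit next to the vignette source rather than in tempdir(); tempdir() is used here only so the sketch runs anywhere.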
> Also very often while it might make sense to analyse an entire data set as
> part of a typical work flow, for illustrative purposes a much smaller
> subset or simulated data might be relevant; again a strategy would be to
> illustrate the problematic steps with simulated data, and then resume the
> narrative with the analyzed full data.
> A secondary consideration may be that if your package _requires_ MSigDB to
> function, then it can't be automatically tested by repository build
> machines -- you'll want to have unit tests or other approaches to ensure
> that 'bit rot' does not set in without you being aware of it.
> If this is a Bioconductor package, then it's appropriate to ask on the
> Bioconductor devel mailing list.
>   http://bioconductor.org/developers/
> http://bioconductor.org/packages/BiocStyle/ might be your friend for
> producing stylish vignettes.
> Martin
>>> (3) I could build the vignette outside of the package and put it into the
>>> inst/doc directory. This also has obvious drawbacks.
>>> (4) Leaving this example out defeats the purpose of my package.
>>> I am tending towards solution (2). What do you think?
>> Not clear how big of a static piece you're talking about, but maybe you
>> could set it up such that you use (2) as a fallback, i.e. have the vignette
>> include a static/pre-generated piece (which is clearly marked as such) only
>> if the external dependency is not available.
>> Just a thought
>> Henrik
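The fallback idea could be sketched roughly as follows, assuming a knitr-processed vignette (knitr evaluates chunk options as R expressions, so a logical variable can switch chunks on and off). The file name and chunk labels are hypothetical.

```r
# Path where the user is asked to place the manually downloaded file.
# The name is hypothetical; tempdir() keeps the sketch self-contained.
msigdb_file <- file.path(tempdir(), "msigdb_example_download.xml")
have_msigdb <- file.exists(msigdb_file)

# In a knitr vignette the flag can drive the chunk options directly:
#   <<parse-live, eval=have_msigdb>>=
#   ... code that reads the real file ...
#   @
#   <<parse-static, eval=!have_msigdb, echo=FALSE>>=
#   ... code that loads the pre-generated piece ...
#   @

# Mark the pre-generated piece clearly, as suggested above.
source_note <- if (have_msigdb) {
  "computed from the local MSigDB file"
} else {
  "static, pre-generated output (MSigDB file not found)"
}
message("This section is ", source_note)
```

On a build machine without the file, only the static branch runs, so the vignette still compiles and the restricted file never has to be part of the package.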
>>> Kind regards,
>>> j.
>>> --
>>> -------- January Weiner --------------------------------------
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793

-------- January Weiner --------------------------------------
