[BioC] how to build a R package with the inclusion of inst/extdata

Yue Li gorillayue at gmail.com
Fri Sep 7 03:25:23 CEST 2012

Sorry Steve, I'm actually stuck at building the package with inst/extdata. This is my first time trying to build an R package, so please bear with me. Let me walk you through my (incorrect) approach:

I have a set of R scripts and Rd files that need to be built into a package. I deliberately made all the examples in my Rd files trivial (for instance, simply running ls()) so that they pass the check. I can successfully build the package with the following steps:

(1) construct package skeleton in R console:
scriptDir <- "~/Desktop/myRscripts/"

outDir <- "~/Desktop/"

sourceFiles <- list.files(path=scriptDir, pattern="[a-zA-Z]+\\.R$", full.names=TRUE, recursive=TRUE)

package.skeleton(name="mypackage", code_files=sourceFiles, path=outDir)

I now have a folder named "mypackage" sitting on my ~/Desktop. In a shell script, I do this:

(2) replace the skeleton Rd files in ~/Desktop/mypackage/man with my prepared Rd files by:

cp ~/Desktop/myRDfiles/*.Rd ~/Desktop/mypackage/man/

(3) R CMD build ~/Desktop/mypackage

(4) R CMD check ~/Desktop/mypackage_0.99.0.tar.gz

(5) R CMD INSTALL ~/Desktop/mypackage_0.99.0.tar.gz

All of the above steps work fine. But now I am at the stage of writing concrete examples for each function, and I use R CMD check in step (4) to make sure that the examples run successfully during the check. Some of the examples use BAM files, and I need to put them into the package so that it ships with these BAM files as test data, exactly as the ShortRead package does.

I learned that creating a subdirectory called "inst/extdata" inside the package folder (as in ShortRead) is the conventional way to include test data. So after step (2), I do this:

cp inst/extdata ~/Desktop/mypackage

But then I cannot successfully perform (3), as it returns an error:

$ R CMD build mypackage/
* checking for file ‘mypackage/DESCRIPTION’ ... OK
* preparing ‘mypackage’:
* checking DESCRIPTION meta-information ... OK
* excluding invalid files
Subdirectory 'man' contains invalid file names:
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘mypackage_0.99.0.tar.gz’
/usr/bin/gnutar: mypackage/inst/extdata/expt1/accepted_hits_noDup.bam: file changed as we read it
/usr/bin/gnutar: mypackage/inst/extdata/expt2/accepted_hits_noDup.bam: file changed as we read it
/usr/bin/gnutar: mypackage/inst/extdata/expt3/accepted_hits_noDup.bam: file changed as we read it
packaging into .tar.gz failed

I'm just wondering at which step between (1) and (5) I could incorporate inst/extdata into the package so that the tarball contains it.
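
For reference, here is a minimal sketch of where such a copy could slot in, namely between steps (2) and (3), before R CMD build is run. It is only an illustration under assumptions: the temporary directories and the empty placeholder .bam files stand in for ~/Desktop/mypackage and the real test data, and it does not claim to explain the gnutar message above. The point it shows is that the directory tree has to be copied recursively so that it ends up at mypackage/inst/extdata.

```shell
#!/bin/sh
# Sketch only: "$pkg" and "$src" are hypothetical stand-ins for the real
# package directory and the directory that currently holds the BAM files.
set -eu
work=$(mktemp -d)
pkg="$work/mypackage"
src="$work/testdata"

# Fake a package skeleton and some (empty) placeholder data files.
mkdir -p "$pkg/R" "$pkg/man" "$src/expt1" "$src/expt2"
: > "$src/expt1/accepted_hits_noDup.bam"
: > "$src/expt2/accepted_hits_noDup.bam"

# The step itself: create inst/extdata inside the package, then copy the
# contents of the data directory into it recursively (-R).
mkdir -p "$pkg/inst/extdata"
cp -R "$src/." "$pkg/inst/extdata/"

# Show what ended up under inst/.
find "$pkg/inst" -type f | sort
```

With the real paths substituted, steps (3) to (5) would then be run unchanged on the package directory.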

Thanks very much for your patient help!


On 2012-09-06, at 7:50 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:

> Hi,
> On Thu, Sep 6, 2012 at 7:21 PM, Yue Li <gorillayue at gmail.com> wrote:
>> Hi Steven,
>> Thanks for the quick response. I think I probably didn't articulate my intent clearly.
> I actually understood your intent -- I thought you were confused on
> why you were getting some error when you ran the `R CMD build ...`
> command you posted previously.
> The problem was that you were trying to build something that wasn't
> really a package -- it seemed as if you were trying to build the
> *parent* directory your package directory was living in.
>> Basically, I'm trying to develop an R package rather than use someone else's package. In order to run some examples I have for the functions I wrote, I need to have BAM data saved in "inst/extdata" (or anywhere, for that matter). So when I call:
>> R CMD check mypackage
>> The example that says something like
>> testfiles <- system.file("inst/extdata/*bam$", package = "mypackage", )
>> can give me the BAM files saved in that inst/extdata/ that come with the tar ball package. But I'm too ignorant to figure out how to do that.
> If you want to do this pattern matching on *.bam, I'm pretty sure you
> can't do it in a call to system.file, so you'd first get a handle on
> your `extdata` directory, then call `dir` on it. For example (and to
> be extra explicit), assuming you install your package successfully, you
> would then do in R:
> R> extdata.dir <- system.file("extdata", package="myPackage")
> R> bamfiles <- dir(extdata.dir, pattern="\\.bam$", full.names=TRUE)
> The directory structure of your package would look something like this:
> myPackage
> `- inst
>     `- extdata
>             `- data1.bam
>             `- data2.bam
> `- R
>    `- ...
> And note that when you actually install the package, the contents
> inside the `inst` directory get "hoisted" out of it and dropped into
> the directory of your package, eg. after installation, on your
> filesystem the `extdata` directory would be something like:
> /path/to/your/R/library/myPackage/extdata/
> Download the source code of, say, the ShortRead package to see the
> structure you want to follow:
> http://www.bioconductor.org/packages/2.10/bioc/src/contrib/ShortRead_1.14.4.tar.gz
> HTH,
> -steve
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
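
As a way to confirm that a built tarball really carries the inst/extdata layout described in the quoted reply, the archive can be listed with tar. The sketch below is an illustration under assumptions: the miniature myPackage tree is made up, and a plain `tar -czf` stands in for the tarball that R CMD build would write, so that the example runs without R. On a real package the same `tar -tzf` listing would be run on the file produced in step (3).

```shell
#!/bin/sh
# Sketch only: the tree below is a hypothetical miniature of the layout
# shown in the quoted reply, with empty placeholder files.
set -eu
work=$(mktemp -d)
cd "$work"

mkdir -p myPackage/R myPackage/inst/extdata
: > myPackage/DESCRIPTION
: > myPackage/inst/extdata/data1.bam
: > myPackage/inst/extdata/data2.bam

# Stand-in for the tarball that `R CMD build` would normally produce.
tar -czf myPackage_0.99.0.tar.gz myPackage

# List the archive and keep only the BAM files under inst/extdata.
bam_in_tarball=$(tar -tzf myPackage_0.99.0.tar.gz | grep 'inst/extdata/.*\.bam$' | sort)
printf '%s\n' "$bam_in_tarball"
```

If the listing comes back empty for a real tarball, the data directory was most likely not inside the package's inst/ folder when the build ran.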
