[BioC] how to build a R package with the inclusion of inst/extdata

Martin Morgan mtmorgan at fhcrc.org
Fri Sep 7 07:02:07 CEST 2012


On 09/06/2012 06:25 PM, Yue Li wrote:
> Sorry Steve, I'm actually stuck at building the package with inst/extdata. This is my first time trying to build an R package, so please bear with me. Let me walk you through my (incorrect) approach:
>
> I have a set of R scripts and Rd files that need to be built into a package. I deliberately make all examples in my Rd files trivial such as simply running ls() to pass the test.  I can successfully build the package by running the following steps:
>
> (1) construct package skeleton in R console:
> scriptDir <- "~/Desktop/myRscripts/"
>
> outDir <- "~/Desktop/"
>
> sourceFiles <- list.files(path=scriptDir, pattern="[a-zA-Z]+\\.R$", full.names=TRUE, recursive=TRUE)
>
> package.skeleton(name="mypackage", code_files=sourceFiles, path=outDir)
>
> I now have a folder named "mypackage" sitting on my ~/Desktop. In a shell script, I do this:
>
> (2) replace the skeleton Rd files in ~/Desktop/mypackage/man with my prepared Rd files by:
>
> cp ~/Desktop/myRDfiles/*.Rd ~/Desktop/mypackage/man/
>
> (3) R CMD build ~/Desktop/mypackage
>
> (4) R CMD check ~/Desktop/mypackage_0.99.0.tar.gz
>
> (5) R CMD INSTALL ~/Desktop/mypackage_0.99.0.tar.gz
>
>
> All of the above steps work fine. But now I am at the stage of writing concrete examples for each function, using R CMD check in step (4) to make sure that the examples run successfully at check time. Some of the examples use BAM files, and I need to put them into the package so that it ships with these BAM files as test data, exactly as the ShortRead package does.
>
> I learned that creating a subdirectory called "inst/extdata" inside the package folder (as in ShortRead) is the conventional way to include test data. So after step (2), I do this:
>
> cp inst/extdata ~/Desktop/mypackage

This is a bit unusual -- a plain cp inst/extdata should complain that 
you're trying to copy a directory; you need cp -r instead. I draw 
attention to this because otherwise it sounds like you've done things 
correctly...
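
For illustration, here is a minimal sketch of the recursive copy, run in a scratch directory so that nothing real is touched. The paths and the toy.bam file are hypothetical stand-ins, not your actual data. Note that the destination is the package directory and the thing copied is 'inst' itself, so the files end up under mypackage/inst/extdata.

```shell
# Scratch area; every path below is a made-up stand-in.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/inst/extdata/expt1" "$workdir/mypackage"
touch "$workdir/src/inst/extdata/expt1/toy.bam"

# A plain cp refuses to copy a directory; -r copies the whole tree.
# Copying 'inst' itself into the package keeps the inst/extdata layout.
cp -r "$workdir/src/inst" "$workdir/mypackage/"

# Show which files ended up inside the package directory.
find "$workdir/mypackage" -type f
```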

>
>
> But then I cannot successfully perform (3) as it returns error:
>
> $ R CMD build mypackage/
> * checking for file ‘mypackage/DESCRIPTION’ ... OK
> * preparing ‘mypackage’:
> * checking DESCRIPTION meta-information ... OK
> * excluding invalid files
> Subdirectory 'man' contains invalid file names:
>    ‘.Rhistory’
> * checking for LF line-endings in source and make files
> * checking for empty or unneeded directories
> * building ‘mypackage_0.99.0.tar.gz’
> /usr/bin/gnutar: mypackage/inst/extdata/expt1/accepted_hits_noDup.bam: file changed as we read it
> /usr/bin/gnutar: mypackage/inst/extdata/expt2/accepted_hits_noDup.bam: file changed as we read it
> /usr/bin/gnutar: mypackage/inst/extdata/expt3/accepted_hits_noDup.bam: file changed as we read it

These messages are unusual. It looks to me like your package structure 
is correct, and that tar is failing because of some unfortunate 
interaction with your file system.

Are these BAM files large? A first suggestion would be to try with 
smaller 'toy' files, e.g. (assuming you have backups)

   rm -rf mypackage/inst/extdata/expt*
   touch mypackage/inst/extdata/toy.file

also, might as well clean up while we're at it

   rm mypackage/man/.Rhistory

and then

   R CMD build mypackage
   R CMD INSTALL mypackage_0.99.0.tar.gz
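
Before installing, you could also confirm that inst/extdata made it into the tarball by listing the archive. The sketch below uses plain tar on a throwaway directory as a stand-in for the gzipped tarball that R CMD build writes; the directory and file names are hypothetical.

```shell
# Throwaway package-like tree; names are illustrative only.
workdir=$(mktemp -d)
mkdir -p "$workdir/mypackage/inst/extdata"
touch "$workdir/mypackage/inst/extdata/toy.file"

# Stand-in for 'R CMD build': pack the directory as a gzipped tarball.
tar -czf "$workdir/mypackage_0.99.0.tar.gz" -C "$workdir" mypackage

# List the archive and keep only the entries under inst/extdata.
listing=$(tar -tzf "$workdir/mypackage_0.99.0.tar.gz" | grep 'inst/extdata/')
echo "$listing"
```

With a real build, the same tar -tzf listing can be run directly on mypackage_0.99.0.tar.gz.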

and then in R

   library(mypackage)
   extdata.dir <- system.file("extdata", package="mypackage")
   dir(extdata.dir, full.names=TRUE)

Martin

>   ERROR
> packaging into .tar.gz failed
>
>
> I'm just wondering at which step between (1) and (5) could I somehow incorporate the inst/extdata into the package and make the tar ball containing the inst/extdata.
>
> Thanks very much for your patient help!
> Yue
>
> On 2012-09-06, at 7:50 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
>
>> Hi,
>>
>> On Thu, Sep 6, 2012 at 7:21 PM, Yue Li <gorillayue at gmail.com> wrote:
>>> Hi Steven,
>>>
>>> Thanks for the quick response. I think I probably didn't articulate my intent clearly.
>>
>> I actually understood your intent -- I thought you were confused on
>> why you were getting some error when you ran the `R CMD build ...`
>> command you posted previously.
>>
>> The problem was that you were trying to build something that wasn't
>> really a package -- it seemed as if you were trying to build the
>> *parent* directory your package directory was living in.
>>
>>> Basically, I'm trying to develop an R package rather than use someone else's package. In order to run some examples I have for the functions I wrote, I need to have BAM data saved in "inst/extdata" (or anywhere, for that matter). So when I call:
>>>
>>> R CMD check mypackage
>>>
>>> The example that says something like
>>>
>>> testfiles <- system.file("inst/extdata/*bam$", package = "mypackage", )
>>>
>>> can give me the BAM files saved in that inst/extdata/ that come with the tar ball package. But I'm too ignorant to figure out how to do that.
>>
>> If you want to do this pattern matching on *.bam, I'm pretty sure you
>> can't do it in a call to system.file, so you'd first get a handle on
>> your `extdata` directory, then call `dir` on it. For example (and to
>> be extra explicit), assuming you install your package successfully, you
>> would then do in R:
>>
>> R> extdata.dir <- system.file("extdata", package="myPackage")
>> R> bamfiles <- dir(extdata.dir, pattern="\\.bam$", full.names=TRUE)
>>
>> The directory structure of your package would look something like this:
>>
>> myPackage
>> `- inst
>>      `- extdata
>>              `- data1.bam
>>              `- data2.bam
>> `- R
>>     `- ...
>> `- NAMESPACE
>> `- DESCRIPTION
>>
>> And note that when you actually install the package, the contents
>> inside the `inst` directory get "hoisted" out of it and dropped into
>> the directory of your package, e.g. after installation, on your
>> filesystem the `extdata` directory would be something like:
>>
>> /path/to/your/R/library/myPackage/extdata/
>>
>> Download the source code of, say, the ShortRead package to see the
>> structure you want to follow:
>>
>> http://www.bioconductor.org/packages/2.10/bioc/src/contrib/ShortRead_1.14.4.tar.gz
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


