[Rd] Runnable R packages

Dirk Eddelbuettel edd at debian.org
Mon Jan 7 22:18:03 CET 2019


On 3 January 2019 at 11:43, David Lindelof wrote:
| Dear all,
| 
| I’m working as a data scientist in a major tech company. I have been using
| R for almost 20 years now and there’s one issue that’s been bugging me of
| late. I apologize in advance if this has been discussed before.
| 
| R has traditionally been used for running short scripts or data analysis
| notebooks, but there’s recently been a growing interest in developing full
| applications in the language. Three examples come to mind:
| 
| 1) The Shiny web application framework, which facilitates the development of
| rich, interactive web applications
| 2) The httr package, which provides lower-level facilities than Shiny for
| writing web services
| 3) Batch jobs run by data scientists according to, say, a cron schedule

That is a bit of a weird classification of "full applications". I have been
doing this for about as long as you have, and I have also provided (at least
as tests and demos)
  i)  GUI apps using tcl/tk (which comes with R) and
  ii) GUI apps with Qt (or even Wt), see my RInside package.

But my main weapon for 3) is littler. See

   https://cran.r-project.org/package=littler

and particularly the many examples at

   https://github.com/eddelbuettel/littler/tree/master/inst/examples
 
| Compared with other languages, R’s support for such applications is rather
| poor. The Rscript program is generally used to run an R script or an
| arbitrary R expression, but I feel it suffers from a few problems:
| 
| 1) It encourages developers of batch jobs to provide their code in a single
| R file (bad for code structure and unit-testability)
| 2) It provides no way to deal with dependencies on other packages
| 3) It provides no way to "run" an application provided as an R package

Err, no. See the examples/ directory above. Just about every single script in
it uses packages.
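
To make that concrete, here is a minimal sketch in the style of those scripts
(not a copy of any of them). It loads a package, picks up its command-line
arguments and hands the real work to a package function. The use of
tools::toTitleCase() as the "work", the names get_args() and main(), and the
fallback to commandArgs() so that the same file also runs under Rscript are
all assumptions made for illustration.

```r
#!/usr/bin/env r
## Sketch of a littler-style script; the task itself is a placeholder.

## The work lives in a package, not in the script.
library(tools)

## littler exposes the command-line arguments as 'argv'; fall back to
## commandArgs() so that the file also runs under Rscript.
get_args <- function() {
    if (exists("argv", inherits = TRUE)) argv else commandArgs(trailingOnly = TRUE)
}

## Hypothetical entry point: title-case each argument, one per line.
main <- function(args) {
    if (length(args) == 0) {
        cat("Usage: titlecase.r WORDS...\n")
        return(invisible(character()))
    }
    out <- toTitleCase(args)
    cat(out, sep = "\n")
    invisible(out)
}

main(get_args())
```

Anything the script needs beyond base R is just another library() call, and
the function doing the work can live in a package with its own tests.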

As illustrations I have long-running and somewhat visible cronjobs that are
implemented the same way: CRANberries (since 2007, now running hourly) and
CRAN Policy Watch (running once a day). Because both are 'hacks' I never
published the code but there is not that much to it. CRANberries just queries
CRAN, compares to what it had last, and writes out variants of the
DESCRIPTION file to text where a static blog engine (like Hugo, but older)
makes a feed and html pages out of it.  Oh, and we tweet because "why not?".
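
As the code is not published, the following is only a rough sketch of the
"compare to what it had last" step and not the actual CRANberries
implementation. The function names, the state file and the use of
available.packages() are assumptions for illustration; writing out the
DESCRIPTION variants for the blog engine is left out.

```r
## Illustrative sketch only; this is not the actual CRANberries code.

## Fetch the current package index from a CRAN mirror (needs network).
fetch_index <- function(repo = "https://cloud.r-project.org") {
    db <- available.packages(repos = repo)
    data.frame(Package = db[, "Package"], Version = db[, "Version"],
               stringsAsFactors = FALSE)
}

## Compare the current index with the previous snapshot.
diff_index <- function(old, new) {
    m <- merge(old, new, by = "Package", all = TRUE,
               suffixes = c(".old", ".new"))
    in_old <- !is.na(m$Version.old)
    in_new <- !is.na(m$Version.new)
    list(new     = m$Package[!in_old & in_new],
         updated = m$Package[in_old & in_new & m$Version.old != m$Version.new],
         removed = m$Package[in_old & !in_new])
}

## One cron run: diff against the stored snapshot, then store the new one.
run_once <- function(state_file, index = fetch_index()) {
    if (file.exists(state_file)) {
        changes <- diff_index(readRDS(state_file), index)
    } else {
        ## First run: there is nothing to compare with, so all is new.
        changes <- list(new = index$Package, updated = character(),
                        removed = character())
    }
    saveRDS(index, state_file)
    changes
}
```

A cron entry would then call something like run_once("state.rds") from a
script once an hour and pass the result on to whatever writes the posts.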
 
| For example, let’s say I want to run a Shiny application that I provide as
| an R package (to keep the code modular, to benefit from unit tests, and to
| declare dependencies properly). I would then need to a) uncompress my R
| package, b) somehow, ensure my dependencies are installed, and c) call
| runApp(). This can get tedious, fast.

Disagree here too. At work, I just write my code, organize it in packages,
update the packages and have shiny expose whatever makes sense.
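
As a sketch of what that can look like: the package exports a small launcher,
and the app itself sits under inst/ so that it gets installed along with the
package. The package name 'mypkg', the 'app' directory and the function name
run_app() are placeholders for illustration, not my actual setup.

```r
## Hypothetical launcher exported by a package called 'mypkg', with the
## Shiny app (app.R, or ui.R plus server.R) stored in inst/app/.
run_app <- function(pkg = "mypkg", ...) {
    if (!requireNamespace("shiny", quietly = TRUE))
        stop("the 'shiny' package is required", call. = FALSE)
    ## system.file() returns "" when the directory cannot be found.
    app_dir <- system.file("app", package = pkg)
    if (!nzchar(app_dir))
        stop("no 'app' directory found in package '", pkg, "'", call. = FALSE)
    ## Further arguments such as 'port' or 'host' go straight to runApp().
    shiny::runApp(app_dir, ...)
}
```

With shiny listed under Imports in DESCRIPTION, installing the package takes
care of the dependencies, and starting the app is a one-liner such as
mypkg::run_app(port = 8080) from R, Rscript or littler.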

| Other languages let the developer package their code in "runnable"
| artefacts, and let the developer specify the main entry point. The
| mechanics depend on the language but are remarkably similar, and suggest a
| way to implement this in R. Through declarations in some file, the
| developer can often specify dependencies and declare where the program’s
| "main" function resides. Consider Java:
| 
| Artefact: .jar file
| Declarations file: Manifest file
| Entry point: declared as 'Main-Class'
| Executed as: java -jar <jarfile>
| 
| Or Python:
| 
| Artefact: Python package, typically as .tar.gz source distribution file
| Declarations file: setup.py (which specifies dependencies)
| Entry point: special __main__.py module inside the package
| Executed as: python -m <package>
| 
| R has already much of this machinery:
| 
| Artefact: R package
| Declarations file: DESCRIPTION
| Entry point: ?
| Executed as: ?
| 
| I feel that R could benefit from letting the developer specify, possibly in
| DESCRIPTION, how to "run" the package. The package could then be run
| through, for example, a new R CMD command, for example:
| 
| R CMD RUN <package> <args>
| 
| I’m sure there are plenty of wrinkles in this idea that need to be ironed
| out, but is this something that has ever been considered, or that is on R’s
| roadmap?

Hm. If _you_ have an itch to scratch here, why don't _you_ implement a draft?
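
A first draft need not touch R itself. The sketch below assumes a
hypothetical 'EntryPoint' field in DESCRIPTION that names a function inside
the package. Neither that field nor 'R CMD RUN' exists today, and
run_package() is a made-up name.

```r
## Draft only: 'EntryPoint' is a hypothetical DESCRIPTION field, e.g.
##   EntryPoint: main
run_package <- function(pkg, args = character(), field = "EntryPoint") {
    ## For a single field, packageDescription() returns NA if it is absent.
    entry <- utils::packageDescription(pkg, fields = field)
    if (is.na(entry))
        stop("package '", pkg, "' declares no ", field, call. = FALSE)
    ## Look the function up in the package namespace, exported or not.
    fun <- get(entry, envir = asNamespace(pkg), mode = "function")
    fun(args)
}
```

Wrapped in a small littler or Rscript script, run_package("mypkg") would
behave much like the proposed command, and the dependencies are already
declared in the same DESCRIPTION file.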

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org


