[Rd] Runnable R packages

David Lindelof
Thu Jan 31 15:32:01 CET 2019


Belated thanks to all who replied to my initial query. In summary, three
approaches have been mentioned to run R code "in production": 1)
ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
Docker-like solutions, mentioned by Gergely and Iñaki; and 3) Solutions
based on Rscript or littler, mentioned by Dirk.

I can't speak to 1) because I don't currently use Shiny. And it seems to me
that Docker-like solutions will still need some "point of entry" for the R
application, which will have to be Rscript or littler.

In my first email, I observed that Rscript expects a single expression or a
single script, which is probably why (in my experience) many data
scientists tend to provide their code in a very limited number of files.
Gergely disagreed, arguing on the contrary that data scientists are
encouraged to provide their application as an R package called by a short
script executed by Rscript. But this doesn't happen where I work for
several reasons:

   - it implies installing your package on the production machine(s),
   including its dependencies, which must be done by hand
   - some machine learning platforms will simply not accept code provided
   as an R package
   - we have some "big data" use cases for which we need Spark; Spark can
   run R or Python code, but only when it is provided as a single file. (On
   the other hand, Spark can run applications provided as JAR files)
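
For concreteness, the package-plus-short-script pattern discussed above
might be sketched as follows. This is only an illustration: the helper
name `run_entry_point` and the convention of an exported function called
`main` are assumptions, not anything R defines. The demonstration call
uses `base::toupper` purely as a stand-in for a real package's entry
point, so that the snippet runs without installing anything.

```r
# Hypothetical helper: look up an exported function in an installed
# package and call it with the command-line arguments.
run_entry_point <- function(pkg, fun = "main",
                            args = commandArgs(trailingOnly = TRUE)) {
  # Fail early if the package is not installed on this machine.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("package '%s' is not installed", pkg), call. = FALSE)
  }
  # Equivalent to pkg::fun, but with the names supplied as strings.
  entry <- getExportedValue(pkg, fun)
  if (!is.function(entry)) {
    stop(sprintf("'%s::%s' is not a function", pkg, fun), call. = FALSE)
  }
  entry(args)
}

# Stand-in demonstration: treat base::toupper as the "main" function.
result <- run_entry_point("base", "toupper", args = c("hello", "world"))
print(result)
```

Saved in a file and given a real package name (say, a hypothetical
`mypkg` exporting `main`), the same call could be launched with Rscript.
The package and its dependencies would still have to be installed
beforehand, which is the first objection in the list above.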

In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` manifest header or Python's `__main__.py` module. A new R CMD command
would take a package, install its dependencies, and run its "main"
function. If we had this machinery available, we could even consider
reaching out to the developers of Spark (and other tech stacks) to make it
easier to develop R applications for those platforms.
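
One way such a declaration could look, by analogy with `Main-Class`, is an
extra field in the package's DESCRIPTION file. The sketch below is an
assumption throughout: the field name `Main-Function`, the package name
`mypkg`, and the helper `main_function_field` are all made up for
illustration, and no existing R CMD command reads such a field. It relies
only on `read.dcf`, which parses the DESCRIPTION format.

```r
# Write a throwaway DESCRIPTION containing the hypothetical field.
desc_file <- tempfile()
writeLines(c("Package: mypkg",
             "Version: 0.1.0",
             "Main-Function: main"),
           desc_file)

# Hypothetical helper: return the declared entry point as a string,
# or NA if the DESCRIPTION file does not declare one.
main_function_field <- function(path, field = "Main-Function") {
  dcf <- read.dcf(path, fields = field)
  unname(dcf[1, field])
}

entry_name <- main_function_field(desc_file)
print(entry_name)
```

A runner command could then resolve the named function in the installed
package and call it with the remaining command-line arguments.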

A candid comment from Dirk suggested that I should implement this myself,
which I would be happy to do, provided this is the normal procedure. Or is
there a more formal process I should follow?

Kind regards,

David Lindelöf
