[Rd] On implementing zero-overhead code reuse

Kynn Jones kynnjo at gmail.com
Mon Oct 3 19:51:32 CEST 2016


Thank you all for your comments and suggestions.

@Frederik, my reason for mucking with environments is that I want to
minimize the number of names that import adds to my current
environment.  For instance, if module foo defines a function bar, I
want my client code to look like this:

  import("foo")
  foo$bar(1,2,3)

rather than

  import("foo")
  bar(1,2,3)

(Just a personal preference.)
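
A minimal sketch of that behaviour, for illustration only. It assumes a hypothetical module file foo.R, written to a temporary directory here so that the example is self-contained. The `lib` argument and the use of assign() are illustrative choices, not part of the function quoted further down in this thread.

```r
## Hypothetical module "foo.R", created in a temp directory for this demo.
lib_dir <- tempfile("rlib")
dir.create(lib_dir)
writeLines("bar <- function(...) sum(...)", file.path(lib_dir, "foo.R"))

## Sketch: evaluate the module file in its own environment and bind that
## environment under the module's name in the caller's environment.
import <- function(name, lib = lib_dir) {
    path <- file.path(lib, paste0(name, ".R"))
    if (!file.exists(path)) stop('file "', path, '" does not exist')
    mod <- new.env()
    source(path, local = mod)                  # definitions land in mod
    assign(name, mod, envir = parent.frame())  # only `name` is added
    invisible(mod)
}

import("foo")
foo$bar(1, 2, 3)   # bar is reached through the module name
ls(foo)            # names defined by the module
```

With this arrangement the only name that import("foo") adds to the calling environment is foo itself. Everything the module defines stays inside that environment.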

@Dirk, @Kasper, as I see it, the benefit of scripting languages like
Python, Perl, etc., is that they allow very quick development, with
minimal up-front cost.  Their main strength is precisely that one can,
without much difficulty, *immediately* start *programming
productively*, without having to worry at all about (to quote Dirk)
"repositories.  And package management.  And version control (at the
package level).  And ... byte compilation.  And associated
documentation.  And unit tests.  And continuous integration."

Of course, *eventually*, and for a fraction of one's total code base
(in my case, a *very small* fraction), one will want to worry about
all those things, but I see no point in burdening *all* my code with
all those concerns from the start.  Again, please keep in mind that
those concerns come into play for at most 5% of the code I write.

Also, I'd like to point out that the Python, Perl, etc. communities
are no less committed to all the concerns that Dirk listed (version
control, package management, documentation, testing, etc.) than the R
community is.  And yet, Python, Perl, etc. support the "zero-overhead"
model of code reuse.  There's no contradiction here.  Support for
"zero-overhead" code reuse does not preclude forms of code reuse with
more overhead.

One benefit of the zero-overhead model is that the concerns of
documentation, testing, etc. can be addressed with varying degrees of
thoroughness, depending on the situation's demands.  (For example,
documentation that would be perfectly adequate for me as the author of
a function would not be adequate for the general user.)

This means that the transition from writing private code to writing
code that can be shared with the world can be made much more
gradually, according to the programmer's needs and means.

Currently, in the R world, the choice for programmers is much starker:
either keep writing little scripts that one sources from an
interactive session, or learn to implement packages.  There's too
little in between.

Of course, from the point of view of someone who has already written
several packages, the barrier to writing a package may seem too small
to fret over, but adopting the expert's perspective is likely to
result in excluding the non-experts.

Best, kj


On Mon, Oct 3, 2016 at 12:06 PM, Kasper Daniel Hansen
<kasperdanielhansen at gmail.com> wrote:
>
>
> On Mon, Oct 3, 2016 at 10:18 AM, <frederik at ofb.net> wrote:
>>
>> Hi Kynn,
>>
>> Thanks for expanding.
>>
>> I wrote a function like yours when I first started using R. It's
>> basically the same up to your "new.env()" line, I don't do anything
>> with environments. I just called my function "mysource" and it's
>> essentially a "source with path". That allows me to find code I reuse
>> in standard locations.
>>
>> I don't know why R does not have built-in support for such a thing.
>> You can get it in C compilers with CPATH, and as you say in Perl with
>> PERL5LIB, in Python, etc. Obviously when I use my "mysource" I have to
>> remember that my code is now not portable without copying over some
>> files from other locations in my home directory. However, as a
>> beginner I find this tool to be indispensable, as R lacks several
>> functions which I use regularly, and I'm not necessarily ready to
>> confront the challenges associated with creating a package.
>
>
> I can pretty much guarantee that when you finally confront the "challenge"
> of making your own package you'll realize (1) it is pretty easy if the
> intention is only to use it yourself (and perhaps a couple of collaborators)
> - by easy I mean I can make a package in 5m max. (2) you'll ask yourself
> "why didn't I do this earlier?".  I still get that feeling now, when I have
> done it many times for internal use.  Almost every time I think I should
> have made an internal package earlier in the process.
>
> Of course, all of this is hard to see when you're standing in the middle of
> your work.
>
> Best,
> Kasper
>
>
>
>
>
>>
>> However, I guess since we can get your functionality pretty easily
>> using some lines in .Rprofile, that makes it seem less important to
>> have it built-in. In fact, if everyone has to implement their own
>> version of your "import", this almost guarantees that the function
>> won't appear by accident in any public code. My choice of name
>> "mysource" was meant to serve as a more visible lexical reminder that
>> the function is not meant to be seen by the public.
>>
>> By the way, why do you do the stuff with environments in your "import"
>> function?
>>
>> Dirk's take is interesting. I don't use version control for my
>> personal projects, just backing-up. Obviously not all R users are
>> interested in becoming package maintainers, in fact I think it would
>> clutter things a bit if this were the case. Or maybe it would be good
>> to have everyone publish their personal utility functions, who knows?
>> Anyway I appreciate Dirk's arguments, but I'm also a bit surprised
>> that Kynn and I seem to be the only ones who have written personal
>> functions to do what Kynn calls "zero-overhead code reuse". FWIW.
>>
>> Cheers,
>>
>> Frederick
>>
>> On Sun, Oct 02, 2016 at 08:01:58PM -0400, Kynn Jones wrote:
>> > Hi Frederick,
>> >
>> > I described what I meant in the post I sent to R-help
>> > (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html),
>> > but in brief, by "zero overhead" I mean that the only thing needed for
>> > library code to be accessible to client code is for it to be located
>> > in a designated directory.  No additional meta-files,
>> > packaging/compiling,
>> > etc. are required.
>> >
>> > Best,
>> >
>> > G.
>> >
>> > On Sun, Oct 2, 2016 at 7:09 PM,  <frederik at ofb.net> wrote:
>> > > Hi Kynn,
>> > >
>> > > Do you mind defining the term "zero-overhead model of code reuse"?
>> > >
>> > > I think I understand what you're getting at, but not sure.
>> > >
>> > > Thank you,
>> > >
>> > > Frederick
>> > >
>> > > On Sun, Oct 02, 2016 at 01:29:52PM -0400, Kynn Jones wrote:
>> > >> I'm looking for a way to approximate the "zero-overhead" model of
>> > >> code
>> > >> reuse available in languages like Python, Perl, etc.
>> > >>
>> > >> I've described this idea in more detail, and the motivation for this
>> > >> question in an earlier post to R-help
>> > >> (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html).
>> > >>
>> > >> (One of the responses I got advised that I post my question here
>> > >> instead.)
>> > >>
>> > >> The best I have so far is to configure my PROJ_R_LIB environment
>> > >> variable to point to the directory with my shared code, and put a
>> > >> function like the following in my .Rprofile file:
>> > >>
>> > >>     import <- function(name){
>> > >>         ## usage:
>> > >>         ## import("foo")
>> > >>         ## foo$bar()
>> > >>         path <- file.path(Sys.getenv("PROJ_R_LIB"),paste0(name,".R"))
>> > >>         if(!file.exists(path)) stop('file "',path,'" does not exist')
>> > >>         mod <- new.env()
>> > >>         source(path,local=mod)
>> > >>         list2env(setNames(list(mod),list(name)),envir=parent.frame())
>> > >>         invisible()
>> > >>     }
>> > >>
>> > >> (NB: the idea above is an elaboration of the one I showed in my first
>> > >> post.)
>> > >>
>> > >> But this is very much of an R noob's solution.  I figure there may
>> > >> already be more solid ways to achieve "zero-overhead" code reuse.
>> > >>
>> > >> I would appreciate any suggestions/critiques/pointers/comments.
>> > >>
>> > >> TIA!
>> > >>
>> > >> kj
>> > >>
>> > >> ______________________________________________
>> > >> R-devel at r-project.org mailing list
>> > >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> > >>
>> >
>>
>> On Sun, Oct 02, 2016 at 08:05:53PM -0400, Kynn Jones wrote:
>> > On Sun, Oct 2, 2016 at 8:01 PM, Kynn Jones <kynnjo at gmail.com> wrote:
>> > > Hi Frederick,
>> > >
>> > > I described what I meant in the post I sent to R-help
>> > > (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html),
>> > > but in brief, by "zero overhead" I mean that the only thing needed for
>> > > library code to be accessible to client code is for it to be located
>> > > in designed directory.  No additional meta-files, packaging/compiling,
>> >      ^^^^^^^^
>> >
>> > Sorry, I meant to write "designated".
>> >
>> > > etc. are required.
>> >
>>
>> On Sun, Oct 02, 2016 at 07:18:41PM -0500, Dirk Eddelbuettel wrote:
>> >
>> > Kynn,
>> >
>> > How much homework have you done researching any other "alternatives" to
>> > the
>> > package system?  I know of at least one...
>> >
>> > In short, just about everybody here believes in packages. And
>> > repositories.
>> > And package management.  And version control (at the package level). And
>> > maybe byte compilation.  And associated documentation.  And unit tests.
>> > And
>> > continuous integration.
>> >
>> > You don't have to -- that's cool.  Different strokes for different
>> > folks.
>> >
>> > But if you think you need something different you may just have to build
>> > that
>> > yourself.
>> >
>> > Cheers, Dirk
>> >
>> > --
>> > http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>> >
>>
>
>


