[Rd] Julia

oliver oliver at first.in-berlin.de
Thu Mar 8 23:22:52 CET 2012


I don't think that using in-place modification as a general property would make
sense.

In-place modification brings in side-effects and that would mean that
the order of evaluation can change the result.

To get reliable results, the order of evaluation should not be able to
change the outcome, and that's the reason why the functional approach
is much better for reliable programs.
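
A minimal sketch of that point, assuming an R environment stands in for a
mutable object (ordinary R vectors are not modified in place from R code).
The names make_state, double_in_place and total are made up for illustration:

```r
# Hypothetical helpers; an environment is used because it has reference semantics.
make_state <- function() {
  e <- new.env()
  e$v <- c(1, 2, 3)
  e
}

# Side-effecting: overwrites the stored vector, then returns its sum.
double_in_place <- function(env) {
  env$v <- env$v * 2
  sum(env$v)
}

# Read-only: just sums the stored vector.
total <- function(env) sum(env$v)

# Order 1: read first, then modify.
s1 <- make_state()
t1 <- total(s1)
d1 <- double_in_place(s1)
result_1 <- t1 + d1

# Order 2: modify first, then read. Same two calls, different outcome.
s2 <- make_state()
d2 <- double_in_place(s2)
t2 <- total(s2)
result_2 <- t2 + d2

# Functional variant: no side effects, so the order does not matter.
double_pure <- function(v) sum(v * 2)
v <- c(1, 2, 3)
pure_1 <- sum(v) + double_pure(v)
pure_2 <- double_pure(v) + sum(v)

print(c(result_1, result_2))  # differ
print(c(pure_1, pure_2))      # agree
```

With the side-effecting version the two orders give different sums, while the
functional version gives the same value both ways.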

So, in general I would say this feature is a no-no;
I would rather discourage in-place modification.

In certain cases it might help,
but for those cases either such a boolean flag
or programming a separate module in C would make sense.

There could also be a global in-place flag that might be used (via options
maybe), but if such a thing were implemented, the default value should be
FALSE.
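
A rough sketch of how such a flag could look, under stated assumptions: the
option name "inplace.default" and the function cos_vec are hypothetical (no
such option exists in R), and an environment is used as the mutable holder
because plain vectors cannot be overwritten in place from R code:

```r
# Hypothetical: 'inplace.default' is NOT an existing R option.
# getOption() falls back to FALSE when the option is unset.
cos_vec <- function(holder, inplace = getOption("inplace.default", FALSE)) {
  if (isTRUE(inplace)) {
    holder$v <- cos(holder$v)    # overwrite the caller's data
    return(invisible(holder))
  }
  result <- new.env()            # fresh holder, input left untouched
  result$v <- cos(holder$v)
  result
}

h <- new.env()
h$v <- c(0, pi)

r <- cos_vec(h)                  # option unset: behaves functionally
unchanged <- identical(h$v, c(0, pi))

old <- options(inplace.default = TRUE)
cos_vec(h)                       # now modifies h itself
options(old)                     # restore the previous (unset) state
```

Because the fallback is FALSE, existing code keeps the usual semantics unless
someone switches the option on deliberately.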



Ciao,
   Oliver


On Thu, Mar 08, 2012 at 04:21:42PM +0000, William Dunlap wrote:
> So you propose an inplace=TRUE/FALSE entry for each
> argument to each function which may want to avoid
> allocating memory?  The major problem is that the function
> writer has no idea what the value of inplace should be,
> as it depends on how the function gets called.  This makes
> writing reusable functions (hence packages) difficult.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> > -----Original Message-----
> > From: oliver [mailto:oliver at first.in-berlin.de]
> > Sent: Thursday, March 08, 2012 7:40 AM
> > To: William Dunlap
> > Cc: R-devel
> > Subject: Re: [Rd] Julia
> > 
> > Ah, and you mean if it's an anonymous array it could be reused directly from the
> > args.
> > 
> > OK, now I see why you insist on the anonymous data thing.
> > I hadn't grasped it even in my last mail.
> > 
> > 
> > 
> > But that somehow also relates to what I wrote about reusing an already
> > existing, named vector.
> > 
> > Just the moment of in-place-modification is different.
> > 
> > From
> >   x  <- runif(n)
> >   cx <- cos(x)
> > 
> > instead of
> > > >     cx <- cos(x=runif(n)) # no allocation needed, use the input
> > > >                           # space for the return value
> > 
> > to something like
> > 
> >   cx  <- runif(n)
> >   cos( cx, inplace=TRUE)
> > 
> > or
> > 
> >   cos( runif(n), inplace=TRUE)
> > 
> > 
> > 
> > 
> > This way it would be possible to specify the reuse of the input *explicitly*
> > (without implicit rules like anonymous vs. named values).
> > 
> > 
> > 
> > In Pseudo-Code something like that:
> > 
> >    if (in_place == TRUE )
> >    {
> >      input_val[idx] = cos( input_val[idx] );
> >      return input_val;
> >    }
> >    else
> >    {
> >      result_val = alloc_vec( LENGTH(input_val), ... );
> >      result_val[idx] = cos( input_val[idx] );
> >      return result_val;
> >    }
> > 
> > 
> > 
> > Does this match what you were looking for?
> > 
> > 
> > Ciao,
> >    Oliver
> > 
> > 
> > On Thu, Mar 08, 2012 at 02:56:24PM +0100, oliver wrote:
> > > Hi,
> > >
> > > ok, thank you for clarifying what you meant.
> > > You only referred to the reuse of the args, not of an already
> > > existing vector.
> > > So I overgeneralized your example.
> > >
> > > But when looking at your example,
> > > and how I would implement the cos()
> > > I doubt I would use copying the args
> > > before calculating the result.
> > >
> > > Just allocate a result-vector, and then place the cos() of the
> > > input-vector into the result vector.
> > >
> > > I didn't look at how it is done in R, but I would guess it's like
> > > that.
> > >
> > >
> > >   In pseudo-Code something like that:
> > >     cos_val[idx] = cos( input_val[idx] );
> > >
> > > But R also handles complex data with cos() so it will look a bit more
> > > laborious.
> > >
> > > What I have seen so far from implementing C-extensions for R is rather
> > > C-ish, and so you have the control on many details. Copying the input
> > > just to read it would not make sense here.
> > >
> > > I doubt that R internally is doing that.
> > > Or did you find that in the R code?
> > >
> > > The other problem someone mentioned was *changing* the contents of a
> > > matrix... and that this is NOT done in-place when using a function
> > > for it.
> > > But the namespace-name / variable-name as "references" to the matrix
> > > might solve that problem.
> > >
> > >
> > > Ciao,
> > >   Oliver
> > >
> > >
> > >
> > > On Wed, Mar 07, 2012 at 07:10:43PM +0000, William Dunlap wrote:
> > > > > No, my examples are what I meant.  My point was that a function, say
> > > > > cos(), can act like it does call-by-value but conserve memory when
> > > > > it can, if it can distinguish between the case
> > > > >     cx <- cos(x=runif(n)) # no allocation needed, use the input
> > > > >                           # space for the return value
> > > > > and the case
> > > > >    x <- runif(n)
> > > > >    cx <- cos(x=x) # return value cannot reuse the argument's memory,
> > > > >                   # so allocate space for return value
> > > > >    sum(x)         # Otherwise sum(x) would return sum(cx)
> > > > The function needs to know if a memory block is referred to by a
> > > > name in any environment in order to do that.
> > > >
> > > > Bill Dunlap
> > > > Spotfire, TIBCO Software
> > > > wdunlap tibco.com
> > > >
> > > > > -----Original Message-----
> > > > > From: oliver [mailto:oliver at first.in-berlin.de]
> > > > > Sent: Wednesday, March 07, 2012 10:22 AM
> > > > > To: Dominick Samperi
> > > > > Cc: William Dunlap; R-devel
> > > > > Subject: Re: [Rd] Julia
> > > > >
> > > > > On Tue, Mar 06, 2012 at 12:49:32PM -0500, Dominick Samperi wrote:
> > > > > > > On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap
> > > > > > > <wdunlap at tibco.com> wrote:
> > > > > > > S (and its derivatives and successors) promises that functions
> > > > > > > will not change their arguments, so in an expression like
> > > > > > >   val <- func(arg)
> > > > > > > you know that arg will not be changed.  You can do that by
> > > > > > > having func copy arg before doing anything, but that uses
> > > > > > > space and time that you want to conserve.
> > > > > > > If arg is not a named item in any environment then it should
> > > > > > > be fine to write over the original because there is no way the
> > > > > > > caller can detect that shortcut.  E.g., in
> > > > > > >    cx <- cos(runif(n))
> > > > > > > the cos function does not need to allocate new space for its
> > > > > > > output, it can just write over its input because, without a
> > > > > > > name attached to it, the caller has no way of looking at what
> > > > > > > runif(n) returned.  If you did
> > > > > > >    x <- runif(n)
> > > > > > >    cx <- cos(x)
> > > > >
> > > > > You have two names here, x and cx, hence your example does not fit
> > > > > into what you want to explain.
> > > > >
> > > > > A better example would be:
> > > > > x <- runif(n)
> > > > > x <- cos(x)
> > > > >
> > > > >
> > > > >
> > > > > > > then cos would have to allocate new space for its output
> > > > > > > because overwriting its input would affect a subsequent
> > > > > > >    sum(x)
> > > > > > > I suppose that end-users and function-writers could learn to
> > > > > > > live with having to decide when to copy, but not having to
> > > > > > > make that decision makes S more pleasant (and safer) to use.
> > > > > > > I think that is a major reason that people are able to share S
> > > > > > > code so easily.
> > > > > >
> > > > > > But don't forget the "Holy Grail" that Doug mentioned at the
> > > > > > start of this thread: finding a flexible language that is also
> > > > > > fast. Currently many R packages employ C/C++ components to
> > > > > > compensate for the fact that the R interpreter can be slow, and
> > > > > > the pass-by-value semantics of S provides no protection here.
> > > > > [...]
> > > > >
> > > > > The distinction imperative vs. functional has nothing to do with
> > > > > the distinction interpreted vs. directly executed.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Thinking again on the problem that was mentioned here, I think it
> > > > > might be circumvented.
> > > > >
> > > > > > Looking again at R's properties, looking again into U. Ligges'
> > > > > > "Programmieren in R", I saw there was mentioned that in R anything
> > > > > > (?!) is an object... so then it's OOP; but also it was mentioned,
> > > > > > R is a functional language. But this does not mean it's purely
> > > > > > functional or has no imperative data structures.
> > > > >
> > > > > > As R relies heavily on vectors, here we have an imperative data
> > > > > > structure.
> > > > > >
> > > > > > So it rather looks to me that "<-" does work in-place on the vectors,
> > > > > > even though "<-" itself is a function (which does not matter for the
> > > > > > problem).
> > > > >
> > > > > > If that's true (I assume here it is; correct me if it's wrong),
> > > > > > then I think assigning with "<<-" and assign() would also do an
> > > > > > imperative (in-place) change of the contents.
> > > > >
> > > > > > Then the copying-of-big-objects-when-passed-as-args problem can be
> > > > > > circumvented by working either on a variable in the GlobalEnv (using
> > > > > > "<<-"), or by using a certain environment for the big data and
> > > > > > passing its name (and the variable's) as a value to the function,
> > > > > > which then uses assign() and get() to work on that data.
> > > > > > Then in-place modification should be possible.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > In 2008 Ross Ihaka and Duncan Temple Lang published the paper
> > > > > > "Back to the Future: Lisp as a base for a statistical computing
> > > > > > system" where they propose Common Lisp as a new foundation for
> > > > > > > R. They suggest that this could be done while maintaining the same
> > > > > > > familiar R syntax.
> > > > > >
> > > > > > A key requirement of any strategy is to maintain easy access to
> > > > > > the huge universe of existing C/C++/Fortran numerical and
> > > > > > graphics libraries, as these libraries are not likely to be rewritten.
> > > > > >
> > > > > > Thus there will always be a need for a foreign function
> > > > > > interface, and the problem is to provide a flexible and
> > > > > > type-safe language that does not force developers to use another
> > > > > > > unfamiliar, less flexible, and error-prone language to optimize the
> > > > > > > hot spots.
> > > > >
> > > > > > If I hear "type safe" I would rather think about OCaml or maybe
> > > > > > Ada, but not LISP.
> > > > >
> > > > > > Also, LISP has so many "("'s and ")"'s that it's making people
> > > > > > go crazy ;-)
> > > > >
> > > > > Ciao,
> > > > >    Oliver
> > >
> > > ______________________________________________
> > > R-devel at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel


