[Rd] PROTECT and OCaml GC.

Mon Nov 30 18:08:31 CET 2009

Simon Urbanek wrote:
> 
> You're talking about two entirely different things -- bypassing the API 
> is a very bad idea, but it has nothing to do with your last paragraph. 

It's very good to hear that these are two different things. This has 
been quite unclear to me.

> The API gives you access all user-visible aspects of R which is all you 
> really need for any embedding -- that includes closure body, evaluation 
> etc. I see no reason why you should ever go lower than the API

Because I've been unable to find out what exactly applyClosure or eval 
requires of the structure of its LANGSXP argument, for example.

> since 
> that is unreliable and unsupported and thus you won't get any help with 
> that (that is IMHO the main reason why you get no responses here - and I 
> wouldn't expect any). All other functions are hidden on purpose since 
> they cover internal aspects that should not be relied upon.

Please let me be clear on my intentions:

-1- I intend to use only the API if possible.

-2- If not possible, I will perhaps use #define USE_RINTERNALS, which as 
I understand is not part of the API.

-3- The libR.so with exposed symbols is intended only as a replacement 
for GDB during development. Unfortunately, as things are not going as 
smoothly as they could, I am, for gdb-like purposes, progressively 
writing a new eval / applyClosure duo in OCaml.

Option -3- will not appear in the interface I will release.

In order to discriminate between option -1- and options -1- + -2-, could 
you please answer the following question, which I hope falls in the 
scope of legitimate questions on this mailing list:

Suppose I have an OCaml (or pure C if you wish) linked list of OCaml 
values wrapping SEXP values. Is it possible, using only the API, to 
create a LANGSXP / LISTSXP list out of these SEXPs?

I guess this is the crucial point where I hit the limits of the API. 
Please confirm or refute.
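
To make the question concrete, here is a toy model in OCaml. It is not a 
binding to R: the sexp type and both functions are hypothetical stand-ins 
that only show the shape of the operation, namely a right fold that 
conses each element onto a nil terminator, with the head cell tagged as a 
call. In a real C stub the corresponding constructors would presumably be 
Rf_cons and Rf_lcons from Rinternals.h; that mapping is an assumption to 
check against the headers.

```ocaml
(* Toy model of R pairlists; hypothetical, not the real R heap. *)
type sexp =
  | Nil                      (* stands in for R_NilValue *)
  | Sym of string            (* stands in for a SYMSXP *)
  | Str of string            (* stands in for a string value *)
  | Cons of sexp * sexp      (* LISTSXP-like cell: car, cdr *)
  | LCons of sexp * sexp     (* LANGSXP-like cell: function, arguments *)

(* Build a LISTSXP-like chain by consing from the right. *)
let listsxp_of_list (xs : sexp list) : sexp =
  List.fold_right (fun x tail -> Cons (x, tail)) xs Nil

(* Build a LANGSXP-like call: the head cell is tagged as a call,
   the remaining cells form an ordinary pairlist. *)
let langsxp_of_list (xs : sexp list) : sexp =
  match xs with
  | [] -> Nil
  | f :: args -> LCons (f, listsxp_of_list args)

let () =
  let call = langsxp_of_list [Sym "nchar"; Str "hello"] in
  assert (call = LCons (Sym "nchar", Cons (Str "hello", Nil)))
```

The open question is whether the API exposes enough to do the same fold 
over real SEXPs.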

> So again, I just think you're operating on the wrong level here -- and 
> this has nothing to do with the fact that you're binding to a functional 
> language since the mechanisms are the same regardless of the languages 
> (that's why Omegahat was used to bind into any random language that 
> seemed useful).

Will look into Omegahat. Not yet very familiar with R userland.

> You get more headaches since you have to decide how to 
> handle closures both ways, but I suspect the practical solution is to 
> use evaluators on the side where the function is defined (especially for 
> the R side since it includes non-S-language code so you simply cannot 
> map it).

Ok. So suppose I have wrapped an anonymous R closure, called op.

This closure takes one argument, a string, and yields an integer.

I therefore need to write a function "eval_this_op" whose type would be:

eval_this_op : (string -> int) R.t -> string R.t -> int R.t

Essentially, eval_this_op takes two arguments, a wrapped anonymous R 
closure and an R string, and yields an R integer.

How could you write such an eval_this_op function without first solving 
the crucial issue in the above paragraph, which is basically 
constructing a LANGSXP out of an anonymous closure and an R string?
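
A minimal sketch of the interface I have in mind, written against a toy 
model rather than libR. Everything inside module R below (the raw type, 
the tiny eval, the constructors) is hypothetical and only illustrates how 
the signature of eval_this_op would be enforced; the real version would 
have to build an actual LANGSXP and hand it to R's evaluator.

```ocaml
module R : sig
  type 'a t                                   (* 'a is a phantom parameter *)
  val string : string -> string t
  val closure : (string -> int) -> (string -> int) t
  val eval_this_op : (string -> int) t -> string t -> int t
  val to_int : int t -> int
end = struct
  (* Untyped toy stand-in for SEXP; hypothetical, not the real R heap. *)
  type raw =
    | Nil
    | Str of string
    | Int of int
    | Closure of (raw -> raw)
    | Lang of raw * raw list                  (* function, then arguments *)

  type 'a t = raw

  let string s = Str s

  (* Wrap an OCaml function so the toy evaluator can apply it. *)
  let closure f = Closure (function Str s -> Int (f s) | _ -> Nil)

  (* Toy evaluator: apply a one-argument closure call, else return as is. *)
  let eval = function
    | Lang (Closure f, [x]) -> f x
    | other -> other

  (* The crucial step: construct the call, then evaluate it. *)
  let eval_this_op op arg = eval (Lang (op, [arg]))

  (* The signature guarantees an Int here, so the fallback is unreachable. *)
  let to_int = function Int n -> n | _ -> assert false
end

let () =
  let op = R.closure String.length in
  let n = R.eval_this_op op (R.string "hello") in
  assert (R.to_int n = 5)
```

Outside the module, nothing but a (string -> int) R.t can be passed as 
the first argument, so no runtime check is needed at the call site.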

> If you have suggestions for extending the API, feel free to post them 
> with exact explanations how in general that extensions could be useful 
> (general is the key word here - I think so far it was rather to hack 
> around your way of implementing it). [And FWIW tryEval *is* part of the 
> API].

Please take into account that OCaml's type system is extremely strong. 
"My way of implementing it", as you call it, is essentially the most 
natural way to fit in the OCaml paradigm. I must satisfy both OCaml and 
R paradigms in order to write a correct binding.

Please note that it is not an embedding in a random application. It aims 
to be a full-blown, general-purpose binding. In OCaml, values are 
immutable. Really, really, really immutable. Or they are signals, 
immutable abstractions describing a value that changes over time. 
Symbols, variables and such are not welcome. References (~pointers) are 
statically typed and *cannot* be type cast. The type checking is so 
strong that you should almost never have to throw an exception. This 
means avoiding dynamic type-checking wherever it is possible to avoid it. 
This means that a function that takes a sexp to yield the underlying 
function should not have to raise an exception if the sexp is not a 
function. It should therefore not have to dynamically typecheck the sexp 
at runtime. This means that you have to enhance the type system to 
*statically* declare (or infer) that this sexp is a LANGSXP. Therefore 
you have to use a polymorphic type system (somewhat like C++ templated 
types) to say "lang sxp", "list sxp", "sym sxp", etc. You get the idea?
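
For readers unfamiliar with the technique, here is a small sketch of what 
such tagged types could look like. The tag types, the Sexp module and its 
functions are all hypothetical names over a toy representation; the only 
point is that the tag lives in the type, so misuse is rejected at compile 
time instead of raising at runtime.

```ocaml
(* Empty types used only as compile-time tags. *)
type lang
type pairlist
type sym

module Sexp : sig
  type 'kind t
  val sym : string -> sym t
  val nil : pairlist t
  val cons : 'a t -> pairlist t -> pairlist t
  val lang : 'f t -> pairlist t -> lang t
  val arity : lang t -> int        (* accepts calls only *)
end = struct
  (* Untyped toy representation, hidden behind the signature. *)
  type raw =
    | RNil
    | RSym of string
    | RCons of raw * raw
    | RLang of raw * raw

  type 'kind t = raw

  let sym s = RSym s
  let nil = RNil
  let cons x tail = RCons (x, tail)
  let lang f args = RLang (f, args)

  let rec length = function
    | RCons (_, tail) -> 1 + length tail
    | _ -> 0

  (* Only a lang t can reach this function, so the fallback is unreachable. *)
  let arity = function
    | RLang (_, args) -> length args
    | _ -> assert false
end

let () =
  let args = Sexp.cons (Sexp.sym "x") (Sexp.cons (Sexp.sym "y") Sexp.nil) in
  let call = Sexp.lang (Sexp.sym "f") args in
  assert (Sexp.arity call = 2)
  (* Sexp.arity (Sexp.sym "f") is rejected by the type checker. *)
```

The dynamic check still exists inside the module, but it is written once 
and can never fire for well-typed client code.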

This is not "my way". It's the OCaml way: They like to statically 
type-check *everything* , including HTML. Please have a look at section 
"Static typing of XHTML with XHTML.M" of

	http://ocsigen.org/eliom/manual/1.2.0/1#p1baseprinciples

Do you know why the Swig module for OCaml is virtually unused? Because 
the OCaml community does not consider it type-safe enough. And it will 
go much the same way for Haskell.

The "general" aspect of my request therefore concerns bindings to 
languages with 'inferred polymorphic static typing'. Please understand 
what these languages are about before dismissing my remarks as "my way". 
You may not care, you wouldn't be the first.

From Wikipedia: http://en.wikipedia.org/wiki/Objective_Caml

> OCaml's static type system eliminates a large class of programmer errors that may cause problems at runtime. However, it also forces the programmer to conform to the constraints of the type system, which can require careful thought and close attention. A type-inferring compiler greatly reduces the need for manual type annotations (for example, the data type of variables and the signature of functions usually do not need to be explicitly declared, as they do in Java). Nonetheless, effective use of OCaml's type system can require some sophistication on the part of the programmer.

Please understand that I take no joy and no fun in being a pain.

If you force me to write a binding that isn't type safe, it will go 
unused. This is simply not acceptable to me: I am unfortunately not 
willing to waste my time, and I would then eventually have to bypass the 
API. Please help me avoid that as much as is possible within these 
constraints.

-- 
      Guillaume Yziquel
http://yziquel.homelinux.org/