[Rd] on ptr_R_WriteErrConsole (was: Re: RFC: API to allow identification of warnings and errors in the output stream)

Tue Oct 18 12:06:59 CEST 2005

> I second the point about not having made a case, and also that (from the
> part deleted below) about needing to understand the internals before
> proposing changes.

I'll try to make the case below, under point d). I fully accept rejection of
the more daring part of my proposal, but the part about
ptr_R_WriteErrConsole is really straightforward, and I have made sure to
understand the internal goings-on here.
My reply will be lengthy, because this really is important to me.

> a) I took a quick look at the patch, and it seems to be only half the
> story.  R_WriteConsole is also supported under Windows, in a different
> (and more flexible) way.  I think the part not covered is a lot more
> problematic, as I don't see how to introduce R_WriteErrConsole in a
> backwards compatible way.

I admit I've been negligent about the Windows code. I don't have a Windows
machine to compile and test any changes I make in my local sources, and that
led me to be careless. So the patch is probably not correct.
Do I still believe there is a backwards compatible way to introduce
ptr_R_WriteErrConsole / R_WriteErrConsole? Yes:

void R_WriteErrConsole(char *buf, int len) {ptr_R_WriteErrConsole(buf, len);}
...
/* standard initialization: */
ptr_R_WriteErrConsole = R_WriteConsole;

Note the default: ptr_R_WriteErrConsole initially points to
*R_WriteConsole*. This ensures that if you do not touch the new
ptr_R_WriteErrConsole, you get *exactly* the old behavior. It can't be much
easier than that.
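
To make the backwards-compatibility argument concrete, here is a minimal,
self-contained sketch. It is not the actual patch: R_WriteConsole is
replaced by a stand-in that appends to a buffer, and gui_write_err,
console_buf, message_buf and demo_default_and_override are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for R's console writer: appends to a buffer so we can inspect it. */
static char console_buf[256];
static void R_WriteConsole(char *buf, int len)
{
    strncat(console_buf, buf, (size_t)len);
}

/* Proposed hook. By default it is simply the ordinary console writer. */
static void (*ptr_R_WriteErrConsole)(char *, int) = R_WriteConsole;

static void R_WriteErrConsole(char *buf, int len)
{
    ptr_R_WriteErrConsole(buf, len);
}

/* Hypothetical front-end callback that keeps "message" output apart. */
static char message_buf[256];
static void gui_write_err(char *buf, int len)
{
    strncat(message_buf, buf, (size_t)len);
}

/* Exercises the default path first, then the overridden path. */
static int demo_default_and_override(void)
{
    char out[] = "output\n";
    char warn[] = "Warning: x\n";

    console_buf[0] = '\0';
    message_buf[0] = '\0';
    ptr_R_WriteErrConsole = R_WriteConsole;

    /* Untouched pointer: messages land on the console, in call order. */
    R_WriteConsole(out, (int)strlen(out));
    R_WriteErrConsole(warn, (int)strlen(warn));
    assert(strcmp(console_buf, "output\nWarning: x\n") == 0);
    assert(message_buf[0] == '\0');

    /* A front end that opts in receives messages through its own callback. */
    ptr_R_WriteErrConsole = gui_write_err;
    R_WriteErrConsole(warn, (int)strlen(warn));
    assert(strcmp(message_buf, "Warning: x\n") == 0);
    assert(strcmp(console_buf, "output\nWarning: x\n") == 0);
    return 0;
}
```

In this sketch, only a front end that assigns to the pointer sees any
difference; everything else keeps writing through R_WriteConsole.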

> b) There are several inaccuracies in the account.  First, `channels' are
> not what we are talking about here.  sink() allow output and messages to
> be sent to different connections, but the console is connected to fixed
> terminal connections stdout() and stderr().  There is a long-standing
> design for R consoles (sketched in src/unix/system.txt) that has all the
> output going to the console.  This is even true of the standard Unix
> terminal front end.  The only circumstances in which stdout() and stderr()
> are separated are when R is used for scripting.  Part of the motivation
> for this is to ensure that stdout() and stderr() appear in an integrated
> way (in particular, interleaved in the correct order) on a console.

Ok, so I have been simplifying, and using my own terminology. What I'm saying
is that output generally takes one of two routes *throughout* the R sources:

"message":
[warnings]->...->REprintf->REvprintf
[errors]->...->REprintf->REvprintf
REprintf->REvprintf

"output":
[print]->...->Rprintf->Rvprintf
Rprintf->Rvprintf

So it is not like I'm making up a new distinction here. I'm referring to an
existing one, available via the already public API. See also this comment from
printutils.c:

/* =========
 * Printing:
 * =========
 *
 * All printing in R is done via the functions Rprintf and REprintf
 * or their (v) versions Rvprintf and REvprintf.
 * These routines work exactly like (v)printf(3).  Rprintf writes to
 * ``standard output''.	 It is redirected by the sink() function,
 * and is suitable for ordinary output.	 REprintf writes to
 * ``standard error'' and is useful for error messages and warnings.
 * It is not redirected by sink().

All I'm dealing with is, what should happen in REvprintf, the last station in 
what I call the "message route":

Currently it is basically (and yes, I'm simplifying again):
void REvprintf(const char *format, va_list arg)
{
	/* A: if there is a sink(type="message"), print to that and return */ 
	/* B: if RConsoleFile is not NULL, print to that and return */ 
	/* C: if none of the above, call R_WriteConsole */ 
}

All I'm talking about is case C, and effectively, my proposal would change 
this to
	/* C: if ptr_R_WriteErrConsole has been modified by a stubborn developer,
	      print to that and return */
	/* D: if none of the above, call R_WriteConsole */ 
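
The dispatch order above can be modelled in a few lines. This is only a
sketch under stated assumptions, not the code in printutils.c: the sink and
the console file are reduced to flags and buffers, and all names ending in
_sketch, as well as append, gui_write_err, demo_dispatch and the buffers,
are hypothetical. Because the pointer defaults to R_WriteConsole, cases C
and D collapse into a single indirect call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define BUFSIZE 512

static char sink_buf[BUFSIZE], file_buf[BUFSIZE];
static char console_buf[BUFSIZE], hook_buf[BUFSIZE];
static int message_sink_active = 0;  /* models sink(type="message") */
static int console_file_open = 0;    /* models a non-NULL console file */

/* Bounded append of len bytes of src to the NUL-terminated buffer dst. */
static void append(char *dst, const char *src, int len)
{
    size_t used = strlen(dst);
    size_t room = BUFSIZE - 1 - used;
    size_t n = (size_t)len < room ? (size_t)len : room;
    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
}

static void R_WriteConsole(char *buf, int len) { append(console_buf, buf, len); }
static void (*ptr_R_WriteErrConsole)(char *, int) = R_WriteConsole;
static void gui_write_err(char *buf, int len) { append(hook_buf, buf, len); }

static void REvprintf_sketch(const char *format, va_list arg)
{
    char buf[BUFSIZE];
    int len = vsnprintf(buf, sizeof buf, format, arg);
    if (len < 0) return;
    if (len >= (int)sizeof buf) len = (int)sizeof buf - 1;  /* truncated */

    if (message_sink_active) { append(sink_buf, buf, len); return; }  /* A */
    if (console_file_open)   { append(file_buf, buf, len); return; }  /* B */
    ptr_R_WriteErrConsole(buf, len);  /* C if overridden, D by default */
}

static void REprintf_sketch(const char *format, ...)
{
    va_list ap;
    va_start(ap, format);
    REvprintf_sketch(format, ap);
    va_end(ap);
}

static int demo_dispatch(void)
{
    sink_buf[0] = file_buf[0] = console_buf[0] = hook_buf[0] = '\0';
    message_sink_active = console_file_open = 0;
    ptr_R_WriteErrConsole = R_WriteConsole;

    REprintf_sketch("Warning %d\n", 1);        /* D: default console */
    assert(strcmp(console_buf, "Warning 1\n") == 0);

    ptr_R_WriteErrConsole = gui_write_err;
    REprintf_sketch("Warning %d\n", 2);        /* C: front-end hook */
    assert(strcmp(hook_buf, "Warning 2\n") == 0);

    message_sink_active = 1;
    REprintf_sketch("Warning %d\n", 3);        /* A: sink still wins */
    assert(strcmp(sink_buf, "Warning 3\n") == 0);
    assert(strcmp(hook_buf, "Warning 2\n") == 0);
    return 0;
}
```

The point of the model is that cases A and B are untouched: an active
message sink or console file still takes precedence over the hook.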

> No other console designer has seen a need for this, and there is at least
> the potential for presenting very misleading information to end users.

Let me assure you this: The interface pointers give me enough rope already to 
shoot myself in the foot. I have enough opportunity to present extremely 
misleading information to end users. Re-read my answer to a) to see that the
change really, really, really would not affect any code that does not
explicitly ask for this extra inch of rope.
Also let me assure you of this: other GUI designers are very interested in an
easy way to find out which portions of the output have come the "message"
route, as well. We do crazy stuff like using separate file-sinks, grepping 
for "warning", or "error", and things like that. I've seen hacks and hacks, 
and I've used hack after hack. But I think there is a good opportunity for a 
solid solution.

> So 
> one ramification of allowing consoles to present stdout() and stderr()
> separately would have to be a review of how they are used.  This is not
> hypothetical: I have struggled many times over the years with an R-like
> system which when scripted wrote error messages in inappropriate places on
> a file, at one point sending prompts to stderr yet echoed input to stdout.
> We've worked hard in R to avoid that kind of thing, mainly by having a
> single route to the console.  (There are still residual problems if C++ or
> Fortran I/O is used, BTW, and note that R_ReadConsole also *writes* to the
> console.)

And the single route to the console will remain intact using R_WriteConsole. 
I'm only asking for the *opportunity*, not the *obligation* to intercept the 
one call to R_WriteConsole in REvprintf.

> c) Anything in R involving more than one of the three main families of
> platforms is NOT a `small addition': it involves testing and subsequent
> maintenance by two or three people.

So, yes, somebody would have to add corresponding code for Windows.
Unfortunately, I don't think I qualify for this, as I just can't test on
Windows. I'm fairly confident the change I made in gnuwin32 will ensure
nothing is broken, but you would want a parallel to ptr_R_WriteErrConsole on
Windows for consistency's sake.
But please: Don't conjure up a maintenance nightmare for this simple change.

> d) There is an existing mechanism that could be used.  If you want
> file-like stderr and stdout, you could drive R via a file-like interface
> (e.g. ptys).  That is not easy on non-Unix-alike platforms (and was
> probably impossible on classic MacOS R), but I understood Thomas was using
> KDE.  (There are live examples of this route, even on Windows.)

And it's not like I haven't ventured along that route. Do you know how much 
fun it is to use two separate ptys, then try to make sure the output arrives 
in a sensible order, i.e. you don't get all warnings before all output or 
vice versa, or some strange mixture? I'm not an infallible programmer, and
I'm not an expert in R internals. But before going into lengthy discussions,
I did check my options.

So again: Why do I want something like this?
A GUI may want to do some things, which are not needed on the console. One 
such thing is to identify "message" output. There are several uses for this:
1) Highlight "message" output to bring it to the user's attention
2) Offer context help on "message"s. Of course this is easier said than done,
but the first step in this would be to find out which portions of the output
are messages.
3) Show "message"s that come up in operations that would usually redirect the 
output elsewhere, and not show it to the user directly. Much like the 
scripting situation you depicted in b)
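
As a rough illustration of uses 1) to 3), a front end could feed both
callbacks into one ordered log of tagged chunks, so that "message" output
is identifiable without losing the interleaving with ordinary output. This
is a sketch only: gui_write, gui_write_err, record, count_messages,
demo_tagged_log and the chunk types are hypothetical names, and the two
callbacks are assumed to be installed as the console writer and as the
proposed ptr_R_WriteErrConsole respectively.

```c
#include <assert.h>
#include <string.h>

enum chunk_kind { CHUNK_OUTPUT, CHUNK_MESSAGE };

struct chunk {
    enum chunk_kind kind;
    char text[128];
};

#define MAX_CHUNKS 32
static struct chunk log_chunks[MAX_CHUNKS];
static int n_chunks = 0;

/* Append one chunk to the single ordered log, tagged with its route. */
static void record(enum chunk_kind kind, const char *buf, int len)
{
    struct chunk *c;
    size_t n;

    if (n_chunks >= MAX_CHUNKS || len < 0) return;  /* sketch: drop on overflow */
    c = &log_chunks[n_chunks++];
    n = (size_t)len < sizeof c->text - 1 ? (size_t)len : sizeof c->text - 1;
    memcpy(c->text, buf, n);
    c->text[n] = '\0';
    c->kind = kind;
}

/* Both callbacks write to the same log; only the tag differs. */
static void gui_write(char *buf, int len)     { record(CHUNK_OUTPUT, buf, len); }
static void gui_write_err(char *buf, int len) { record(CHUNK_MESSAGE, buf, len); }

/* What a highlighter or context-help feature would iterate over. */
static int count_messages(void)
{
    int i, n = 0;
    for (i = 0; i < n_chunks; i++)
        if (log_chunks[i].kind == CHUNK_MESSAGE) n++;
    return n;
}

static int demo_tagged_log(void)
{
    char a[] = "[1] 1\n";
    char w[] = "Warning message:\n";
    char b[] = "[1] 2\n";

    n_chunks = 0;
    gui_write(a, (int)strlen(a));
    gui_write_err(w, (int)strlen(w));
    gui_write(b, (int)strlen(b));

    /* Arrival order is preserved and the message chunk is identifiable. */
    assert(n_chunks == 3);
    assert(log_chunks[0].kind == CHUNK_OUTPUT);
    assert(log_chunks[1].kind == CHUNK_MESSAGE);
    assert(strcmp(log_chunks[1].text, "Warning message:\n") == 0);
    assert(log_chunks[2].kind == CHUNK_OUTPUT);
    return 0;
}
```

Since both callbacks are invoked from the same thread of execution in this
model, no reordering can occur, which is exactly what is hard to guarantee
with two separate ptys.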

And again: Why can't I just use current facilities?
1) Condition handling: Probably the way to go for my advanced needs. But it
is totally useless if I want to catch messages generated by direct calls to
REprintf/REvprintf. Those are abundant. They have to be dealt with. The only
way to do this is to use a mechanism available in/below/after REvprintf.
2) Ptys, sinks: See above
3) Grepping: Come on now. It could not even solve most needs, and it is
impossible due to internationalization...
4) Using mind reading? When I first posted about these matters to r-devel (and 
yes, before that I tried my luck on r-help), it was a plain support question: 
How can I do this? See also here:
https://stat.ethz.ch/pipermail/r-devel/2005-October/034975.html
I did not receive any reply on this. What should I conclude?

I'll gladly accept alternative solutions. The ones above do not qualify,
though, and I have explained why before.

> I have spent far longer (and written more) than I intended on this.  The
> length of correspondence so far (and much in a prolix style) is all part
> of the support costs.  One thing the R project can not afford is to
> explain to individual users how internals work -- we have not even been
> able to find the time to write down for the core team how a lot of the
> internals work, and some developments are being held up by that.  So this
> has to weighed against considering proposals which would appear to help
> just one user.

Sorry about writing more and more lengthy mails. I don't really want to. I 
have better things to do as well. But this is important, and - sorry to say - 
IMO you've simply overlooked a number of points. All I can do is restate 
them, trying to make extra sure to get my point across.

> I suspect that we will only want to go forward if a concise and strong
> case can be made from a group of developers who have tested it out on all
> three main families of platforms.

And where do you think I could conjure up such a group of developers, if not
on this list?

Regards
Thomas Friedrichsmeier