Interrupts (was Re: [Rd] X11 protocol errors ...)

Thu, 23 Aug 2001 13:37:16 -0400 (EDT)

On Thu, 23 Aug 2001, Luke Tierney wrote:

> I'm talking about something related but different: controlling the
> point at which an asynchronous signal is brought into the system (and
> turned into an exception if we have a proper exception system.)  R
> currently has on.exit, and Robert Gentleman and I proposed a more
> structured exception mechanism for possible addition to R in the near
> future.
> 
> [I sent a posting about the proposed mechanism a while back.  So far
> we have received little feedback, so here is another request: Please
> have a look at http://www.stat.umn.edu/~luke/R/exceptions/simpcond.html
> and let us know if you have any comments/suggestions]
> 
> But that is not the issue here.  The issue is whether we allow a
> SIGINT signal in UNIX (and whatever its analog is on other systems) to
> interrupt the current calculation immediately, no matter where it
> might be, or whether we impose more structure.  Windows/Mac pretty
> much force more structure at the C level, since their analogs have to
> arrive through mechanisms that require explicit polling.  So on Windows
> you know that an expression like
> 
> 	x = malloc(n)
> 
> will not get interrupted between the malloc call and the assignment to
> x (unless some very low level tricks are involved).  On UNIX, the
> signal can arrive in between those two operations.

That sounds like two problems. The first is how to make sure that the
allocation and the assignment happen as an atomic operation (that's not
difficult: we just throw a critical section around it). The second is what
to do, in the event of an interrupt, with things like memory that has
already been allocated (which was Duncan's point in the earlier message
that got snipped). Unfortunately, I think the transactional (is that a
word?) database people may be the only ones with a good handle on that
particular problem, and that's expensive... Though I suppose if you're
wrapping everything in an environment, an interrupt could just ensure the
destruction of the environment, but that doesn't handle global variables.
You'd need some sort of stack on each variable that kept weak references
to previous values (since R doesn't usually access things by reference
unless you force it, right?) until they're garbage collected, at which
point their weak references would also be removed from the stack: "Undo
Capability, Limited Time Only!" :-)
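
To make the "throw a critical section around it" part concrete, here is a
rough sketch for the UNIX case using POSIX signal masks. It is only an
illustration, not anything that exists in R: the names block_sigint,
restore_signals, alloc_uninterrupted and demo_masked_alloc are made up
here, and I'm assuming a single-threaded process. A SIGINT sent while the
mask is in place stays pending and is delivered once the old mask is
restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Set by the handler; everything else only reads it. */
static volatile sig_atomic_t interrupt_seen = 0;

static void on_sigint(int signo)
{
    (void)signo;
    interrupt_seen = 1;
}

/* Hypothetical helper: block SIGINT and remember the previous mask. */
int block_sigint(sigset_t *saved)
{
    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    return sigprocmask(SIG_BLOCK, &block, saved);
}

/* Hypothetical helper: restore the mask; a pending SIGINT is delivered now. */
int restore_signals(const sigset_t *saved)
{
    return sigprocmask(SIG_SETMASK, saved, NULL);
}

/* The "x = malloc(n)" case: the handler cannot run between the call
   and the assignment. Returns 0 on success, -1 on failure. */
int alloc_uninterrupted(void **slot, size_t n)
{
    sigset_t saved;
    if (block_sigint(&saved) != 0)
        return -1;
    *slot = malloc(n);
    if (restore_signals(&saved) != 0)
        return -1;
    return *slot != NULL ? 0 : -1;
}

/* Self-check: a SIGINT raised inside the section is deferred, not lost. */
int demo_masked_alloc(void)
{
    struct sigaction sa;
    sigset_t saved, pending;
    void *x;
    int ok;

    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;

    interrupt_seen = 0;
    if (block_sigint(&saved) != 0)
        return -1;
    raise(SIGINT);                      /* simulated Control-C */
    x = malloc(16);
    assert(interrupt_seen == 0);        /* handler has not run yet */
    sigpending(&pending);
    assert(sigismember(&pending, SIGINT) == 1);
    if (restore_signals(&saved) != 0)
        return -1;
    assert(interrupt_seen == 1);        /* delivered when unblocked */

    ok = (x != NULL);
    free(x);
    return ok ? 0 : -1;
}
```

Note that this only covers the first problem. If the handler did a longjmp
after the mask was restored, the memory held in x would still need someone
to clean it up, which is the second (and harder) problem.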

> 
> The safe thing to do on UNIX is to have the signal handler just set a
> flag which is then checked at appropriate points.  This is the
> approach that John Eaton mentioned, and is used by most Scheme systems
> I've looked at.  I suspect Python and Perl do this as well, but I'll
> have to check.  This is also the way Java handles thread interrupts.
> It would make the UNIX behavior identical to the Windows behavior.
> 
> The drawback for systems like R and Octave is that we rely on being
> able to use chunks of C/Fortran that can potentially run for a long
> time (forever if they happen to get into infinite loops occasionally)
> and where it is either impractical or impossible to insert flag
> checking code.  For those situations it is nice to be able to use a
> signal handler to force a jump out of that code.  We live without this
> ability on Windows/Mac, and don't do too badly there, but it would be
> nice not to completely lose this facility on UNIX. Most numerical code
> tends to not behave too badly when exited by a longjmp, but there are
> no guarantees.  For example, if a piece of C code does something like this:
> 
> 	static int inited = FALSE;
> 	if (! inited) {
> 	    inited = TRUE;
> 	    ... initialize a table needed for computations ...
> 	}
> 	... use the table ...
> 
> and a Control-C arrives in the first call after inited=TRUE is executed
> but before the table is fully initialized, then future calls to this
> function will happily return nonsense.

Step 1. Have anyone who uses statics burned at the stake for not writing
threadsafe code ;-) 
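
For those who would rather avoid the stake, the hazard in that particular
example can at least be reduced by setting the flag only after the table is
complete. The sketch below is an assumption-laden illustration: the names
(table_ready, lookup_square, fail_at, escape) are invented, and the
"interrupt" is simulated with an ordinary longjmp from inside the
initializer so that the behaviour is deterministic. With a genuinely
asynchronous signal the compiler is free to reorder stores, so this shows
the ordering idea only and is not a guarantee.

```c
#include <assert.h>
#include <setjmp.h>

#define TABLE_SIZE 8

static int table[TABLE_SIZE];
static int table_ready = 0;   /* set only once the table is complete */

/* Stands in for the interpreter's top-level jump target. */
static jmp_buf escape;

/* Hypothetical test hook: index at which a simulated Control-C fires. */
static int fail_at = -1;

static void init_table(void)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (i == fail_at)
            longjmp(escape, 1);       /* simulated interrupt mid-init */
        table[i] = i * i;
    }
}

int lookup_square(int i)
{
    if (!table_ready) {
        init_table();
        table_ready = 1;              /* publish after init has finished */
    }
    return table[i];
}

/* Self-check: an abandoned initialization is redone on the next call. */
int demo_interrupted_init(void)
{
    fail_at = 3;
    if (setjmp(escape) == 0) {
        (void)lookup_square(5);
        return -1;                    /* not reached: the jump always fires */
    }
    assert(table_ready == 0);         /* flag was never set */

    fail_at = -1;                     /* no interrupt this time */
    assert(lookup_square(5) == 25);   /* table is rebuilt from scratch */
    assert(table_ready == 1);
    return 0;
}
```

With the original ordering (flag first, table second) the second call would
skip the initialization and return whatever happened to be in the table.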

> 
> One option would be to tag routines at library registration time as
> safe for LONGJMP's or not.  That way we can disable LONGJMP
> interrupts everywhere except in explicitly marked .C or .Fortran calls
> (and blocking IO operations). This will ensure that no internal R
> state gets messed up by asynchronous signals that arrive at an
> inopportune time.

I think you'll want finer-grained control as well. My guess would be to
have external calls execute within a critical section (like Java, if my
understanding is correct) unless explicitly marked otherwise, but still
allow a critical section to be entered (and left) within the code block.
An example would be something like reading from a URL: I may spend some
time blocked waiting for a connection, and there you would want the
ability to break out without having to wait for a timeout. But if, for
some reason, the transfer must be allowed to complete once it has started,
I would want to engage the critical section later in the function.

> 
> But this only addresses the C level.  On Windows/Mac, the place where
> a user break is turned into an R exception is (mainly) in the internal
> eval, where every 1000 calculations (or some such number) the flag is
> checked and a jump is done if the flag is set.  UNIX would work the
> same way.  Since the internals know exactly where this jump can occur,
> unlike jumps out of a signal handler, they can make sure all internal
> state is consistent before checking the flag.
> 
> From the R level things look different: the 1000'th eval step can
> happen anywhere, so a piece of R code that does
> 
> 	file <- file(file, "w")
>         on.exit(close(file))
>         ... do something with file ...
> 
> has a race condition: an interrupt that arrives between the creation
> of the file and the registration of the on.exit handler will leave
> the file open.  Something along the lines of
> 
> 	without.interrupts({
>             file <- file(file, "w")
>             on.exit(close(file))
>             with.interrupts(... do something with file ...)
>         })
> 
> would be safe but is too awkward in this form. [Using a structured
> exception handling mechanism, some sort of try/finally construct,
> would make this code cleaner but would not resolve the race
> condition.]
> 
> There are no easy solutions I think, but we need to look at a range of
> options and see what works best.
> 
> [Threads add the additional problem that an interrupted thread might
> be holding a lock, and failure to release the lock could cause
> deadlock.  Using a structured exception handling mechanism to manage
> lock release helps, but race conditions are still potentially an issue
> with asynchronous interrupts.]

I'm just sort of pulling stuff out of thin air and I don't expect this
stuff to be easy to implement, but here goes: :-)

Say we have an ideal world where R executes in a bytecode VM like Java or
something else (I've noticed that this idea pops up every now and then for
performance reasons). In that case, why not take it one step further and
have the R environment actually be something of a lightweight operating
system that manages each of R's user-level threads as a distinct process?
Such a thing doesn't have to be bloaty, and the VM's nature means it can be
fairly abstracted: no need for 'device drivers' in the traditional sense
and whatnot. The 'OS' then handles the context switching and preemption
that we'll need anyway, but it also handles interrupts by trapping them
from the real operating system (using signal handlers or the particular
OS's analogue). The interrupt handler would then be able to forcibly shut
down whatever shared resources and memory are allocated to the 'process',
in the same way it happens in a real OS. This would obviously require that
the I/O system be abstracted away in internal and C calls, but I think
we've already got a good start on that with the connection mechanism, and
we want it for other reasons as well. My thought is that this sort of setup
would also change the flavour of the native threading, since it actually
becomes more analogous to developing an SMP operating system, which people
already know how to do. You would want to keep the 'OS' bits to an absolute
minimum, though: more like an RTOS than a UNIX or something like that
(read: primitive :-)).

> 
> luke
> 
> -- 
> Luke Tierney
> University of Minnesota                      Phone:           612-625-7843
> School of Statistics                         Fax:             612-624-8868
> 313 Ford Hall, 224 Church St. S.E.           email:      luke@stat.umn.edu
> Minneapolis, MN 55455 USA                    WWW:  http://www.stat.umn.edu
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 

Byron Ellis (bellis@hsph.harvard.edu)
"Oook" - The Librarian
