[Rd] parse( connection) and source-keeping

Duncan Murdoch murdoch.duncan at gmail.com
Thu Jan 12 14:57:00 CET 2012

On 11/01/2012 8:36 PM, Duncan Murdoch wrote:
> On 12-01-11 3:54 PM, Mark.Bravington at csiro.au wrote:
> >  In R <= 2.13.x, calling 'parse( con)' where 'con' is a connection, 'options( keep.source)' is TRUE, and 'srcfile' is left at its default would preserve the source. In R >= 2.14.1, it doesn't.
> Actually, it preserved the "source" attribute of the function if it
> could, but didn't add a srcref.  Sometimes it would fail, giving a
> message like
> Error in parse(textConnection(texto)) :
>     function is too long to keep source (at line 8812)
> >
> >>  tf<- tempfile()
> >>  options( keep.source=TRUE)
> >>  texto<- c( 'function() { # comment', '}')
> >>  parse( text=texto)
> >  expression(function() { # comment
> >  })
> >>  cat( texto, file=tf, sep='\n')
> >>  parse( file=tf)
> >  expression(function() { # comment
> >  })
> >>  parse( file( tf))
> >  expression(function() {
> >  })
> >>  parse( textConnection( texto))
> >  expression(function() {
> >  })
> >
> >  and yes I didn't bother closing any connections.
> >
> >  My suspicion is that this change is unintentional, and it seems to me that the best option would be for 'connection' to work like 'text' does here, i.e. to attach a 'srcfilecopy' containing the contents.
> Yes, that does sound like a good idea.

I've taken a look, and this doesn't look like something I'll fix for 
2.15.0.  Here's why:

The entry points to the parser are really quite a mess, and need to be 
cleaned up:  working around this problem without that cleanup would make 
them messier, and I don't have time for the cleanup before 2.15.0.

Part of the problem is that connections are so flexible:  the parser 
doesn't know whether the connection passed to it is at the beginning, or 
whether you've already read some lines from it; it might not even have a 
beginning (e.g. stdin()).
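
To illustrate the point about position, here is a small sketch (the variable names are made up for the example). Once some lines have been read from a connection, parse() only sees what is left, and nothing on the connection itself tells the parser how much came before:

```r
# A text connection holding three one-line expressions.
con <- textConnection(c("x <- 1", "y <- 2", "z <- 3"))

# Consume the first line before handing the connection to the parser.
first <- readLines(con, n = 1)

# parse() starts from the current position, so it only sees the
# remaining two lines.
exprs <- parse(con)
close(con)

length(exprs)
```

Any source copy that parse() built by itself at this point would be missing the first line, so line numbers in the srcrefs would not match the original text.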

There is a relatively easy workaround if you really need this:   you can 
make the srcfilecopy yourself, and pass it as the "srcfile" argument to 
parse.  (This won't work on the stdin() connection, but if you're the 
one creating the connection, you can perhaps work around that.  By the 
time parse() is called, it's too late.)
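
A minimal sketch of that workaround, reusing 'tf' and 'texto' from the example quoted above. It assumes you can get at the full text yourself (here by reading the file) and that the connection passed to parse() is at its beginning, so the line numbers in the srcrefs match the copy. The names 'src_lines', 'src' and 'con' are only for illustration:

```r
options(keep.source = TRUE)

texto <- c("function() { # comment", "}")
tf <- tempfile()
cat(texto, file = tf, sep = "\n")

# Read the text ourselves and wrap it in a srcfilecopy.
src_lines <- readLines(tf)
src <- srcfilecopy(tf, src_lines)

# Hand parse() both the connection and the prepared srcfile.
con <- file(tf)
expr <- parse(con, srcfile = src)
close(con)

# The srcref now points into 'src', so the comment is kept.
as.character(attr(expr, "srcref")[[1]])
```

The same approach should work with textConnection(texto) in place of file(tf), since the source text is supplied separately through 'srcfile'.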

The parser does manage to handle input coming from the console, because 
that case uses a different entry point to the parser, and it keeps a 
buffer of all input.  (It needs to do this because you might not be 
finished typing yet, and it will start over again when you enter the 
next line.)  So connections (even stdin()) could be handled in the same 
way, and when I do the big cleanup, that's probably what will happen.

If you have a particular use case in mind and the workaround above isn't 
sufficient, let me know.

Duncan Murdoch
