[Rd] Recursively parsing srcrefs

Duncan Murdoch murdoch.duncan at gmail.com
Fri May 13 01:50:31 CEST 2011


On 12/05/2011 3:59 PM, Hadley Wickham wrote:
>>> Is it possible to "recursively" parse srcrefs to match the recursive
>>> structure of the underlying code?  I'm interested in this because it's
>>
>> I don't understand what you mean by that.  It is certainly possible to walk
>> through nested srcrefs, to zoom in on a particular location; that's what
>> findLineNum() does.
>
> Does the example below not help?  Given the whole function, I want to
> be able to walk down the call tree, finding the matching src refs as I
> go.  i.e. given f, how do I get f_inside?
>
> f<- function(x = T) {
>    # This is a comment
>    if (x)                  return(4)
>    if (emergency_status()) return(T)
> }
>
> f_inside<- parse(text = "
>    # This is a comment
>    if (x)                  return(4)
>    if (emergency_status()) return(T)
> ")

I don't think you will get exactly the same thing.  The problem is that 
srcrefs are attributes, and not all statements can have attributes 
attached to them, so the attributes are attached to the container of the 
statements.  (For example, NULL is a statement and it is stored as the 
NULL object, but the NULL object can't have any attributes on it.)  In 
your two cases the containers are different (the braced body of f 
versus the expression vector returned by parse()), so I would expect 
some differences between what's in f and what's in f_inside (though I 
expect you could convert one to the other).
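
To make the "container" point concrete, here is a minimal sketch.  It 
assumes the code is parsed with keep.source = TRUE; the function text is 
an illustrative variant of the example above, not the original f.

```r
# Parse a function definition from text, keeping source references.
src <- "function(x = TRUE) {
  # This is a comment
  if (x) return(4)
  NULL
}"
exprs <- parse(text = src, keep.source = TRUE)
f <- eval(exprs[[1]])

# Container 1: the expression vector from parse() carries one srcref
# per top-level expression.
top_refs <- attr(exprs, "srcref")

# Container 2: the `{` call that forms the body carries a list of
# srcrefs, one for the brace itself plus one per statement.
body_refs <- attr(body(f), "srcref")
if_text <- as.character(body_refs[[2]])

# The NULL statement itself has no attributes; its srcref lives only
# in body_refs.
null_attrs <- attributes(body(f)[[3]])
```

Here if_text holds the source text of the first statement, while the 
statement object itself stays attribute-free.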


>
> findLineNum doesn't quite do what I want - it works on the text of the
> srcref, not on the parse tree.

It searches through the parse tree for the smallest source ref that 
contains a given line.  So for example,

if(condition) {
   blah
   blah
   blah
}

is a single statement, and there will be a srcref stored in its 
container that goes from line N to line N+4.  But it also contains the 
compound statement

{
   blah
   blah
   blah
}

and there will be srcrefs attached to that for each of the statements in 
it.  (I forget right now whether there are 3 or 4 statements there:  R 
treats braces in a funny way, and I'd have to look at an example to 
check.)  Each "blah" will get a srcref spanning one line, and it will 
be stored in the container.
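
The nesting described above can be inspected directly.  This is only a 
sketch, assuming keep.source = TRUE and using blah1, blah2 and blah3 as 
placeholder statements; the code is parsed but never evaluated.

```r
src <- "if (condition) {
  blah1
  blah2
  blah3
}"
exprs <- parse(text = src, keep.source = TRUE)

# The whole if statement: its srcref is stored on the expression vector.
top_ref <- attr(exprs, "srcref")[[1]]
span <- as.integer(top_ref)[c(1, 3)]   # first line, last line

# Walk one level down: element 3 of the `if` call is the braced block,
# and the block carries srcrefs for its own contents.
block <- exprs[[1]][[3]]
block_refs <- attr(block, "srcref")

# Drop the entry for the brace itself and read back each statement.
stmt_text <- vapply(block_refs[-1], as.character, character(1))
```

Checking length(block_refs) on an example like this is one way to settle 
the "3 or 4" question for a given version of R.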

>
> Here's another go at explaining what I want:
>
> h<- quote(
>    1 # one
>    + # plus
>    2 # two
> )
>
> h[[1]] extracts +.  What can I do to extract "+ # plus" (on an object
> created in the appropriate manner to keep the srcref)?  Is that even
> possible?

You can't easily do that.  The current parser only attaches srcrefs down 
to the statement level, and the + is just one element of the call 
1 + 2, not a statement of its own.  (Inside the parentheses of quote() 
the newlines are ignored, so all three lines form a single expression.)
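
A hedged sketch of how one might check this.  It assumes a version of R 
that provides utils::getParseData(), which is newer than this thread, so 
treat its availability as an assumption rather than part of the answer 
above.

```r
src <- "quote(
  1 # one
  + # plus
  2 # two
)"
exprs <- parse(text = src, keep.source = TRUE)
h <- eval(exprs[[1]])

# Newlines inside the parentheses do not end the expression, so h is
# the single call 1 + 2 and h[[1]] is the symbol `+`.
op <- h[[1]]

# The quoted call itself carries no srcref.
h_ref <- attr(h, "srcref")

# Token-level positions, including comments, are recorded separately
# in the parse data kept with the source file.
pd <- utils::getParseData(exprs)
line3 <- pd[pd$line1 == 3, c("line1", "col1", "token", "text")]
```

The rows in line3 give the positions of the + and of its comment, which 
is the raw material for recovering "+ # plus" by hand.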

Duncan Murdoch

>
> My eventual goal is something like
>
> f<- function(x) {
>    # This is my function
>    T
> }
>
> g<- fix_logical_abbreviations(f)
>
> which would be equivalent to
>
> g<- function(x) {
>    # This is my function
>    TRUE
> }
>
>
>> That last display looks like a bug indeed.  I'll take a look.
>
> The key seems to be a leading newline:
>
> parse(text = "\nx")
> parse(text = "x")
>
> Hadley
>
>


