[Rd] bug in partial matching of attribute names

Tony Plate tplate at blackmesacapital.com
Wed Feb 14 16:57:01 CET 2007


Prof Brian Ripley wrote:
> On Tue, 13 Feb 2007, Tony Plate wrote:
> 
>> Ok, thanks for the news of a fix in progress.
> 
> 
> BTW, your suggested fix is incorrect.  Consider having an exact match 
> after two partial matches in the list of attributes.
oops.  yes.

> 
>> On the topic of the "names" attribute being treated specially, I 
>> wonder if the do_attr() function might treat it a little too 
>> specially.  As far as I can tell, the loop in the first large block 
>> code in do_attr() (attrib.c), which begins
>>
>>    /* try to find a match among the attributes list */
>>    for (alist = ATTRIB(s); alist != R_NilValue; alist = CDR(alist)) {
>>
>> will find a full or partial match for a "names" attribute (at least 
>> for ordinary lists and vectors).
>>
>> Then the large block of code after that, beginning:
>>
>>    /* unless a full match has been found, check for a "names" 
>> attribute */
>>    if (match != FULL && ! strncmp(CHAR(PRINTNAME(R_NamesSymbol)), str, 
>> n)) {
>>
>> seems unnecessary because a names attribute has already been checked 
>> for. In the case of a partial match on the "names" attribute this code 
>> will behave as though there is an ambiguous partial match, and 
>> (incorrectly) return Nil.  Is this second block of code specific to 
>> the "names" attribute possibly a hangover from an earlier day when the 
>> first loop didn't detect a "names" attribute?  Or am I missing 
>> something?  Are there some other objects for which the first loop 
>> doesn't include a "names" attribute?
> 
> 
> Yes: I pointed you at the 'R internals' manual, but this is also on the 
> help page.  1D arrays and pairlists have names stored elsewhere.  It 
> needs to be changed to be
> 
> -       else if (match == PARTIAL) {
> +       else if (match == PARTIAL && strcmp(CHAR(PRINTNAME(tag)), 
> "names")) {

Wouldn't it be equivalent (but clearer & a minute fraction more 
computationally efficient) to put that test 
'strcmp(CHAR(PRINTNAME(tag)), "names")' right at the start of that 
block, i.e., as an additional condition in

     if (match != FULL && ! strncmp(CHAR(PRINTNAME(R_NamesSymbol)), str, 
n)) {

?

-- Tony Plate

> 
> 
>> -- Tony Plate
>>
>> Prof Brian Ripley wrote:
>>
>>> It happens that I was looking at this yesterday (in connection with 
>>> encodings on CHARSXPs) and have a fix in testing across CRAN right now.
>>>
>>> As for "names", as you will know from reading 'R Internals' the names 
>>> can be stored in more than one place, which is why it has to be 
>>> treated specially.
>>>
>>> On Mon, 12 Feb 2007, Tony Plate wrote:
>>>
>>>> There looks to be a bug in do_attr() (src/main/attrib.c): incorrect
>>>> partial matches of attribute names can be returned when there are an 
>>>> odd
>>>> number of partial matches.
>>>>
>>>> E.g.:
>>>>
>>>> > x <- c(a=1,b=2)
>>>> > attr(x, "abcdef") <- 99
>>>> > attr(x, "ab")
>>>> [1] 99
>>>> > attr(x, "abc") <- 100
>>>> > attr(x, "ab") # correctly returns NULL because of ambig partial match
>>>> NULL
>>>> > attr(x, "abcd") <- 101
>>>> > attr(x, "ab") # incorrectly returns non-NULL for ambig partial match
>>>> [1] 101
>>>> > names(attributes(x))
>>>> [1] "names"  "abcdef" "abc"    "abcd"
>>>> >
>>>>
>>>> The problem in do_attr() looks to be that after match is set to
>>>> PARTIAL2, it can be set back to PARTIAL again.  I think a simple fix is
>>>> to add a "break" in this block in do_attr():
>>>>
>>>>     else if (match == PARTIAL) {
>>>>     /* this match is partial and we already have a partial match,
>>>>        so the query is ambiguous and we return R_NilValue */
>>>>     match = PARTIAL2;
>>>>     break; /* <---- ADD BREAK HERE */
>>>>     } else {
>>>>
>>>> However, if this is indeed a bug, would this be a good opportunity to
>>>> get rid of partial matching on attribute names -- it was broken anyway
>>>> -- so toss it out? :-)  Does anyone depend on partial matching for
>>>> attribute names?  My view is that it's one of those things like partial
>>>> matching of list and vector element names that seemed like a good idea
>>>> at first, but turns out to be more trouble than it's worth.
>>>>
>>>> On a related topic, partial matching does not seem to work for the
>>>> "names" attribute (which I would regard as a good thing :-).  However,
>>>> I'm puzzled why it doesn't work, because the code in do_attr() seems to
>>>> try hard to make it work.  Can anybody explain why?
>>>>
>>>> E.g.:
>>>> > attr(x, "names")
>>>> [1] "a" "b"
>>>> > attr(x, "nam")
>>>> NULL
>>>>
>>>> > sessionInfo()
>>>> R version 2.4.1 (2006-12-18)
>>>> i386-pc-mingw32
>>>>
>>>> locale:
>>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>>> States.1252;LC_MONETARY=English_United
>>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>
>>>> attached base packages:
>>>> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  
>>>> "methods"
>>>> [7] "base"
>>>> >
>>>>
>>>> -- Tony Plate
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>
>



More information about the R-devel mailing list