[Rd] bug in partial matching of attribute names

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Feb 14 07:16:10 CET 2007


On Tue, 13 Feb 2007, Tony Plate wrote:

> Ok, thanks for the news of a fix in progress.

BTW, your suggested fix is incorrect.  Consider having an exact match 
after two partial matches in the list of attributes.

> On the topic of the "names" attribute being treated specially, I wonder if 
> the do_attr() function might treat it a little too specially.  As far as I 
> can tell, the loop in the first large block code in do_attr() (attrib.c), 
> which begins
>
>    /* try to find a match among the attributes list */
>    for (alist = ATTRIB(s); alist != R_NilValue; alist = CDR(alist)) {
>
> will find a full or partial match for a "names" attribute (at least for 
> ordinary lists and vectors).
>
> Then the large block of code after that, beginning:
>
>    /* unless a full match has been found, check for a "names" attribute */
>    if (match != FULL && ! strncmp(CHAR(PRINTNAME(R_NamesSymbol)), str, n)) 
> {
>
> seems unnecessary because a names attribute has already been checked for. 
> In the case of a partial match on the "names" attribute this code will 
> behave as though there is an ambiguous partial match, and (incorrectly) 
> return Nil.  Is this second block of code specific to the "names" attribute 
> possibly a hangover from an earlier day when the first loop didn't detect a 
> "names" attribute?  Or am I missing something?  Are there some other objects 
> for which the first loop doesn't include a "names" attribute?

Yes: I pointed you at the 'R internals' manual, but this is also on the 
help page.  1D arrays and pairlists have names stored elsewhere.  It needs 
to be changed to be

-       else if (match == PARTIAL) {
+       else if (match == PARTIAL && strcmp(CHAR(PRINTNAME(tag)), "names")) {


> -- Tony Plate
>
> Prof Brian Ripley wrote:
>> It happens that I was looking at this yesterday (in connection with 
>> encodings on CHARSXPs) and have a fix in testing across CRAN right now.
>> 
>> As for "names", as you will know from reading 'R Internals' the names can 
>> be stored in more than one place, which is why it has to be treated 
>> specially.
>> 
>> On Mon, 12 Feb 2007, Tony Plate wrote:
>> 
>>> There looks to be a bug in do_attr() (src/main/attrib.c): incorrect
>>> partial matches of attribute names can be returned when there are an odd
>>> number of partial matches.
>>> 
>>> E.g.:
>>> 
>>> > x <- c(a=1,b=2)
>>> > attr(x, "abcdef") <- 99
>>> > attr(x, "ab")
>>> [1] 99
>>> > attr(x, "abc") <- 100
>>> > attr(x, "ab") # correctly returns NULL because of ambig partial match
>>> NULL
>>> > attr(x, "abcd") <- 101
>>> > attr(x, "ab") # incorrectly returns non-NULL for ambig partial match
>>> [1] 101
>>> > names(attributes(x))
>>> [1] "names"  "abcdef" "abc"    "abcd"
>>> >
>>> 
>>> The problem in do_attr() looks to be that after match is set to
>>> PARTIAL2, it can be set back to PARTIAL again.  I think a simple fix is
>>> to add a "break" in this block in do_attr():
>>>
>>>     else if (match == PARTIAL) {
>>>     /* this match is partial and we already have a partial match,
>>>        so the query is ambiguous and we return R_NilValue */
>>>     match = PARTIAL2;
>>>     break; /* <---- ADD BREAK HERE */
>>>     } else {
>>> 
>>> However, if this is indeed a bug, would this be a good opportunity to
>>> get rid of partial matching on attribute names -- it was broken anyway
>>> -- so toss it out? :-)  Does anyone depend on partial matching for
>>> attribute names?  My view is that it's one of those things like partial
>>> matching of list and vector element names that seemed like a good idea
>>> at first, but turns out to be more trouble than it's worth.
>>> 
>>> On a related topic, partial matching does not seem to work for the
>>> "names" attribute (which I would regard as a good thing :-).  However,
>>> I'm puzzled why it doesn't work, because the code in do_attr() seems to
>>> try hard to make it work.  Can anybody explain why?
>>> 
>>> E.g.:
>>> > attr(x, "names")
>>> [1] "a" "b"
>>> > attr(x, "nam")
>>> NULL
>>> 
>>> > sessionInfo()
>>> R version 2.4.1 (2006-12-18)
>>> i386-pc-mingw32
>>> 
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>> States.1252;LC_MONETARY=English_United
>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>> 
>>> attached base packages:
>>> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
>>> [7] "base"
>>> >
>>> 
>>> -- Tony Plate
>>> 
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-devel mailing list