[Rd] citEntry handling of encoded URLs

Achim Zeileis Achim.Zeileis at uibk.ac.at
Sat May 24 10:30:59 CEST 2014


On Fri, 23 May 2014, Duncan Murdoch wrote:

> On 23/05/2014 8:35 AM, Achim Zeileis wrote:
>> On Thu, 22 May 2014, Martin Morgan wrote:
>> 
>> > The following citEntry includes a url with %3A and other encodings
>> >
>> > citEntry(entry="article",
>> >         title = "Software for Computing and Annotating Genomic Ranges",
>> >         author = personList( as.person("Michael Lawrence" )),
>> >         year = 2013,
>> >         journal = "{PLoS} Computational Biology",
>> >         volume = "9",
>> >         issue = "8",
>> >         doi = "10.1371/journal.pcbi.1003118",
>> >         url =
>> >         "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",
>> >         textVersion = "Lawrence M..." )
>> >
>> > Evaluating this as R code doesn't parse correctly and generates a warning
>> 
>> The citEntry (or bibentry) itself is parsed without problem. Some printing
>> styles cause the warning, specifically when the Rd parser is used for
>> formatting. Depending on how you want to print it, the warning doesn't
>> occur though. Using bibentry() directly, we can do:
>> 
>> b <- bibentry("Article",
>>     title = "Software for Computing and Annotating Genomic Ranges",
>>     author = "Michael Lawrence and others",
>>     year = "2013",
>>     journal = "PLoS Computational Biology",
>>     volume = "9",
>>     number = "8",
>>     doi = "10.1371/journal.pcbi.1003118",
>>     url = 
>> "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",
>>     textVersion = "Lawrence M et al. (2013) ..."
>> )
>> 
>> Then the default
>> 
>> print(b)
>> 
>> issues a warning because the Rd parser treats each % as the start of a
>> comment.
>> However,
>> 
>> print(b, style = "BibTeX")
>> print(b, style = "citation")
>> 
>> don't issue warnings and also produce output that one might expect.
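To check this on a given R installation, here is a minimal sketch that counts the warnings raised while formatting the entry in each style. The helper count_warnings() is hypothetical (it is not part of utils), and the counts for the "text" style may differ between R versions, so no particular output is claimed here.

```r
# Sketch: count warnings raised while formatting a bibentry per style.
# count_warnings() is a hypothetical helper written for this illustration.
b <- bibentry("Article",
    title = "Software for Computing and Annotating Genomic Ranges",
    author = "Michael Lawrence and others",
    year = "2013",
    journal = "PLoS Computational Biology",
    volume = "9",
    number = "8",
    doi = "10.1371/journal.pcbi.1003118",
    url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",
    textVersion = "Lawrence M et al. (2013) ..."
)

count_warnings <- function(expr) {
    n <- 0L
    # expr is evaluated lazily, i.e., inside withCallingHandlers()
    withCallingHandlers(expr, warning = function(w) {
        n <<- n + 1L
        invokeRestart("muffleWarning")
    })
    n
}

styles <- c("text", "BibTeX", "citation")
warn_counts <- vapply(styles,
    function(s) count_warnings(format(b, style = s)), integer(1))
print(warn_counts)
```

format() is used instead of print() so that only the formatting step is exercised and nothing is written to the console inside the helper.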
>> 
>> > A work-around is, apparently, to quote the %, \\%3A etc., but is this the
>> > intention?
>> 
>> In that case the default print(b) yields the desired output without a
>> warning, but print(b, style = "BibTeX") and print(b, style = "citation")
>> then contain the escaped URL, which is possibly not the desired format.
>> I'm not sure though how the different
>> BibTeX style files actually handle the URLs. I think some .bst files
>> handle the "url" field verbatim (i.e., don't need escaping) while others
>> treat it as text (i.e., need escaping). Personally, I would hence avoid
>> the problem and only use the DOI URL here as this will be robust across
>> BibTeX styles.
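A minimal sketch of the DOI-only variant suggested here. The resolver prefix "https://doi.org/" and the object name b_doi are my assumptions for illustration; the point is only that the resulting URL contains no % and hence needs no escaping in any style.

```r
# Sketch: build the URL from the DOI so that no percent-encoding is needed.
# The resolver prefix and the name b_doi are assumptions for illustration.
doi <- "10.1371/journal.pcbi.1003118"

b_doi <- bibentry("Article",
    title = "Software for Computing and Annotating Genomic Ranges",
    author = "Michael Lawrence and others",
    year = "2013",
    journal = "PLoS Computational Biology",
    volume = "9",
    number = "8",
    doi = doi,
    url = paste0("https://doi.org/", doi)
)

# The URL contains no "%" that the Rd parser could take for a comment.
grepl("%", b_doi$url, fixed = TRUE)

print(b_doi)
print(b_doi, style = "BibTeX")
```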
>> 
>> Nevertheless it is not ideal that there is a discrepancy between the
>> different printing styles. I think currently this can only be avoided if
>> custom macros are employed. But Duncan might be able to say more about
>> this. A similar situation occurs if you use commands that are not part of
>> the Rd markup, e.g.
>
> I'd go further than "not ideal", I think we need to define what kind of 
> markup is permissible in this context.  If it needs to be Rd markup, 
> then the default print method should be fixed to hide it (and \mathcal 
> should not be allowed); if it needs to be plain text, then some escaping 
> should be done.

I would argue that any LaTeX-style markup should be permitted in bibentry
objects so that you can work with your BibTeX files in R. For the "text"
and "html" print output, I would be happy with some approximation for
unknown markup, e.g., omitting \mathcal and $. Only the small subset of
LaTeX-style commands that is also Rd markup would then be processed
properly.
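
To illustrate what I mean by "approximation", here is a rough sketch applied to the title of the \mathcal example quoted below. The function strip_unknown_markup() is hypothetical and only handles this one command without nested braces; it is not how print() currently works.

```r
# Sketch: crude plain-text approximation of unknown LaTeX markup.
# strip_unknown_markup() is a hypothetical helper, not part of utils/tools.
strip_unknown_markup <- function(x) {
    # Replace \mathcal{...} by its argument (nested braces are not handled).
    x <- gsub("\\\\mathcal\\{([^{}]*)\\}", "\\1", x, perl = TRUE)
    # Drop the math-mode dollar signs.
    gsub("$", "", x, fixed = TRUE)
}

title <- "The $\\mathcal{N}(0, 1)$ Distribution"
plain_title <- strip_unknown_markup(title)
print(plain_title)

# The BibTeX output would keep the original markup, while a text-like
# style could fall back to the approximated title.
n01_plain <- bibentry("Misc", title = plain_title,
    author = "Foo Bar", year = "2014")
print(n01_plain)
```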

>> n01 <- bibentry("Misc", title = "The $\\mathcal{N}(0, 1)$ Distribution",
>>     author = "Foo Bar", year = "2014")
>> print(n01) # warning
>> print(n01, style = "BibTeX") # ok
>> 
>> > Also, citEntry points to bibentry points to *Entry Fields*, but the
>> > 'url' tag is not mentioned there, even though url appears in the
>> > examples; if the list of supported tags is not easy to enumerate,
>> > perhaps some insight can be provided at this point as to how the
>> > supported tags are determined?
>> 
>> This follows the BibTeX conventions. Thus, you can use any tag that you
>> wish to use and it will depend on the style whether it is displayed or
>> not. The only restriction is that certain bibtypes require certain
>> fields, e.g., an "Article" has to specify: author, title, journal, year.
>> But beyond that you can add any additional field. For example, in your
>> bibentry above you used the "issue" field which is ignored by most BibTeX
>> styles. My adaptation uses the "number" field instead which is processed
>> by most standard BibTeX styles.
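Both points can be sketched briefly: required fields are checked when the entry is created, while additional fields are simply stored. The titles and names below are placeholders for illustration.

```r
# Required fields: an "Article" without "journal" is rejected.
res <- tryCatch(
    bibentry("Article", title = "A Title", author = "Foo Bar",
        year = "2014"),
    error = function(e) e)
inherits(res, "error")

# Additional fields such as "issue" are accepted and stored.
ok <- bibentry("Article", title = "A Title", author = "Foo Bar",
    year = "2014", journal = "A Journal", volume = "9", issue = "8")
ok$issue

# Whether "issue" is displayed depends on the style used for printing.
print(ok)
print(ok, style = "BibTeX")
```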
>> 
>> The default print(..., style = "text") uses a bibstyle that is modeled
>> after jss.bst, the BibTeX style employed by the Journal of Statistical
>> Software. But you could plug in other .bibstyle arguments, e.g. one that
>> processes the "issue" field etc.
>> 
>> Hope that helps,
>> Z
>> 
>> > Thanks
>> >
>> > Martin Morgan
>> > --
>> > Computational Biology / Fred Hutchinson Cancer Research Center
>> > 1100 Fairview Ave. N.
>> > PO Box 19024 Seattle, WA 98109
>> >
>> > Location: Arnold Building M1 B861
>> > Phone: (206) 667-2793
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>> 


