[Rd] Citation of R packages

Friedrich.Leisch at tuwien.ac.at
Fri Feb 10 19:36:28 CET 2006

>>>>> On Fri, 10 Feb 2006 21:01:44 +1100,
>>>>> John Maindonald (JM) wrote:


  > Where there is a published paper or a book (such as MASS), or a
  > manual for which a url can be given, my decision was to include
  > that in the main list of references, but not to include references
  > there that were references to the package itself, which as you
  > suggest below can be a reference to the concatenated help pages.

The CITATION file of a package may contain as many entries as the
author wants, including both a reference to the help pages and to the
book (or whatever).
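
As a minimal sketch of such a file (not taken from any real package), the
following shows two entries side by side: one for the reference manual and
one for a book. It assumes the bibentry() function from the utils package,
which current R versions provide; the package name "mypkg" and all names,
titles, and years are hypothetical placeholders.

```r
library(utils)

# Two entries as they might appear in a package's CITATION file.
# "mypkg", "Jane Doe", and "Example Press" are placeholders, not real references.
refs <- c(
  bibentry(
    bibtype = "Manual",
    title   = "mypkg: An Example Package",
    author  = person("Jane", "Doe"),
    year    = "2006",
    note    = "R package version 0.1-0"
  ),
  bibentry(
    bibtype   = "Book",
    title     = "An Example Book",
    author    = person("Jane", "Doe"),
    publisher = "Example Press",
    year      = "2005"
  )
)

# Plain-text and BibTeX renderings of both entries.
print(refs, style = "text")
bib <- toBibtex(refs)
print(bib)
```

In an installed package, the two bibentry() calls would sit in the CITATION
file itself, and citation("mypkg") would then list both references.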

  > It seemed anyway useful to have a separate list of packages.  For
  > consistency, these were always references to the package, with a
  > cross-reference to any relevant document in the references to papers.

  >>> (2) Maybe the author field should be more nuanced, or
  >>> maybe ...
  >> author fields of bibtex entries have a strict format (names separated
  >> by "and"), what do you mean by "more nuanced"?

  > Those named in the list of authors may be any combination of: the
  > authors of an R package, the authors of an original S version, the
  > person or persons responsible for an R port, the authors of the
  > Fortran code, compiler(s), and contributors of ideas.

  > For John Fox's car, citation() gives the following:
  >      author = {John Fox. I am grateful to Douglas Bates and David  
  > Firth and Michael Friendly and Gregor Gorjanc and Georges Monette and  
  > Henric Nilsson and Brian Ripley and Sanford Weisberg and and Achim  
  > Zeleis for various suggestions and contributions.},

  > For Rcmdr:
  >      author = {John Fox and with contributions from Michael Ash and  
  > Philippe Grosjean and Martin Maechler and Dan Putler and and Peter  
  > Wolf.},

  > For car, maybe John Fox should be identified as author.  For Rcmdr,  
  > maybe the other persons that are named should be added?

  > For leaps:
  >      author = {Thomas Lumley using Fortran code by Alan Miller},

  > It seems reasonable to cite Lumley and Miller as authors.  Should  
  > there be a note that identifies Miller as the contributor of the  
  > Fortran code?

  > Should the name(s) of porters (usually from S) be included as
  > author(s)?  Or should their contribution be acknowledged in the note
  > field?  Or ...

  > Possibilities are to cite all those individuals as author, or to cite
  > John Fox only, with any combination of no additional information in
  > the note field, or using the note field to explain who did what.  The
  > citation() function leaves it unclear who are to be acknowledged as
  > authors, and in fact

Umm, the problem there is not the citation() function, but that the
authors of all those packages obviously have not included a CITATION
file in their package which overrides the default (extracted from the
DESCRIPTION file).

E.g., package flexclust has DESCRIPTION

Package: flexclust
Version: 0.8-1
Date: 2006-01-11
Author: Friedrich Leisch, parts based on code by Evgenia Dimitriadou

but

R> citation("flexclust")

To cite package flexclust in publications use:

  Friedrich Leisch. A Toolbox for K-Centroids Cluster Analysis.
  Computational Statistics and Data Analysis, 2006. Accepted for
  publication.

A BibTeX entry for LaTeX users is

  @Article{,
    author = {Friedrich Leisch},
    title = {A Toolbox for K-Centroids Cluster Analysis},
    journal = {Computational Statistics and Data Analysis},
    year = {2006},
    note = {Accepted for publication},
  }

because the CITATION file overrides the DESCRIPTION file. Writing a
CITATION file is of course also intended for those cases where a
proper reference cannot be auto-generated from the DESCRIPTION file.
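
For a package whose Author field mixes names with acknowledgements, a
hand-written entry could look roughly like the sketch below. It keeps the
author field in strict BibTeX form and moves the "who did what" into the
note field. This again assumes bibentry() and person() from utils; the
package name "somepkg" and the names are hypothetical, and whether a
contributor belongs in the author field at all remains the package
author's decision.

```r
library(utils)

# Hypothetical entry: both people are listed as authors, and the note
# field records the nature of the second person's contribution.
entry <- bibentry(
  bibtype = "Manual",
  title   = "somepkg: An Example Package",
  author  = c(person("Jane", "Doe"), person("John", "Smith")),
  year    = "2006",
  note    = "R package version 1.0-0. Fortran code contributed by John Smith"
)

# In the BibTeX rendering, the author names are joined by "and".
bib <- toBibtex(entry)
print(bib)
```

Placed in the package's CITATION file, an entry like this replaces the
reference that would otherwise be auto-generated from DESCRIPTION.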

  >>> (3) In compiling a list of packages, name order seems
  >>> preferable, and one wants the title first (achieved by
  >>> relocating the format.title field in the manual FUNCTION
  >>> in the .bst file)
  >>> (4) manual seems not an ideal name for the class, if
  >>> there is no manual.
  >> A package always has a "reference manual", the concatenated help pages
  >> certainly qualify as such and can be downloaded in PDF format from
  >> CRAN. The ISBN rules even allow to assign an ISBN number to the online
  >> help of a software package which also can serve as the ISBN number of
  >> the *software itself* (which we did for base R).

  > I'd prefer some consistency in the way that R packages are referenced.
  > Thus, if reference for one package is to the concatenated help pages,
  > do it that way for all of them.

But we recommend that package authors (try to) get their work into
reviewed journals like JSS, JCGS, or CSDA, and then package authors
usually prefer that the article gets cited. Unfortunately, many
academic institutions value paper publications more highly than
software. Citing the help pages is mainly intended as a substitute if
no journal article is available.

