[Rd] R license for a derived data-only package

Simon Urbanek simon.urbanek at r-project.org
Fri Sep 16 16:50:17 CEST 2011

On Sep 16, 2011, at 10:32 AM, Michael Friendly wrote:

> I'm looking for guidance or advice about the R license to use in preparing a package containing the
> Baseball Database from http://baseball1.com/statistics/
> My main purpose is to make it available to students in a course, and to develop it with others.
> I'd like to put it on R-Forge, and then perhaps make it public on CRAN.
> However, the page above bears a very restrictive copyright notice and limited license:
> This database is copyright 1996-2010 by Sean Lahman. A license is granted
> for individual use for research purposes. It may not be re-distributed
> without permission. Any commercial use, or other dissemination of the
> database in part or in whole is prohibited. Use of this database
> constitutes acceptance of these terms.
> I've written several times to the author asking permission for my intended wider use, but have
> received no reply.
> What makes this perplexing is that I am apparently free to "distribute" this by sending links
> in an email or posting them on a web page, so that others actually download them for
> personal use.  The R package, however, would be considered a "derived work", I think,
> since it contains .RData files I created and .Rd documentation.  Does the original
> limited license apply to this?

The way people have dealt with this in the past is to create a package that displays the license and downloads the data. The way I read it (I am not a lawyer, and the wording is very ambiguous), you cannot redistribute it in any form, not even in its original form, so the only way to obtain it is to download it from the site. This also implies that the conversion to .RData has to be done at (or after) install time from the download and can't be done at build time.
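
A rough sketch of that approach, in case it helps. The function name, the `accept` and `fetch` arguments, and the cached file name are all hypothetical, and the download URL is left to the caller because I am not assuming any particular file location on the site. The sketch also assumes a single CSV file is fetched; a zipped archive would need an extra unpacking step. The point is only that the license is shown first and that the conversion happens on the user's machine, not at build time.

```r
# License text quoted from the data provider's page (see the quoted message).
lahman_license <- paste(
  "This database is copyright 1996-2010 by Sean Lahman. A license is granted",
  "for individual use for research purposes. It may not be re-distributed",
  "without permission. Any commercial use, or other dissemination of the",
  "database in part or in whole is prohibited. Use of this database",
  "constitutes acceptance of these terms.",
  sep = "\n"
)

# Hypothetical helper: show the license and, only if the user accepts it,
# download a CSV file from `url` and cache it locally as an .rds file.
# `url` is supplied by the caller (an assumption, not a known location).
# `accept` and `fetch` can be replaced, e.g. for non-interactive use.
fetch_baseball_data <- function(url, dest_dir,
                                accept = function() {
                                  tolower(readline("Accept these terms? [y/N] ")) == "y"
                                },
                                fetch = function(url, destfile) {
                                  utils::download.file(url, destfile, mode = "wb")
                                }) {
  message(lahman_license)
  if (!isTRUE(accept())) {
    stop("License not accepted; nothing was downloaded.")
  }
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  csv_file <- file.path(dest_dir, basename(url))
  fetch(url, csv_file)
  # The conversion runs on the user's machine, after the download.
  data <- utils::read.csv(csv_file, stringsAsFactors = FALSE)
  rds_file <- file.path(dest_dir, "baseball.rds")
  saveRDS(data, rds_file)
  invisible(rds_file)
}
```

A user would call something like `fetch_baseball_data(url, dest_dir)` once after installing the package and then load the cached file with `readRDS()`. The package itself ships only code and documentation, never the data.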

This does not constitute legal advice; it is just my personal opinion.


> AFAICS, none of the R licenses described at: http://www.r-project.org/Licenses/
> seem to cover this situation, although they seem to apply to the R package, not the
> data on which it is based.
> The TeX archive CTAN defines a wider range of licenses, including a bunch of non-free ones,
> http://ctan.mirror.rafal.ca/help/Catalogue/licenses.html
> But I don't know if any of these are acceptable in R packages (e.g., will pass R CMD check).
> I'd rather not have to consult a lawyer, so any guidance is welcome.
> -- 
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
