[Rd] R datasets ownership(copyright) and license

Claudia Beleites claudia.beleites at ipht-jena.de
Tue Apr 3 21:03:53 CEST 2012


Yaroslav,

coming from an experimental field, I use options 4 and 4a:

4. I measure the data myself, so I am the copyright holder.
4a. I publish data sets that were given to me for publication by the
person(s) who did the measurements. This is properly noted in the
authors field.

So far, the data sets I have put into packages as example data are small
subsets of real studies or data collected in pre-tests, so they are not
that sensitive/valuable. I plan to publish at least one "real" data set
(as its own package) eventually. But we're not there yet.

Claudia




On 03.04.2012 00:06, Yaroslav Halchenko wrote:
> Dear R Developers,
> 
> The recently filed (and dismissed ;) ) lawsuit by Astrolabe against the tz
> database developers received a lot of media coverage and discussion and
> created some kind of precedent in the USA [3].  But imho it also showed
> that similar attacks might happen in the future, possibly against
> data sets which are not so obviously "factual" and thus might after all
> fall under copyright or IP protection, if not in the States then in
> some other jurisdictions.
> 
> Since the 'data copyright/license' question comes up over and over
> again, I just wanted to ask based on what policies or advisories the
> datasets shipped with R were selected.  From a very brief look at the
> datasets, many of them appear to be factual data, and thus at least at
> the moment are probably not copyrightable in the States -- but is there
> a guarantee that they are not protected by copyright elsewhere if their
> origin is abroad?  Some, however, seem to come from published works
> (still) under copyright with "All rights reserved", e.g. datasets
> Harman23 and Harman74 [4].
> 
> Although questions similar to mine have been raised before [e.g. 1,2],
> I have not found a straight answer, e.g. one from the list below or a
> mix of them:
> 
> 1. we simply did not look into it and adopted them with the idea that
>    if someone complains, we remove the corresponding pieces
> 
> 2. we considered all datasets factual data and thus not copyrightable
>    (in the USA? around the globe?)
> 
> 3. for each dataset (or some, or the majority) we collected information
>    on the possible copyright+license/IP holder and contacted them where
>    the permission for reuse in a project under the GPL license was unclear
> 
> Thank you in advance for the clarification!
> 
> P.S. Please do not get me wrong -- I am not trying to pick on
> anyone.  I just wanted to get a better sense of the
> procedures/assumptions R developers use when adopting data for the R
> package, so that it could be of help to other projects.
> 
> [1] https://stat.ethz.ch/pipermail/r-help/2007-April/130422.html
> [2] http://www.mail-archive.com/r-help@r-project.org/msg62486.html
> [3] http://en.wikipedia.org/wiki/Tz_database
> [4] it is interesting that the actual data come from an "unpublished PhD
>     thesis", but once again from the U of Chicago, which holds the
>     copyright for the book itself.
> 


-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.beleites at ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399



More information about the R-devel mailing list