[R] [OT] Is data copyrightable?

hadley wickham h.wickham at gmail.com
Sun May 13 11:44:25 CEST 2007

These links from the US Copyright Office seem relevant:

"Copyright Registration for Automated Databases"


"Furthermore, copyright protection does not extend to works consisting
entirely of information that is common property containing no original
authorship, for example: standard calendars, height and weight charts,
tape measures and rulers, schedules of sporting events, and lists or
tables taken from public documents or other common sources."
from http://www.copyright.gov/circs/circ32.html

and also

"Notwithstanding the provisions of sections 106 and 106A, the fair use
of a copyrighted work, including such use by reproduction in copies or
phonorecords or by any other means specified by that section, for
purposes such as criticism, comment, news reporting, teaching
(including multiple copies for classroom use), scholarship, or
research, is not an infringement of copyright. In determining whether
the use made of a work in any particular case is a fair use the
factors to be considered shall include —

(1) the purpose and character of the use, including whether such use
is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to
the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of
the copyrighted work.

The fact that a work is unpublished shall not itself bar a finding of
fair use if such finding is made upon consideration of all the above
factors."
from http://www.copyright.gov/title17/92chap1.html#102

and at Stanford:
"A fact or a theory--for example, the fact that a comet will pass by
the Earth in 2027 --is not protected by copyright. If a scientist
discovered this fact, anyone would be free to use it without asking
for permission from the scientist. Similarly, if someone creates a
theory that the comet can be destroyed by a nuclear device, anyone
could use that theory to create a book or movie. However, the unique
manner in which a fact is expressed may be protected. Therefore, if a
filmmaker created a movie about destroying a comet with a nuclear
device, the specific way he presented the ideas in the movie would be
protected by copyright.

EXAMPLE: Neil Young wrote a song, "Ohio," about the shooting of four
college students during the Vietnam War. You are free to use the facts
surrounding the shooting but you may not copy Mr. Young's unique
expression of these facts without his permission.

In some cases, you are not free to copy a collection of facts because
the collection of facts may be protectible as a compilation (see
Section B5). For more information on how copyright applies to facts,
refer to Chapter 2, Section F3."


On 5/13/07, hadley wickham <h.wickham at gmail.com> wrote:
> Dear Brian, Peter, Spencer,
> Thanks for your comments, which have cleared things up a little for
> me.  The thing I find most confusing about copyright is that it is
> emergent, not atomic - i.e. if you split a copyrighted work into small
> enough pieces (e.g. letters, pixels) those pieces are no longer
> copyrightable.  It is the combination of those small pieces into a
> specific form that is important, and the definition of derivative
> works seems to help define what rearrangement of those pieces is still
> covered under copyright.
> The specific case that I am interested in is creating new data sets from
> publicly available data (itself stored in copyrightable works) - in
> my case to produce interesting data sets to use in class.  For
> example, each individual page on ebay is copyrightable, but if I
> extract the price, name and category from (say) 200 pages, does the
> copyright of that dataset belong to ebay?  I'm quite comfortable using
> that data personally, or for a class, but if I want to publish it (i.e.
> in jse) do I need to get permission?  Similarly, if I take a few mp3's
> and calculate some summary statistics for them, would that constitute
> a derivative work?
> Hadley
