[R] Issue with dataset inclusion in CRAN packages

Frank Harrell f.harrell at vanderbilt.edu
Sun Jun 26 22:43:05 CEST 2011


I was glad to see the new rpart.plot package by Stephen Milborrow.  I was
however a bit concerned that Stephen distributed a dataset I created, and
renamed the dataset (from titanic3 to ptitanic) in the process [with some
justification, as some variables were omitted].  Fortunately Stephen
included the script he used to download the dataset from our web site, and
gave full credit to us.  What concerns me is that the rpart.plot package
does not contain many functions but the package is as large as packages
containing hundreds of functions.  This is due to the inclusion of the
dataset.  I would prefer that authors provide the URL so that users can
easily install the binary R data frame directly from our web site (we
even have an automated way to do this: require(Hmisc); getHdata(titanic3)).
This will allow users to profit from possible future data corrections and
will make the package much more compact.  Thanks for listening.  I'm
writing to r-help because this may apply to other R packages as well.
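
A minimal sketch of the download-on-demand approach mentioned above, as it
might appear in a package example or vignette.  It assumes the Hmisc package
is installed and the hosting web site is reachable; the `ok` flag is only an
illustrative name, not part of Hmisc or rpart.plot.

```r
# Sketch: fetch titanic3 on demand instead of bundling a copy in a package.
# Assumption: Hmisc is installed and the data repository is reachable.
ok <- FALSE
if (requireNamespace("Hmisc", quietly = TRUE)) {
  library(Hmisc)
  # getHdata() takes the unquoted dataset name and loads the data frame
  # into the global environment; a network failure is caught here.
  ok <- tryCatch({ getHdata(titanic3); TRUE }, error = function(e) FALSE)
}
if (ok && exists("titanic3")) {
  str(titanic3)   # inspect the variables that were downloaded
} else {
  message("titanic3 could not be downloaded; check Hmisc and the network.")
}
```

Because the data are fetched at run time, users pick up any later
corrections, at the cost of needing a network connection.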

Frank


-----
Frank Harrell
Department of Biostatistics, Vanderbilt University


