[R] valid package repositories
jdnewmil at dcn.davis.ca.us
Mon Oct 2 19:25:10 CEST 2017
I tend to regard GitHub as a bit of a wild west... anyone can upload anything there, working or not. CRAN packages at least have to build and pass R CMD check, so being there provides some additional verification.
GitHub does have the advantage that you can easily download it and run an example if the authors have set up such scaffolding... which is better than "it ran once on that laptop that died". However, there is a distinct extra level of sophistication involved in getting researchers to make those examples or test cases beyond their mainline code, and nothing about GitHub requires that such features be present in uploaded code.
On October 2, 2017 7:47:35 AM PDT, Federico Calboli <federico.calboli at kuleuven.be> wrote:
>I noticed that it is quite common to find in papers mentions of ‘R
>libraries’ developed for the algorithms/models/code/whatever that is
>being described by the paper, so that third parties will be able to use
>said method for themselves. On further enquiries these libraries are
>not actually available on CRAN, but need to be requested from the devs.
>That in itself does not seem a big issue, were it not for the fact that
>most of the time I am in such a situation the code is very specific to
>the environment of the developer, and does not actually work on any
>machine I try to run it on (something that is painfully true for code
>calling C/C++/Fortran). A second pattern I seem to have noticed is
>that, despite said libraries being advertised for general use in a
>*published* paper, when I raise the issue that the library is not actually
>formally published and does not actually work like a CRAN published
>library would, I get a vague ‘the person who actually did the work left
>and nobody can maintain the code/fix stuff/finish the job’.
>As a referee I am trying to weed out what I see as malpractice: the
>promise that third parties outside the developers might actually use
>the code because it has been packaged as an R library, a claim that
>seems to boost publishing chances.
>Thus my question: when can I consider a library to be properly
>published and really publicly available? CRAN and Bioconductor are
>clearly gold standards. What about GitHub? I am currently using the
>rule ‘not on CRAN == outright rejection’. If GitHub is as good as CRAN
>I will include it on my list of ‘the code is available in a functional
>state as claimed’.
>Finally, please note the scope of my query: I am not looking at those
>cases where a colleague gives me half-finished code that might be
>useful but I need to sort out. I am looking at formal claims ‘we have
>developed a method to do X and said method is available to the public
>as an R library’. If that is the claim I expect it to be true.
>LBEG - Laboratory of Biodiversity and Evolutionary Genomics