[R] Why software fails in scientific research

Dr. David Kirkby david.kirkby at onetel.net
Wed Jun 30 19:29:44 CEST 2010


On 03/ 1/10 12:23 AM, Sharpie wrote:
>
>
> John Maindonald wrote:
>>
>> I came across this notice of an upcoming webinar.  The issues identified
>> in the first paragraph below seem to me exactly those that the R project
>> is designed to address.  The claim that "most research software is barely
>> fit for purpose compared to equivalent systems in the commercial world"
>> seems to me not quite accurate!  Comments!


There's probably a lot of truth in those comments.

Generally speaking, publishing results gets rewards in terms of promotion,
salary etc. Having your code well documented and in a revision control system
does not. I don't think any amount of


> I personally feel that a lot of this is a result of failing to publish the
> code that was developed to perform research along with the results of the
> research.  When setting out to start a new project, one can dig up tons
> of journal articles that will happily inform how data was gathered, what
> equations were used and wrap it all up with nicely formatted tables and
> graphs that show X is correlated to Y.

> What these articles fail to report is the code that was developed to filter
> and process the raw data and then apply the equations to produce the figures
> and tables.  The next generation of researchers that are seeking to extend
> the results then end up writing their own code rather than building upon
> what has already been done.

But unless code is well documented, it's often quicker to start from scratch anyway.

> The R community has done a tremendous job in encouraging truly reproducible
> research through the package system and tools like Sweave which provide a
> means to combine and maintain data, code and reports-- but we need more.
>
> In my opinion, we need to start seeing websites that provide services
> similar to github or bitbucket-- but with a focus on scientific research.  I
> should be able to set up a versioned repository somewhere in the cloud for
> my research projects that hosts not only my code, but my data and reports.
> I could then choose to make this resource publicly available and other
> researchers could fork my work with a single mouse click and start
> collaborating on my project or extend what I've done into a project of their
> own.

But a lot of academics are not going to "waste" their time documenting code 
properly, so others can reap the benefits of it. They would rather get on with 
the next project, to get the next paper.

FTP sites have existed for years. If people want to make their data analysis 
code available, it is not hard. But I think it would need a change of attitude 
more than any technical advance.

> And that's my two cents on the state of software in research.
>
> -Charlie

And there are my two pennies!

Dave


