[R] About performance of R

Duncan Murdoch murdoch.duncan at gmail.com
Wed May 27 20:52:19 CEST 2015


On 27/05/2015 11:00 AM, Suman wrote:
> Hi there,
>
> Now that R has grown up with a vibrant community, it is the number one
> statistical package used by scientists, and its graphics capabilities are amazing.
> Now it's time to provide native support in "R core" for distributed and parallel computing for high performance on massive datasets.
> And maybe base R functions should be replaced with the best R packages, like data.table, dplyr, and readr, for fast and efficient operations.

Given your first three sentences, I would say the current development 
strategy for R is successful.  As Bert mentioned, one thing we have 
always tried to do is to make improvements without large disruptions to 
the existing code base.  I think we will continue to do that.

This means we are unlikely to make big, incompatible replacements. But 
there's nothing stopping people from using data.table, dplyr, etc. even 
if they aren't in the core.  In fact, having them outside of core R is 
better:  there are only so many core R developers, and if they are 
working on data.table, etc., they wouldn't be working on other things.
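
For example, here is a minimal sketch of grouped aggregation with 
data.table installed from CRAN, with no changes to core R (the data and 
column names are made up for illustration):

    library(data.table)
    # toy data, purely for illustration
    dt <- data.table(grp = rep(c("a", "b"), 5), val = 1:10)
    # fast grouped aggregation, no changes to base R required
    dt[, .(total = sum(val)), by = grp]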

Compatible replacements are another question.  There is ongoing work on 
making R faster, and making it easier to take advantage of multiple 
processors.  I believe R 3.2.0 is faster than the R 3.1.x series at many 
tasks, and changes like that are likely to continue.  In addition, base R 
supports explicit parallel programming through the parallel package, as 
Jeff mentioned.
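
For example, a minimal sketch using the parallel package (the worker 
count and the function being applied are arbitrary here):

    library(parallel)
    # start a small socket cluster; 2 workers is just an illustration
    cl <- makeCluster(2)
    # apply a function over a list in parallel
    res <- parLapply(cl, 1:8, function(x) x^2)
    stopCluster(cl)
    # on Unix-alikes, fork-based mclapply() is an alternative:
    # res <- mclapply(1:8, function(x) x^2, mc.cores = 2)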

As to David and his large bundles, those would definitely be appreciated.

Duncan Murdoch


