[Rd] faster base::sequence

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Nov 28 10:30:38 CET 2010


Is sequence used enough to warrant this?  As the help page says

      Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
      seq_len))’ and it mainly exists in reverence to the very early
      history of R.

I regard it as unsafe to assume that NA_INTEGER will always be 
negative, and bear in mind that at some point not so far off R 
integers (or at least lengths) will need to be more than 32-bit.
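
A minimal sketch of how both concerns could be handled, written as
standalone C so that it compiles without the R headers.  It is not the
code in R itself, and the names NA_INT_SENTINEL, sequence_total and
sequence_fill are hypothetical.  The sentinel stands in for NA_INTEGER
and is tested by equality rather than by sign; the total length is
accumulated in a 64-bit integer instead of an int.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for R's NA_INTEGER so the sketch builds without
   R headers.  It is only ever compared by equality, never by sign. */
#define NA_INT_SENTINEL INT_MIN

/* Validate the counts and accumulate the result length in 64 bits.
   Returns 0 on success, -1 for NA, -2 for a negative count,
   -3 if the total would not fit in an int64_t. */
int sequence_total(const int *nvec, size_t n, int64_t *total)
{
    int64_t s = 0;
    for (size_t i = 0; i < n; i++) {
        if (nvec[i] == NA_INT_SENTINEL) return -1; /* explicit NA test */
        if (nvec[i] < 0) return -2;                /* zero is allowed */
        if (s > INT64_MAX - nvec[i]) return -3;    /* overflow guard */
        s += nvec[i];
    }
    *total = s;
    return 0;
}

/* Fill out[] with 1..nvec[0], 1..nvec[1], ...
   The caller must have validated nvec and sized out[] from the total. */
void sequence_fill(const int *nvec, size_t n, int *out)
{
    for (size_t i = 0; i < n; i++)
        for (int j = 0; j < nvec[i]; j++)
            *out++ = j + 1;
}
```

A caller would run sequence_total first, allocate the result from the
returned total, and only then call sequence_fill.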

On Sun, 28 Nov 2010, Romain Francois wrote:

> Hello,
>
> Based on yesterday's R-help thread (help: program efficiency), and following 
> Bill's suggestions, it appeared that sequence:
>
>> sequence
> function (nvec)
> unlist(lapply(nvec, seq_len))
> <environment: namespace:base>
>
> could benefit from being written in C to avoid unnecessary memory 
> allocations.
>
> I made this version using inline:
>
> require( inline )
> sequence_c <- local( {
>    fx <- cfunction( signature( x = "integer"), '
>        int n = length(x) ;
>        int* px = INTEGER(x) ;
>        int x_i, s = 0 ;
>        /* error checking */
>        for( int i=0; i<n; i++){
>            x_i = px[i] ;
>            /* this includes the check for NA */
>            if( x_i < 0 ) error( "needs non negative integer" ) ;
>            s += x_i ;
>        }
>
>        SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
>        int * p_res = INTEGER(res) ;
>        for( int i=0; i<n; i++){
>            x_i = px[i] ;
>            for( int j=0; j<x_i; j++, p_res++)
>                *p_res = j+1 ;
>        }
>        UNPROTECT(1) ;
>        return res ;
>    ' )
>    function( nvec ){
>        fx( as.integer(nvec) )
>    }
> })
>
>
> And here are some timings:
>
>> x <- 1:10000
>> system.time( a <- sequence(x ) )
>       user      system     elapsed
>      0.191       0.108       0.298
>> system.time( b <- sequence_c(x ) )
>       user      system     elapsed
>      0.060       0.063       0.122
>> identical( a, b )
> [1] TRUE
>
>
>
>> system.time( for( i in 1:10000) sequence(1:10) )
>       user      system     elapsed
>      0.119       0.000       0.119
>>
>> system.time( for( i in 1:10000) sequence_c(1:10) )
>       user      system     elapsed
>      0.019       0.000       0.019
>
>
> I would write a proper patch if someone from R-core is willing to push it.
>
> Romain
>
> -- 
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://bit.ly/9VOd3l : ZAT! 2010
> |- http://bit.ly/c6DzuX : Impressionnism with R
> `- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

