[Rd] memory misuse in subscript code when rep() is called in odd way

Seth Falcon seth at userprimary.net
Wed Nov 4 06:40:04 CET 2009


Hi,

On 11/3/09 2:28 PM, William Dunlap wrote:
> The following odd call to rep()
> gives somewhat random results:
>
>> rep(1:4, 1:8, each=2)

I've committed a fix for this to R-devel.

I admit I had to reread the rep man page: I first thought this was not
a valid call to rep, since times (1:8) is longer than x (1:4). A closer
reading of the man page says:

   > If times is a vector of the same length as x (after replication
   > by each), the result consists of x[1] repeated times[1] times,
   > x[2] repeated times[2] times and so on.

So the expected result is the same as rep(rep(1:4, each=2), 1:8).
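
To make that equivalence concrete, here is a minimal C sketch of the documented semantics. It is not the code in seq.c; the helper name rep_each_times and its signature are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not R's implementation: writes the equivalent of
 * rep(rep(x, each = each), times) into out and returns the number of
 * elements written.  times must have nx * each entries, and out must
 * have room for the sum of those entries. */
size_t rep_each_times(const int *x, size_t nx, size_t each,
                      const int *times, int *out)
{
    size_t k = 0;
    for (size_t i = 0; i < nx * each; i++) {
        int value = x[i / each];   /* element i of rep(x, each = each) */
        assert(times[i] >= 0);     /* negative counts are not meaningful */
        for (int j = 0; j < times[i]; j++)
            out[k++] = value;
    }
    return k;
}
```

For x = 1:4, each = 2 and times = 1:8 this writes 36 values: three 1s, seven 2s, eleven 3s and fifteen 4s.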

> valgrind says that the C code is using uninitialized data:
>> rep(1:4, 1:8, each=2)
> ==26459== Conditional jump or move depends on uninitialised value(s)
> ==26459==    at 0x80C557D: integerSubscript (subscript.c:408)
> ==26459==    by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)

A little investigation suggests that the problem originates earlier,
before the subscript code is reached.  Debugging in seq.c:do_rep I see
the following:

 > rep(1:4, 1:8, each=2)

Breakpoint 1, do_rep (call=0x102de0068, op=<value temporarily 
unavailable, due to optimizations>, args=<value temporarily unavailable, 
due to optimizations>, rho=0x1018829f0) at 
/Users/seth/src/R-devel-all/src/main/seq.c:434
434         ans = do_subset_dflt(R_NilValue, R_NilValue, list2(x, ind), rho);
(gdb) p Rf_PrintValue(ind)
  [1]          1          1          1          2          2          2
  [7]          2          2          2          2          3          3
[13]          3          3          3          3          3          3
[19]          3          3          3          4          4          4
[25]          4          4          4          4          4          4
[31]          4          4          4          4          4          4
[37]   44129344          1   44129560          1   44129776          1
[43]   44129992          1   44099592          1   44099808          1
[49]   44100024          1   44100456          1    2724144    3801089
[55] -536870733          0   54857992          1   22275728          1
[61]    2724144          1         34          1   44100744          1
[67]   44100960          1   44101176          1   43652616          1
$2 = void
(gdb) c
Continuing.
Error: only 0's may be mixed with negative subscripts

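Only the first 36 entries of ind are valid indices into x; the rest is
uninitialized memory, which is what valgrind then complains about in
integerSubscript.  The patch I applied adjusts how the index vector
length is computed when times has length greater than one.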
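
For reference, the length the index vector needs in this case is just the sum of the entries of times (after x has been expanded by each). The sketch below only illustrates that arithmetic; it is not the patch itself, and rep_index_length is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the committed fix: number of index entries
 * needed when times is a vector with one count per element of
 * rep(x, each = each).  ntimes is therefore length(x) * each. */
size_t rep_index_length(const int *times, size_t ntimes)
{
    size_t n = 0;
    for (size_t i = 0; i < ntimes; i++) {
        assert(times[i] >= 0);     /* negative counts are not meaningful */
        n += (size_t) times[i];
    }
    return n;
}
```

With times = 1:8 this gives 36. The vector printed in the gdb session has 72 entries, so half of it was never written; that would be consistent with the old length being scaled by each a second time, though that is only a guess from the dump.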

+ seth


