[Rd] segfault / crash when asking for large memory via strrep()

luke-tierney at uiowa.edu
Wed Jun 1 20:38:34 CEST 2016


I've added a size/overflow check before the buffer allocation in
R-devel and R-patched.
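
For illustration only, a minimal sketch in plain C of the kind of
size/overflow check described above. The helper name checked_buf_size
is hypothetical and this is not the code committed to R-devel or
R-patched; it assumes both counts are held in int, as in the
declaration quoted below, and tests the product before computing it.

```c
#include <limits.h>

/* Hypothetical helper (not from R's sources): stores nc * ni in *out and
 * returns 1 when the product fits in an int; returns 0 and leaves *out
 * untouched when either count is negative or the product would overflow. */
static int checked_buf_size(int nc, int ni, int *out)
{
    if (nc < 0 || ni < 0)
        return 0;
    /* Divide instead of multiplying so the test itself cannot overflow. */
    if (ni != 0 && nc > INT_MAX / ni)
        return 0;
    *out = nc * ni;
    return 1;
}
```

With a string of 78905344 bytes repeated 64 times (the sizes the
reported example appears to reach in its last strrep() call), the
product would be 5049942016, which is larger than a 32-bit INT_MAX, so
the helper returns 0 and the caller can signal an error before
allocating.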

It would be a good idea sometime to review the use of

calloc
...
free

patterns to make sure the ... can't raise an error or otherwise jump
past the free and leave the allocated memory unreleased.
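
As a rough sketch of the concern, here is the pattern in plain C with
setjmp/longjmp standing in for R's error jumps. Every name here
(on_error, pending_buf, may_fail, repeat_with_cleanup) is hypothetical
and none of it is R's API; in R itself the alternatives mentioned in
the quoted message below, R_alloc or a registered cleanup function,
would be the tools to use. The point is only that the buffer is
recorded somewhere the jump target can see, so it is released on both
the normal and the error path.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf on_error;          /* stands in for R's error jump target */
static void *pending_buf = NULL;  /* buffer to release if a jump happens */

/* Hypothetical stand-in for a call in the "..." part that may signal
 * an error and jump out instead of returning. */
static void may_fail(int fail)
{
    if (fail)
        longjmp(on_error, 1);
}

/* Builds `times` copies of `s` in a calloc'ed buffer, then runs the step
 * that may jump. Returns 0 on success and 1 on failure; the buffer is
 * freed either way. Assumes times >= 0 and a product small enough not to
 * overflow size_t. */
static int repeat_with_cleanup(const char *s, int times, int fail)
{
    if (setjmp(on_error) != 0) {
        /* Reached via longjmp: the free below was skipped, so do it here. */
        free(pending_buf);
        pending_buf = NULL;
        return 1;
    }

    size_t n = strlen(s);
    pending_buf = calloc(n * (size_t) times + 1, 1);
    if (pending_buf == NULL)
        return 1;

    char *p = pending_buf;
    for (int i = 0; i < times; i++) {
        memcpy(p, s, n);
        p += n;
    }

    may_fail(fail);               /* the "..." between calloc and free */

    free(pending_buf);            /* normal path */
    pending_buf = NULL;
    return 0;
}
```

Without the handler at the top, a jump out of may_fail() would skip
the final free and the buffer would never be released.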

Best,

luke

On Wed, 1 Jun 2016, luke-tierney at uiowa.edu wrote:

> That would be because the product nc * ni overflows in
>
>    cbuf = buf = CallocCharBuf(nc * ni);
>
> Since we disallow strings with more than 2^31-1 bytes we could test
> and reject this. It might be more future-proof to change the
> declaration of
>
>    int j, ni, nc;
>
> to
>
>    R_xlen_t j, ni, nc;
>
> and let the character allocation code reject, but that would create a
> memory leak since the Free call isn't reached. This is a problem in
> any case though, as
>
>    SET_STRING_ELT(s, is, markKnown(cbuf, STRING_ELT(x, ix)));
>
> could throw errors for a number of reasons and then the Free() is not
> reached. It would be better to use R_alloc or register a cleanup
> function to call Free on a jump.
>
> Best,
>
> luke
>
> On Wed, 1 Jun 2016, Martin Maechler wrote:
>
>> We've had this more general topic on R-help, and also in R-devel
>> recently.
>> Here is one case where I get the feeling that R never gets into
>> swapping but aborts more directly, possibly because of a bug we can
>> fix more easily.
>> 
>> Today I've been working (successfully! - not yet committed) on
>> fixing str() for very large strings.
>> 
>> In this process, I've found that
>>
>>   pc <- function(.) paste(., collapse=".1.2.3.4.5.")
>>   p  <- function(.) strrep(pc(.), 64L)
>>   p(p(p(p(LETTERS))))
>> 
>> produces a (memory related) segmentation fault (aka "crash")
>> very reproducibly and relatively quickly
>> both on my Linux (Fedora 22) desktop and on our Windows server.
>> 
>> *** caught segfault ***
>> address 0x7fc52dc89000, cause 'memory not mapped'
>> 
>> Traceback:
>> 1: strrep(pc(.), 64L)
>> 2: p(p(p(p(LETTERS))))
>> 3: system.time(L2 <- p(p(p(p(LETTERS)))))
>> 
>> In the debugger, the symptoms point to a possible bug in just the
>> C parts of strrep():
>> 
>> 
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff54d6223 in __strcpy_sse2_unaligned () from /usr/lib64/libc.so.6
>> Missing separate debuginfos, use: dnf debuginfo-install 
>> bzip2-libs-1.0.6-14.fc22.x86_64 libgcc-5.3.1-6.fc22.x86_64 
>> libgfortran-5.3.1-6.fc22.x86_64 libgomp-5.3.1-6.fc22.x86_64 
>> libicu-54.1-4.fc22.x86_64 libquadmath-5.3.1-6.fc22.x86_64 
>> libstdc++-5.3.1-6.fc22.x86_64 ncurses-libs-5.9-18.20150214.fc22.x86_64 
>> pcre-8.38-4.fc22.x86_64 readline-6.3-5.fc22.x86_64 
>> xz-libs-5.2.0-2.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64
>> (gdb) bt
>> #0  0x00007ffff54d6223 in __strcpy_sse2_unaligned () from /usr/lib64/libc.so.6
>> #1  0x0000000000457def in do_strrep (call=<optimized out>, op=<optimized out>, args=<optimized out>,
>>    env=<optimized out>) at ../../../R/src/main/character.c:1658
>> #2  0x00000000004d6844 in bcEval (body=body@entry=0xd66840, rho=rho@entry=0x45253b8,
>>    useCache=useCache@entry=TRUE) at ../../../R/src/main/eval.c:5648
>> #3  0x00000000004dd240 in Rf_eval (e=0xd66840, rho=0x45253b8) at ../../../R/src/main/eval.c:616
>> #4  0x00000000004dedaf in Rf_applyClosure (call=call@entry=0x45250a8, op=op@entry=0xd668e8,
>>    arglist=0x45251f8, rho=rho@entry=0x4525000, suppliedvars=0xa57188)
>>    at ../../../R/src/main/eval.c:1134
>> #5  0x00000000004dd3b1 in Rf_eval (e=0x45250a8, rho=0x4525000) at ../../../R/src/main/eval.c:732
>> #6  0x00000000004dedaf in Rf_applyClosure (call=call@entry=0x4525718, op=op@entry=0x4524d28,
>>    arglist=0x4524f90, rho=rho@entry=0xa8ea30, suppliedvars=0xa57188)
>>    at ../../../R/src/main/eval.c:1134
>> #7  0x00000000004dd3b1 in Rf_eval (e=0x4525718, rho=0xa8ea30) at ../../../R/src/main/eval.c:732
>> #8  0x00000000004e0cde in do_set (call=0x4525670, op=0xa61358, args=<optimized out>, rho=0xa8ea30)
>>    at ../../../R/src/main/eval.c:2196
>> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
>
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


