[Rd] sub( , perl = TRUE) overflow (PR#7479)

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Jan 8 08:25:53 CET 2005


On Sat, 8 Jan 2005 Robert.McGehee at geodecapital.com wrote:

> I'd like to report a bug (buffer overflow?) in the function sub(..., perl = TRUE).
>
> I wanted to implement the familiar Perl function for removing white space before and after a character string:
> sub trimwhitespace($)
> {
> 	my $string = shift;
> 	$string =~ s/^\s+//;
> 	$string =~ s/\s+$//;
> 	return $string;
> }
>
> So in R this would (presumably) become:
>
> trimwhitespace <- function(x) {
>    x <- sub('^\\s+', '', x, perl = TRUE) ## Removes preceding white spaces
>    x <- sub('\\s+$', '', x, perl = TRUE) ## Removes trailing white spaces
>    x
> }
>
> Expected behavior:
>> trimwhitespace("                     abc")
> [1] "abc"
>
> On Windows:
>> trimwhitespace("                     abc")
> [1] "abc\0\220\277\036\001\220°ß\08iW\001p±ß\0X°ß\0"        ## That's not good! Looks like a buffer overflow
>
> On Linux:
> [1] "abc\0\0\002\0\0 \377\0\0\0\002\0\0\0\006\0\0/\377\0\0" ## Linux goofs as well!
>
> Debugging shows that it is the first line in the function that produces
> the overflow. The overflow seems proportional to the amount of preceding
> white space. I'm not sure if this is exploitable or not, but it might
> be possible to run arbitrary code stored in a character object using
> this.

Don't think so.  It's actually just a printing issue: the length used for 
printing is marked incorrectly (as the length of the original string), and
you won't be able to access the character string past the \0 in any other 
way.

I've fixed it now.
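
A quick way to confirm that the result has the right length, rather than 
relying on how it prints, is sketched below.  It reuses trimwhitespace() 
from the report; the variable name res is only illustrative, and the 
values noted in the comments are what one would expect from a build that 
contains the fix, not output captured from the affected version.

```r
# trimwhitespace() as given in the original report
trimwhitespace <- function(x) {
  x <- sub("^\\s+", "", x, perl = TRUE)  # strip leading white space
  x <- sub("\\s+$", "", x, perl = TRUE)  # strip trailing white space
  x
}

# 'res' is a hypothetical name used only for this check
res <- trimwhitespace("                     abc")

nchar(res)             # number of characters; 3 expected once fixed
identical(res, "abc")  # exact comparison; TRUE expected once fixed
charToRaw(res)         # raw bytes of the string, independent of print()
```

Checking nchar() and charToRaw() looks at the string itself, so it does 
not depend on the length that print() uses.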

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

