[R] Strange (non-deterministic) problem with strsplit

hadley wickham h.wickham at gmail.com
Fri Jul 16 22:50:03 CEST 2004

I'm having an odd problem with strsplit (well, I think it's strsplit
that's causing the problem).  When I run the code below as follows:
 str(parseFormulaMin(y +x +d ~ b +d +e| a * b))

I expect to get
List of 3
 $ y: chr "y+x+d"
 $ x: chr "b+d+e"
 $ g: chr "a*b"

But about half the time I get 

List of 3
 $ y: chr "y+x+d"
 $ x: chr "b+d+e"
 $ g: chr "a*[square box]"
(square box not reproduced here because copying and pasting it seems to
break my web mail)

Can anyone reproduce the problem and/or suggest any solutions? 

parseFormula <- function(formula) {
	# Split a term string such as "a*b" on "+" or "*"
	splitvars <- function(x) {
		strsplit(x, "\\+|\\*")[[1]]
	}
	# Remove all whitespace from a string
	stripwhitespace <- function(x) {
		gsub("\\s", "", x, perl=TRUE)
	}
	# Right-hand side of the formula, split on "|" into x and g
	vars <- stripwhitespace(as.character(formula)[3])
	varsplit <- strsplit(vars, "|", fixed=TRUE)[[1]]

	parts <- list(
		y = stripwhitespace(as.character(formula)[2]),
		x = varsplit[1],
		g = varsplit[2]
	)
	lapply(parts, splitvars)
}



