[R] string syntactic sugar in R? - long post

charles loboz charles_loboz at yahoo.com
Sat May 7 10:36:48 CEST 2005


Currently in R, constructing a string that contains
the values of variables is done with 'paste', which can
be an error-prone and traumatic experience. For example,
when constructing a db query we have to write
        paste("SELECT value FROM table where date='", cdate, "'")
and we get a null result from it, because without the
(forgotten...) sep="" we get
        SELECT value FROM table where date=' 2005-05-05 '
instead of
        SELECT value FROM table where date='2005-05-05'
Adding sep="" as a habit leads to other errors, such as
column names joined to keywords because of forgotten
spaces, not to mention mixed-up or unbalanced quote
marks. The approach used by paste is similar to that of
many other languages (early Java, VB etc.) and is
inherently error-prone because the final string is hard
to visualize from the code. There is a way to improve it.
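
To make the two failure modes concrete, here is a small sketch that
can be pasted into an R session. The date and the column name are
made-up example values, and only base R's paste is used:

```r
cdate <- "2005-05-05"   # assumed example date

# Default sep = " " puts unwanted spaces inside the quotes.
q1 <- paste("SELECT value FROM table where date='", cdate, "'")
# q1 is "SELECT value FROM table where date=' 2005-05-05 '"

# sep = "" fixes the date, but now every space must be typed by hand.
q2 <- paste("SELECT value FROM table where date='", cdate, "'", sep = "")
# q2 is "SELECT value FROM table where date='2005-05-05'"

# Forgetting one of those spaces joins a column name to a keyword.
column <- "value"       # assumed example column name
q3 <- paste("SELECT ", column, "FROM table", sep = "")
# q3 is "SELECT valueFROM table"
```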

In the Java world, gstrings were introduced
specifically for this purpose. A gstring is a string
with embedded variable names that are replaced by their
values (converted to strings, evaluated lazily) before
use. An example in R-like syntax would be:

> alpha <- 8; beta <- "xyz"
> gstr <- "the result is ${alpha} with the comment ${beta}"
> cat(gstr)
the result is 8 with the comment xyz

This syntactic sugar significantly reduces the number
of mistakes made with normal string concatenation.
Gstrings are used in ant and groovy (for details see
http://groovy.codehaus.org/Strings, jump to GStrings).
They are particularly useful for creating readable and
error-free SQL statements, but obviously they simplify
'normal' string+value handling in all situations. [ps:
gstrings are not nestable]

I was wondering how difficult it would be to add such
syntactic sugar to R, and whether it would create any
language problems. Maybe it could be done as some
gpaste function that parses its argument for ${var},
looks up the variables in the environment, evaluates
them and produces the final string?
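
A rough sketch of what such a function might look like, only to make
the idea concrete. The name gpaste, its arguments and the decision to
look variables up in the caller's environment are all assumptions of
this sketch, not an existing R function. It handles plain variable
names of length one only, with no nesting and no expressions:

```r
# Hypothetical gpaste(): replace each ${name} in 'template' with the
# value of the variable 'name', looked up from the calling environment.
gpaste <- function(template, envir = parent.frame()) {
  # Find every ${...} placeholder (nesting is not supported).
  matches <- gregexpr("\\$\\{[^}]+\\}", template)
  placeholders <- regmatches(template, matches)[[1]]
  for (ph in unique(placeholders)) {
    # Strip the leading "${" and the trailing "}" to get the name.
    name <- substr(ph, 3, nchar(ph) - 1)
    value <- get(name, envir = envir)
    # fixed = TRUE matches the placeholder literally, not as a regex.
    template <- gsub(ph, as.character(value), template, fixed = TRUE)
  }
  template
}

alpha <- 8; beta <- "xyz"
cat(gpaste("the result is ${alpha} with the comment ${beta}"), "\n")

cdate <- "2005-05-05"
query <- gpaste("SELECT value FROM table where date='${cdate}'")
```

Because the substitution happens when gpaste is called, the variables
are read at that moment, and the query text stays readable as a single
string with no sep argument to forget.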

I admit my bias: using ant for years and groovy for
months, and having to do a lot of SQL queries, does not
put me in the mainstream of R users, so it may be
that this idea is not useful to a wider group of
users.



