[R] [SUMMARY] Converting coef(lm) to SQL/VBA/etc

j+rhelp@howard.fm j+rhelp at howard.fm
Sun Feb 16 07:34:03 CET 2003


Many thanks to all who helped with my question last week about how to
take the output of lm() and turn it into code that can be run on systems
without R (using SQL, C, etc). This is a summary of the answers, caveats,
and a solution including a little Perl script I wrote to do this.

Brian Ripley pointed out an important caveat to this whole process - the
model.matrix could contain R functions that simply don't exist in other
environments. Therefore, any solution will need to be somewhat special
purpose - avoiding functions that are not implemented in the target
language (or porting them to that language).

My simplistic solution is to ensure that all required transformations
from the raw data to the model.matrix is done in the original data source
- which, in most cases, will probably be an SQL data base. So in my case
I added a bunch of extra fields to the query I used as my data source,
containing any required transformations (with the exception of recoding
factors into dummy variables, which I'll get too shortly).

Frank Harrel pointed out that his Design library has a function called,
err..., "Function", which does pretty much what I was after. To use this,
grab the library from:
  http://hesweb1.med.virginia.edu/biostat/s/library/r/
...and instead of running lm(), run the library's ols() . Then
'Function(ols(y~x))' gives you an expression representing your model. It
should be reasonably simply to convert that expression into your target
language.

Unfortunately, it turns out that ols() can't handle nested designs
('%in%', '/', ':'), so this solution didn't work for me.

John Fox provided an (unreleased) addition to his 'car' library which
enhances the contr functions to delimit the levels from the factor names
and encode the contrast type in the contrast names. A coefficient then
will have a name like:
  a:b[T.c]:d
(an interaction between continuous variables a and d, and level c of
factor b, which is of contrast type 'T' (Treatment)). For now you will
need to look up these updated functions in the list archive - I'm sure
they'll appear in a new 'car' release shortly.

I used these functions as the basis for my converter. Here's the
details...

First, specify to use John Fox's new contr.Treatment function (note the
case!):
  options(contrasts=c("contr.Treatment", "contr.poly"))

Now run your lm() as usual
  mod <- lm( . ~ . )
...and save the coefficients
  write.table(mod$coeff, file="mod.csv", sep=",")

Now, I needed two target languages for my model, which were SQL (using a
CASE statement) and VBA. These two targets meant I could utilise my model
in any ANSI-92 compliant DB (since CASE is ANSI-92), and also in the MS
Access DB (since the Jet database engine can utilise any VBA function
from within SQL). The attached script can convert from the csv file saved
above into either of these two formats. To use it:
 - Download Perl if you don't already have it
 - Install Text::CSV_XS
   . Windows: ppm; install Text-CSV_XS; q
   . Unix: perl -MCPAN -e shell; install Text::CSV_XS; q
 - Run `perl lm2code.pl mod vba`
   or `perl lm2code.pl mod sql`
   (where 'mod' is the filename without extension
   from write.table)

To convert to some target language other than VBA or SQL, simply open
lm2code.pl (attached) and add your own entry to the %output hash at the
top of the script. Use the VBA and SQL examples already there to guide
you.

Of course, it's a bit silly using Perl for this last stage, when doing it
in R would make it much simpler. Maybe that's a useful update for a
future version, unless something better comes along first...

Note that the script only handles Treatment contrasts, and lm models.

Hope this helps someone,
  Jeremy
-- 
  Jeremy Howard
  jhoward at fastmail.fm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lm2code.pl
Type: application/x-perl
Size: 3663 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-help/attachments/20030216/92ec2393/lm2code.bin


More information about the R-help mailing list