[R] Inefficiency of SAS Programming
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Feb 27 01:13:53 CET 2009
Barry Rowlingson wrote:
> 2009/2/26 Frank E Harrell Jr <f.harrell at vanderbilt.edu>:
>> If anyone wants to see a prime example of how inefficient it is to program
>> in SAS, take a look at the SAS programs provided by the US Agency for
>> Healthcare Research and Quality for risk adjusting and reporting for
>> hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm .
>> The PSSASP3.SAS program is a prime example. Look at how you do a vector
>> product in the SAS macro language to evaluate predictions from a logistic
>> regression model. I estimate that using R would easily cut the programming
>> time of this set of programs by a factor of 4.
>
> Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:
>
> IF &N. = 1 THEN SUB_N = 1;
> IF &N. = 3 THEN SUB_N = 2;
> IF &N. = 4 THEN SUB_N = 3;
> IF &N. = 6 THEN SUB_N = 4;
> IF &N. = 7 THEN SUB_N = 5;
> IF &N. = 8 THEN SUB_N = 6;
> IF &N. = 9 THEN SUB_N = 7;
> IF &N. = 10 THEN SUB_N = 8;
> IF &N. = 11 THEN SUB_N = 9;
> IF &N. = 12 THEN SUB_N = 10;
> IF &N. = 13 THEN SUB_N = 11;
> IF &N. = 14 THEN SUB_N = 12;
> IF &N. = 15 THEN SUB_N = 13;
> IF &N. = 17 THEN SUB_N = 14;
> IF &N. = 18 THEN SUB_N = 15;
> IF &N. = 19 THEN SUB_N = 16;
>
> Of course it's possible to write code like that in any language, it
> just looks worse when it's in ALL CAPS and written in a style that
> looks like the 1980s and onward never happened. The question is
> whether it's possible to write this better in SAS. Most of us on this
> list could write it in R in a better way.
Presumably, something like
IF &N. = 1 THEN SUB_N = 1;
ELSE IF &N. < 5 THEN SUB_N = &N.-1;
ELSE IF &N. < 16 THEN SUB_N = &N.-2;
ELSE SUB_N = &N.-3;
would work, provided that 2, 5, and 16 are impossible values. The problem
is that it actually makes the code harder to grasp, so experienced SAS
programmers go for dumb but readable code like the above.
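As a sanity check on the arithmetic in the ELSE IF version, here is a minimal sketch in R that compares it with the sixteen-line table. The function name compact_sub_n and the variable names are made up for illustration; none of them come from the AHRQ programs.

```r
# Values of N that appear in the quoted SAS table (2, 5 and 16 are absent).
valid_n <- c(1, 3:4, 6:15, 17:19)

# SUB_N as assigned by the sixteen IF statements: simply 1, 2, ..., 16.
sub_table <- seq_along(valid_n)

# Hypothetical helper mirroring the IF / ELSE IF chain.
compact_sub_n <- function(n) {
  ifelse(n == 1, 1,
  ifelse(n < 5,  n - 1,
  ifelse(n < 16, n - 2, n - 3)))
}

# The two versions agree on every value that occurs in the table.
stopifnot(all(compact_sub_n(valid_n) == sub_table))

# On the "impossible" values the compact version still returns a number,
# and each one collides with the result for a neighbouring valid value.
collisions <- compact_sub_n(c(2, 5, 16))
```

The last line is the reason for the proviso: the table version assigns nothing for 2, 5, or 16, whereas the compact version quietly maps them onto codes already used by 1, 4, and 15.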
In R, the cleanest I can think of is
subn <- match(n, setdiff(1:19, c(2,5,16)))
or maybe just
subn <- match(n, c(1, 3:4, 6:15, 17:19))
although
subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))
might be what is really wanted.
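To make the difference between the three concrete, here is a small sketch run on a made-up input vector; the values in n are assumptions chosen for illustration, with 2 included as a value that should not occur.

```r
# Hypothetical input; 2 is one of the values assumed to be impossible.
n <- c(1, 3, 4, 6, 15, 17, 19, 2)

# The valid codes, written out explicitly.
valid <- c(1, 3:4, 6:15, 17:19)

# Both match() variants return the position of each n among the valid codes.
subn_setdiff <- match(n, setdiff(1:19, c(2, 5, 16)))
subn_match   <- match(n, valid)

# factor() keeps the original values as labels and stores the same
# positions as its internal integer codes.
subn_factor <- factor(n, levels = valid)

subn_match                 # 1 2 3 4 13 14 16 NA
as.integer(subn_factor)    # same codes as subn_match
as.character(subn_factor)  # original values as labels, NA for the 2
```

All three give NA for a value outside the table rather than a wrong index. The factor version has the extra property that the original N is still visible as the level label, which is convenient if the result goes into a model or a table.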
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907