[R] Inefficiency of SAS Programming

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sat Feb 28 00:23:26 CET 2009


John Sorkin wrote:
> Frank,
> A programming language's efficiency is a function of several items, including what you are trying to program. Without using SAS PROC IML, I have found that it is more efficient to code algorithms (e.g. a least squares linear regression) using R than SAS; we all know that matrix notation leads to more compact syntax than can be had when using non-matrix notation, and R implements matrix notation. On the other hand, searching, sub-setting, merging, etc. can at times be coded more efficiently, more easily, and in a more easily understood fashion in SAS. I am sure you know people who use SAS to set up their datasets and then use R when they are developing an algorithm.
> 
> Just as French may be a better language to express love, Italian a better language in which to write opera, and English the most efficient language for communication (at least for the last 50 years), so too do both R and SAS have a place in the larger world.
> John     
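
For readers unfamiliar with the matrix notation John refers to, here is a minimal sketch of a least squares fit written directly from the normal equations in R. The simulated data, variable names, and coefficients are assumptions made purely for illustration; nothing here comes from the original posts.

```r
# Simulated data; names and coefficients are illustrative assumptions only.
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)

# Design matrix with an intercept column
X <- cbind(1, x)

# Solve the normal equations (X'X) b = X'y in one line
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Cross-check against R's built-in linear model fit
fit <- lm(y ~ x)
agree <- isTRUE(all.equal(as.vector(beta_hat), unname(coef(fit))))
```

In practice `lm()` is preferable because it uses a QR decomposition, which is numerically more stable than forming X'X; the matrix version is shown only to illustrate how compact the notation is.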

John, I'll have to strongly disagree with most of your statement about 
data manipulation.  R is far more powerful, easier to debug dynamically, 
and more concise for merging, reshaping, recoding, etc.  But I agree on the 
"easily understood" portion of your statement.

Cheers
Frank
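
As a rough illustration of the merging, recoding, and reshaping operations mentioned above, the following sketch uses only base R. The data frames, column names, and cut points are hypothetical and were invented for this example; it is one possible way to write these steps, not the approach anyone in the thread necessarily used.

```r
# Hypothetical data frames; all names and values are invented for illustration.
demo <- data.frame(id = 1:4, age = c(34, 58, 71, 45))
labs <- data.frame(id    = c(1L, 2L, 2L, 4L),
                   visit = c(1L, 1L, 2L, 1L),
                   sbp   = c(118, 141, 135, 127))

# Merge: keep every subject, even those with no lab records (a left join)
d <- merge(demo, labs, by = "id", all.x = TRUE)

# Recode: bin a continuous variable into labelled categories
d$age_group <- cut(d$age, breaks = c(0, 40, 65, Inf),
                   labels = c("<=40", "41-65", ">65"))

# Reshape: one row per subject, one sbp column per visit (long to wide)
wide <- reshape(labs, idvar = "id", timevar = "visit", direction = "wide")
```

Each step is a single call, and each intermediate result (`d`, `wide`) can be inspected interactively, which is the kind of dynamic debugging referred to above.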

> 
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> 
>>>> Frank E Harrell Jr <f.harrell at vanderbilt.edu> 2/27/2009 12:52 PM >>>
> John Sorkin wrote:
>> Terry's remarks (see below) are well received; however, I take issue with one part of his comments. As a long-time programmer (in both "statistical" programming languages and "traditional" programming languages), I miss the ability to write native-language macros in R. While macros can make for difficult-to-read code, when used properly they can also make for flexible code that, if properly written (including good documentation, which should be a part of any code), can be easy to read.
>>
>> Finally, everyone must remember that SAS code can be difficult to understand or "inefficient" just as R code can be difficult to understand or "inefficient". In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is neither fair nor correct to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, while other things are more clearly and more easily done in another.  Let's move on to more important issues, viz. improving R so it is as good as it possibly can be.
>> John  
> 
> Nice points John.  My only response is that I learned SAS in 1969 and 
> used it intensively until 1991.  I wrote some of the first 
> user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, 
> PHGLM) and wrote extensively in the macro language.  After using S-Plus 
> for only one month my productivity was far ahead of my productivity 
> using SAS.
> 
> Frank
> 
>>   
>>
>>
>>>>> Terry Therneau <therneau at mayo.edu> 2/27/2009 10:23 AM >>>
>> Three comments
>>
>>  I actually think you can write worse code in R than in SAS: more tools = more 
>> scope for innovatively bad ideas.  The ability to write bad code should not damn 
>> a language.  
>>  
>>   I found almost all of the "improvements" to the multi-line SAS recode to be 
>> regressions, both the SAS and the S suggestions. 
>>     a. Everyone, even those of you with no SAS background whatsoever, immediately 
>> understood the code.  Most of the replacements are obscure.  Compilers are very 
>> good these days and computers are fast, fewer typed characters != better.
>>     b. If I were writing the S code for such an application, it would look much 
>> the same.  I worked as a programmer in medical research for several years, and 
>> one of the things that moved me on to graduate studies in statistics was the 
>> realization that doing my best work meant being as UN-clever as possible in my 
>> code.  
>>     
>>   Frank's comments imply that he was reading SAS macro code at the moment of 
>> peak frustration.  And if you want to criticise SAS code, this is the place to 
>> look.  SAS macro started out as some simple expansions, then got added on to, 
>> then added on again, and again, and ....  with no overall blueprint.  It is much 
>> like the farmhouse of some neighbors of mine growing up: 4 different expansions 
>> in 4 eras, and no overall guiding plan.  The interior layout was "interesting" 
>> to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better 
>> than me), and I can't read the stuff without grinding my teeth.
>>   S was once headed down the same road. One of the best things ever with the 
>> language was documented in the blue book "The New S Language", where Becker et 
>> al had the wisdom to scrap the macro processor.  
>>  
>>   	Terry Therneau
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help 
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



