[R] SAS or R software

Fernando Henrique Ferraz P. da Rosa feferraz at ime.usp.br
Sat Nov 20 14:33:42 CET 2004


neela v writes:
> Hi all there
>  
> Can some one clarify me on this issue, features wise which is better R or SAS, leaving the commerical aspect associated with it. I suppose there are few people who have worked on both R and SAS and wish they would be able to help me in deciding on this.
>  
> THank you for the help
> 

        I'm definitely biased towards R, but I'll try to play devil's
advocate and point out some advantages of SAS.

        - There's a huge collection of data-manipulation features. You
          can parse all sorts of weird files using DATA/INPUT statements.
You can reshape, merge, combine, summarize, define and propagate missing
observations, (...) using the numerous features of the 'SAS DATA Step
Language', as the manuals call it. Sure, there's much of it you can do
in plain R, but R comes from the Unix tradition of specific tools for
specific tasks, so it doesn't try to be everything to everyone. In
order to match SAS's DATA Step Language using R, you would have to use
languages better suited to such tasks, like Perl/awk for parsing and a
SQL database for storing and reshaping. These tools integrate nicely
with R, and if you can really exploit their potential, R + Perl/awk + SQL
will definitely surpass SAS's data-manipulation features.

        - There are some statistical analyses which are more completely
          implemented in SAS. PROC VARCOMP comes to mind (I know you can
use lme for mixed models, but you have to use a cumbersome syntax when
the random effects are crossed rather than nested). Sure, there's
nothing stopping you from using R and adapting it to your needs, or
implementing the analysis you need yourself. While at first it will
take you more time to actually think about the problem and how to
implement it, the flexibility you'll gain will probably pay off.
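
        To make the first point (data manipulation) a bit more concrete,
here is a minimal sketch of the "scripting language for parsing + SQL
database for storing and reshaping" combination. It uses Python and
SQLite only as stand-ins for Perl/awk and a SQL database; the
fixed-width layout, the sample records, the table name 'obs' and the
function names are all made up for illustration, with "." marking a
missing value as in SAS.

```python
import sqlite3

# Hypothetical fixed-width records (assumed layout, not from any real file):
# columns 1-4 subject id, 5-6 visit code, 7-11 measurement ("." = missing).
RAW = """\
0001v1  5.2
0001v2  6.1
0002v1    .
0002v2  4.8
"""


def parse_line(line):
    """Slice one fixed-width record into (subject, visit, value)."""
    subject = line[0:4]
    visit = line[4:6]
    field = line[6:11].strip()
    value = None if field == "." else float(field)  # "." becomes NULL
    return subject, visit, value


def load(raw):
    """Parse the raw text and store the records in an in-memory SQL table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE obs (subject TEXT, visit TEXT, value REAL)")
    rows = [parse_line(line) for line in raw.splitlines() if line.strip()]
    con.executemany("INSERT INTO obs VALUES (?, ?, ?)", rows)
    return con


def wide(con):
    """Reshape long to wide: one row per subject, one column per visit."""
    return con.execute(
        "SELECT subject, "
        "MAX(CASE WHEN visit = 'v1' THEN value END), "
        "MAX(CASE WHEN visit = 'v2' THEN value END) "
        "FROM obs GROUP BY subject ORDER BY subject"
    ).fetchall()


def means(con):
    """Summarize by visit; SQL's AVG skips NULL (missing) values."""
    return con.execute(
        "SELECT visit, AVG(value) FROM obs GROUP BY visit ORDER BY visit"
    ).fetchall()


if __name__ == "__main__":
    con = load(RAW)
    print(wide(con))
    print(means(con))
```

The resulting table could then be pulled into R through a database
interface for the statistical analysis itself; the parsing and
reshaping stay in the tools better suited to them.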

        Summarizing, I think the main difference between SAS and R is
their philosophy and how it is reflected in their implementations. While
SAS aims to "(...) provide data analysts one system to meet all their
computing needs*", R tries to be the best tool for the specific task of
statistical analysis. SAS's approach allows you to learn a single
language and use it for almost all the computing needs you'll have. While
it seems appealing at first sight, after you reach a certain level of
proficiency the lack of flexibility of this approach starts to limit
what you can actually do. You'll be locked inside what you can do using
the PROC/DATA procedures provided by SAS; if you want to implement new
analyses, you will have a hard time. R's approach, on the other hand,
may seem harder at first, as you'll have to learn (if you don't know
some of them already, that is) specific languages for certain tasks,
such as* LaTeX or HTML for reports, SQL databases, Perl/awk for data
parsing, C/C++/Fortran for implementing high-performance functions, etc.
You'll be highly compensated, though, by the gains in flexibility and
productivity. Not to mention that the more proficiency you have in R and
the other tools of your choice, the more flexibility and power you'll
have at hand to implement new analyses, manipulate data, create reports
dynamically and so on.

        It boils down to how much time/effort you are willing to spend
and for how long you're going to be using the languages. If you prefer
to spend a great amount of money and less time up front, by learning
only one language, and don't mind being limited in the future in the
range of things you'll be able to do, SAS is the appropriate choice. If
you can spare the time to learn R and a set of appropriate tools for
each task you'll want to do, it'll take you more time at first, but in
the end you'll have much more power and flexibility at hand.



* SAS User's Guide: Basics.
* Strictly speaking you don't have to learn any of those. You can get
  along well using plain R in the beginning, but in order to exploit the
  power of its approach, you'll find yourself needing to use one or
  more of them.


--
Fernando Henrique Ferraz P. da Rosa
http://www.ime.usp.br/~feferraz



