[R] Reasons to Use R

Wensui Liu liuwensui at gmail.com
Wed Apr 11 23:08:45 CEST 2007


I think Stata is fast because it keeps only one working table in RAM.
If you keep just one data frame in R, it will run fast too. But ...

On 4/11/07, Robert Duval <rduval at gmail.com> wrote:
> So I guess my question is...
>
> Is there any hope of R being modified at its core to handle large
> datasets more gracefully? (You've mentioned SAS and SPSS, I'd
> add Stata to the list).
>
> Or should we (the users of large datasets) expect to keep on working
> with the present tools for the foreseeable future?
>
> robert
>
> On 4/11/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> > On Wed, 2007-04-11 at 11:26 -0500, Marc Schwartz wrote:
> > > On Wed, 2007-04-11 at 17:56 +0200, Bi-Info
> > > (http://members.home.nl/bi-info) wrote:
> > > > > I certainly have that impression too. SPSS works in much the same way,
> > > > > although it specialises in PC applications. Adding memory to a PC is
> > > > > not very expensive these days. On my first AT, some extra memory
> > > > > cost 300 dollars or more. These days you get extra memory with a package
> > > > > of marshmallows or chocolate bars if you need it.
> > > > > All computations on a computer are discrete steps in a way, but I've
> > > > > heard that SAS computations are split into strictly separated steps. That
> > > > > also makes procedures "attachable", I've been told, and interchangeable.
> > > > > Different procedures can share the same code, which in turn is
> > > > > cheaper in memory or disk usage (the old days...). That makes SAS,
> > > > > by the way, a complicated machine to build, because procedures that are
> > > > > split up into numerous fragments require complicated bookkeeping. If
> > > > > you do it that way, I've been told, you can do a lot of computations
> > > > > with very little memory. One guy actually computed quite complicated
> > > > > models with "only 32MB or less", which wasn't very much for "his type of
> > > > > calculations". So SAS is efficient in memory handling, I
> > > > > think. It's not very efficient in dollar handling... I estimate.
> > > >
> > > > Wilfred
> > >
> > > <snip>
> > >
> > > Oh....SAS is quite efficient in dollar handling, at least when it comes
> > > to the annual commercial licenses...along the same lines as the
> > > purported efficiency of the U.S. income tax system:
> > >
> > >   "How much money do you have?  Send it in..."
> > >
> > > There is a reason why SAS is the largest privately held software company
> > > in the world and it is not due to the academic licensing structure,
> > > which constitutes only about 12% of their revenue, based upon their
> > > public figures.
> >
> > Hmmm......here is a classic example of the problems of reading pie
> > charts.
> >
> > The figure I quoted above, which is from reading the 2005 SAS Annual
> > Report on their web site (such as it is for a private company), comes
> > from a 3D exploded pie chart (ick...).
> >
> > The pie chart uses 3 shades of grey and 5 shades of blue to
> > differentiate 8 market segments and their percentages of total worldwide
> > revenue.
> >
> > I mis-read the 'shade of grey' allocated to Education as being 12%
> > (actually 11.7%).
> >
> > A re-read of the chart, zooming in close on the pie in a PDF reader,
> > appears to actually show that Education is but 1.8% of their annual
> > worldwide revenue.
> >
> > Government-based installations, which are presumably the other notable
> > market segment in which substantially discounted licenses are provided,
> > account for 14.6%.
> >
> > The report is available here for anyone else curious:
> >
> >   http://www.sas.com/corporate/report05/annualreport05.pdf
> >
> > Somebody needs to send SAS a copy of Tufte or Cleveland.
> >
> > I have to go and rest my eyes now...  ;-)
> >
> > Regards,
> >
> > Marc
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


