[R] Excel

christian.ritter at shell.com
Wed Aug 29 09:15:55 CEST 2007


Hmm,

Excel bashing always brings joy ... but then again, it has a user community of more than 100 million people, and if one is careful one can do quite a few interesting things with Excel, in particular if it is extended by R (using R(D)COM by T Baier and RExcel by E Neuwirth).

Typically, people in the R community are not used to the spreadsheet paradigm and need some time to take advantage of automatic recalculation, cross tabulation (pivot tables), and automatic tabulation of nontrivial expressions (data tables). Many also do not know that many of the matrix calculations we commonly do in R can be carried out with array formulas in Excel (or Gnumeric, if you don't want to stay with a single spreadsheet). With a little experience one can program interactive tools directly in such a spreadsheet, for example a multiple ridge regression with variable selection and inclusion/exclusion of observations. Or one can just program the interface in the spreadsheet and have R do the calculations.
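To make the ridge regression remark concrete, here is a minimal sketch of the underlying matrix calculation, beta = (X'X + lambda*I)^-1 X'y. It is written in plain Python only to keep it self-contained; the function names, the toy data and the choice of lambda are my own assumptions, and there is no intercept or standardization. In a spreadsheet the same products and the inverse would typically be built from array formulas such as MMULT, TRANSPOSE and MINVERSE.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(X, y, lam):
    """Ridge coefficients (X'X + lam*I)^-1 X'y, without intercept."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Toy data (made up): y = 1*x1 + 2*x2 exactly.
X = [[1, 0], [0, 1], [1, 1]]
y = [1, 2, 3]

beta_ols = ridge(X, y, 0.0)    # lambda = 0 is ordinary least squares
beta_ridge = ridge(X, y, 1.0)  # lambda > 0 shrinks the coefficients
```

Excluding an observation amounts to dropping a row of X and y before the call, and excluding a variable amounts to dropping a column, which is the kind of switch one would wire to a cell in an interactive sheet.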

I think that any serious statistical consultant should be able to combine the power of a spreadsheet with that of a scripting language (and a relational database in addition). Excel is interesting in this context because it is so widely available, has a scripting language, and can be coupled with R. Partial coupling is also possible in Gnumeric under Linux but not yet under Windows (I asked for this a while ago, but as far as I know the scripting interface, which is based on Python, doesn't work yet). An equivalent of RExcel/R(D)COM is under development for Calc, the OpenOffice spreadsheet. So far, however, I have not been impressed by the quality of Calc (it can be very slow, hungry for memory, etc.).

Here are a few additional comments related to the representation issue in .csv files:
What was said about rounding in .csv files also holds for the Windows clipboard but not for the Office clipboard. If you format data in an Excel range, select this range, and paste it onto a different worksheet (within MS Office), the original representation is kept; that is, you can undo the formatting in the new copy. However, if you read the data into R using the clipboard as the data source, only the formatted version is transferred. I played a bit with the options and it really seems to be a clipboard implementation issue (a job for Microsoft). Any lobbying with MS to permit better access to the Office clipboard would be useful in this context.
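A small illustration of why this matters, as a sketch under my own assumptions (made-up numbers, a two-decimal display format, and plain Python standing in for the export and import steps): once only the displayed text of a cell is transferred, the digits behind the format cannot be recovered on the receiving side.

```python
def as_displayed(values, decimals):
    """Return the text a cell would show when formatted to a fixed number of decimals."""
    return [f"{v:.{decimals}f}" for v in values]


# Full-precision values as stored in the sheet (made-up numbers).
stored = [3.14159265, 2.71828183, 1.41421356]

# What travels through a .csv file or a text clipboard: the displayed strings only.
shown = as_displayed(stored, 2)

# What the receiving program can reconstruct from those strings.
read_back = [float(s) for s in shown]

# Largest absolute difference between stored and recovered values.
max_loss = max(abs(a - b) for a, b in zip(stored, read_back))
```

The loss is bounded by half a unit in the last displayed digit, which may or may not be harmless depending on what is done with the data afterwards.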

Have a nice day,

Chris

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Rolf Turner
Sent: Tuesday, 28 August, 2007 10:01 PM
To: J Dougherty
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Excel



On 28/08/2007, at 7:16 PM, J Dougherty wrote:

	<snip>

> PS, I quit using Excel for most important work after it returned a negative
> variance on some data I was collecting descriptive statistics on.

Those of you who have not seen it should have a look at Jonathan Cryer's
commentary on Excel, available at the URL:

		http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf

Executive summary:  Friends don't let friends use Excel for statistics.
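
For readers wondering how a variance can come out negative at all: one well-known mechanism is the one-pass "calculator" formula, which subtracts two huge, nearly equal sums and loses all significant digits when the data are large relative to their spread. The sketch below is only an illustration in Python with made-up data; it does not claim to reproduce what Excel did internally in the case quoted above.

```python
def variance_one_pass(xs):
    """Sample variance via (sum(x^2) - (sum x)^2 / n) / (n - 1); numerically fragile."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def variance_two_pass(xs):
    """Sample variance from deviations about the mean; much more stable."""
    n = len(xs)
    mean = 0.0
    for x in xs:
        mean += x
    mean /= n
    ss = 0.0
    for x in xs:
        ss += (x - mean) ** 2
    return ss / (n - 1)


# Made-up data: small spread on top of a large offset.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

fragile = variance_one_pass(data)  # negative in IEEE 754 double precision
stable = variance_two_pass(data)   # 30.0, the exact sample variance
```

The deviations from the mean here are -6, -3, 3 and 6, so the correct sample variance is 90/3 = 30; the one-pass result is off because each squared value near 1e18 is already rounded to a multiple of 128.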

			cheers,

				Rolf Turner



