[Rd] Excel -> *.CSV in Unix (Linux) command line?

Kurt Hornik Kurt.Hornik@ci.tuwien.ac.at
Mon, 20 Aug 2001 18:38:51 +0200


>>>>> Martin Maechler writes:

> Thanks go to Brian Ripley, John Chambers, Tony Rossini, Setzer Woodrow,
> Dirk Eddelbuettel, Uwe Ligges, Detlef Steuer, and Kurt Hornik (in
> "historical" order) who have answered very helpfully!

> I was interested primarily in "commandline" solutions inside Unix/Linux,
> and hence not ODBC inside Windows.  The other answers fall into three groups:

> 1) newer Perl versions (newer than Redhat 6.2 or Debian potato) have modules
>    Spreadsheet::ParseExcel and, built on that, DBD::Excel
>    {as "libspreadsheet-parseexcel-perl" and "libdbd-excel-perl" in Debian,
>     thanks to Dirk}.

>    Dirk has also written a Perl script "xls2csv" built on the above, which
>    he sent to Brian Ripley, who forwarded it to me.
>    Dirk, maybe you'd post the latest version of that here as well, or give us
>    a pointer where that is available?

> 2) There's a GPL package "catdoc" with primary aim of 
>    "cat *.doc files" (i.e. `look at' M$-Word files).  The catdoc package is
>    written in C and also has an "xls2csv" (same name as above script!) program.
>    catdoc is in Debian or available at  http://www.fe.msk.ru/~vitus/catdoc/
>    Looking at that webpage, however, seems to indicate that development
>    somehow stopped at the end of 1999 (a "2k" bug ? :-).

> I've not yet compared the two xls2csv versions now potentially available
> to me, one reason being that I'd have to upgrade perl (or Redhat) first.
> At the moment, I'd tend to use the perl based solution.

> 3) John Chambers mentioned Duncan's  `R Gnumeric' interface --- the only
>    R-related answer--- at http://www.omegahat.org/RGnumeric/  
>    It is fairly new, but I'll definitely be looking at it as well.

>>>>> "MM" == Martin Maechler <maechler@stat.math.ethz.ch> writes:

MM> A colleague has a dozen Excel sheets and also expects to get
MM> updates regularly.  He could open these in M$-Excel and export as
MM> *.csv manually, "bring back to Unix" and then read into R.  Of
MM> course there must be options to start programming this in something
MM> like Visual Basic, but we wouldn't really want to...

MM> We also know that probably Gnumeric could do the job (since the xls
MM> files are said to be simple ...), but as far as I know that also
MM> works only manually (and via the GUI).  "gnumeric --help" does not
MM> suggest a command line version.  Are there other options (i.e. free
MM> software programs) to use from the command line (i.e. also in a
MM> shell script or from inside R)?

MM> Since this is not really an R topic, please reply to my e-mail
MM> only, not to R-help, and I'll summarize.  Thanks in advance!

> Thanks a lot, once more!

Martin,

One remark.  If we think we want to be able to import Excel data
directly into R, then none of these solutions will do, I think: the
xls2csv program from (2) is not good enough, and the other solutions
require tools that may not be available.  (Effort is currently being put
into removing the run-time and even build-time dependencies on Perl and
shell tools, and using R scripts instead.)
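
To make the dependency problem concrete, here is a minimal sketch of the
external-tool route, written in Python purely for illustration.  It assumes
a converter command that takes one file name as its argument and writes CSV
to standard output (which is how I understand the catdoc xls2csv to behave;
the interface of Dirk's script may differ).  The function name convert_all
and its arguments are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def convert_all(xls_dir, csv_dir, converter="xls2csv"):
    """Convert every *.xls file in xls_dir to a CSV file in csv_dir.

    Assumption: `converter` is a command that accepts a single file name
    and writes the CSV text to standard output.
    """
    # This lookup is the weak point: the external tool may simply
    # not be installed on the machine.
    exe = shutil.which(converter)
    if exe is None:
        raise FileNotFoundError(f"{converter!r} not found on PATH")

    out_dir = Path(csv_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # The glob is case-sensitive on Unix, so "*.XLS" files are not matched.
    for xls in sorted(Path(xls_dir).glob("*.xls")):
        target = out_dir / (xls.stem + ".csv")
        with open(target, "wb") as out:
            # check=True raises CalledProcessError if the converter fails.
            subprocess.run([exe, str(xls)], stdout=out, check=True)
        written.append(target)
    return written
```

Each resulting file could then be read with read.csv() in R.  The whole
thing fails at the first step wherever the converter is missing, which is
exactly the objection above.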

I think it would be nice if package `foreign' had at least limited
support for reading in .xls files ...

-k