[R] Yet another question about importing Excel *.xls files

David Scott d.scott at auckland.ac.nz
Wed Jan 20 23:38:07 CET 2010


Jerry Floren wrote:
> Hi Gabor,
> 
> Thanks for your insights and suggestions. There was a post on the Wiki you
> mentioned that makes me think this will work. 
> 
> Unfortunately, the spreadsheet was designed to make it easy for lab staff to
> enter their results, but not so easy for another program to read in the
> data. That is why I copy and paste the original spreadsheet into a second
> spreadsheet with a separate page that could be read beginning at the top
> (A1) cell. Here is a link to the spreadsheet the labs use.
> 
> http://www.mda.state.mn.us/en/sitecore/content/Global/MDADocs/licensing/map/mapreportform.aspx 
> 
> A big problem with this format is that the lab's ID code ends up in each
> column while the data is organized by rows. Also, there are three different
> samples with three different replicates. The spreadsheet is set up so that
> the replicates are not identified. Even if the data had to be copied to a
> second spreadsheet (with the conversion page), it would be a significant
> improvement over the current method. I just learned that it is possible to
> hide a worksheet in Excel. (I do not want the lab staff to see the hidden
> worksheet as then the replicates can be identified).
> 
> The spreadsheets are e-mailed to me. I have to open them up and save them
> with a different file name. If the first worksheet is hidden, but has the
> data arranged properly starting in cell A1, it looks like the entire(hidden)
> worksheet be read in according to this from the Wiki:
> 
> "Entire Worksheets
> Use the following code to import in all of worksheet called “Sheet 1”. The
> hazard with this approach is that any and all data in that worksheet will be
> copied in, this includes data that are hidden or that you otherwise were not
> intending to bring in. 
> 
> library(RODBC)
>  
> MyExcelData <- sqlFetch(odbcConnectExcel("Test.xls"), 
>                         sqtable = "Sheet1", na.strings = "NA", as.is = T)
> odbcCloseAll()Caution
> Excel 2003 (and earlier?) use the first 0-16 rows to guess the data type.
> Consider a column of international postal codes where the first 20 rows
> contain 50010 and the next two rows contain 500A1 and 500E1. The value of
> ‘500A1’ is likely to be interpreted as a missing value and the value of
> ‘500E1’ may be interpreted as a numeric value that is in exponential format.
> More information can be found here:
> http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/.
> "
> 
> I'm going to play around with this and see if I can get it to work for my
> data.
> 
> Thanks,
> 
> Jerry Floren
> Minnesota Department of Agriculture
>  

As an alternative to Gabor's suggestions, I have always had excellent 
results from xlsReadWrite which is back on CRAN.

The author, Hans-Peter Suter, also offers xlsReadWritePro at 
http://www.swissr.org/ which can write native xls files with multiple 
sheets.

David Scott


-- 
_________________________________________________________________
David Scott	Department of Statistics
		The University of Auckland, PB 92019
		Auckland 1142,    NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:	d.scott at auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



More information about the R-help mailing list