[R] Yet another question about importing Excel *.xls files

Jerry Floren jerry.floren at state.mn.us
Wed Jan 20 21:56:09 CET 2010


Hi Gabor,

Thanks for your insights and suggestions. There was a post on the Wiki you
mentioned that makes me think this will work. 

Unfortunately, the spreadsheet was designed to make it easy for lab staff to
enter their results, but not so easy for another program to read in the
data. That is why I copy and paste the original spreadsheet into a second
spreadsheet with a separate page that could be read beginning at the top
(A1) cell. Here is a link to the spreadsheet the labs use.

http://www.mda.state.mn.us/en/sitecore/content/Global/MDADocs/licensing/map/mapreportform.aspx 

A big problem with this format is that the lab's ID code ends up in each
column while the data is organized by rows. Also, there are three different
samples with three different replicates. The spreadsheet is set up so that
the replicates are not identified. Even if the data had to be copied to a
second spreadsheet (with the conversion page), it would be a significant
improvement over the current method. I just learned that it is possible to
hide a worksheet in Excel. (I do not want the lab staff to see the hidden
worksheet as then the replicates can be identified).

The spreadsheets are e-mailed to me. I have to open them up and save them
with a different file name. If the first worksheet is hidden, but has the
data arranged properly starting in cell A1, it looks like the entire(hidden)
worksheet be read in according to this from the Wiki:

"Entire Worksheets
Use the following code to import in all of worksheet called “Sheet 1”. The
hazard with this approach is that any and all data in that worksheet will be
copied in, this includes data that are hidden or that you otherwise were not
intending to bring in. 

library(RODBC)
 
MyExcelData <- sqlFetch(odbcConnectExcel("Test.xls"), 
                        sqtable = "Sheet1", na.strings = "NA", as.is = T)
odbcCloseAll()Caution
Excel 2003 (and earlier?) use the first 0-16 rows to guess the data type.
Consider a column of international postal codes where the first 20 rows
contain 50010 and the next two rows contain 500A1 and 500E1. The value of
‘500A1’ is likely to be interpreted as a missing value and the value of
‘500E1’ may be interpreted as a numeric value that is in exponential format.
More information can be found here:
http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/.
"

I'm going to play around with this and see if I can get it to work for my
data.

Thanks,

Jerry Floren
Minnesota Department of Agriculture
 
-- 
View this message in context: http://n4.nabble.com/Yet-another-question-about-importing-Excel-xls-files-tp1018699p1018743.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list