[R] Yet another question about importing Excel *.xls files

Gabor Grothendieck ggrothendieck at gmail.com
Wed Jan 20 22:25:42 CET 2010


Try this:

U <- "http://www.mda.state.mn.us/en/sitecore/content/Global/MDADocs/licensing/map/mapreportform.aspx"
library(gdata)
DF <- read.xls(U, pattern = "SAMPLE", as.is = TRUE)

and now write an R program to create the desired data frame from DF.


On Wed, Jan 20, 2010 at 3:56 PM, Jerry Floren <jerry.floren at state.mn.us> wrote:
>
> Hi Gabor,
>
> Thanks for your insights and suggestions. There was a post on the Wiki you
> mentioned that makes me think this will work.
>
> Unfortunately, the spreadsheet was designed to make it easy for lab staff to
> enter their results, but not so easy for another program to read in the
> data. That is why I copy and paste the original spreadsheet into a second
> spreadsheet with a separate page that could be read beginning at the top
> (A1) cell. Here is a link to the spreadsheet the labs use.
>
> http://www.mda.state.mn.us/en/sitecore/content/Global/MDADocs/licensing/map/mapreportform.aspx
>
> A big problem with this format is that the lab's ID code ends up in each
> column while the data is organized by rows. Also, there are three different
> samples with three different replicates. The spreadsheet is set up so that
> the replicates are not identified. Even if the data had to be copied to a
> second spreadsheet (with the conversion page), it would be a significant
> improvement over the current method. I just learned that it is possible to
> hide a worksheet in Excel. (I do not want the lab staff to see the hidden
> worksheet as then the replicates can be identified).
>
> The spreadsheets are e-mailed to me. I have to open them up and save them
> with a different file name. If the first worksheet is hidden, but has the
> data arranged properly starting in cell A1, it looks like the entire(hidden)
> worksheet be read in according to this from the Wiki:
>
> "Entire Worksheets
> Use the following code to import in all of worksheet called “Sheet 1”. The
> hazard with this approach is that any and all data in that worksheet will be
> copied in, this includes data that are hidden or that you otherwise were not
> intending to bring in.
>
> library(RODBC)
>
> MyExcelData <- sqlFetch(odbcConnectExcel("Test.xls"),
>                        sqtable = "Sheet1", na.strings = "NA", as.is = T)
> odbcCloseAll()Caution
> Excel 2003 (and earlier?) use the first 0-16 rows to guess the data type.
> Consider a column of international postal codes where the first 20 rows
> contain 50010 and the next two rows contain 500A1 and 500E1. The value of
> ‘500A1’ is likely to be interpreted as a missing value and the value of
> ‘500E1’ may be interpreted as a numeric value that is in exponential format.
> More information can be found here:
> http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/.
> "
>
> I'm going to play around with this and see if I can get it to work for my
> data.
>
> Thanks,
>
> Jerry Floren
> Minnesota Department of Agriculture
>
> --
> View this message in context: http://n4.nabble.com/Yet-another-question-about-importing-Excel-xls-files-tp1018699p1018743.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list