[R] RODBC vs gdata

Enrico Schumann enricoschumann at yahoo.de
Mon Jan 9 21:28:01 CET 2012


You could prepare the data in Excel as text, and then coerce them to 
numeric in R (and approriately code your "FG 1" strings).

Depending on how large your file is, you could create a new file, format 
the cells as text, and then copy the data into this new file. Or change 
cell entries to text by prepending a single quote. For instance, "'100" 
(without the outer quotes) would be displayed as "100" in Excel, but not 
be evaluated as a number. (But I have used the latter approach only with 
Excel 2003.)

But as someone else has suggested: if you do not have to use Excel, the 
best thing is not to use it...

Regards,
Enrico

Am 09.01.2012 19:46, schrieb Christof Kluß:
> Hi Enrico,
>
> thank you very much, so it is a known problem with the Microsoft Excel
> ODBC drivers :(
>
> "7 Excel Drivers
> ... There are at least two known problems with reading columns that do
> not have a format set before data entry, and so start with format
> `General'. First, the driver uses the first few rows to determined the
> column type, and is over-fond of declaring `Numeric' even when there are
> non-numeric entries. ... Second, if a column is declared as `Text',
> numeric entries will be read as SQL nulls and hence R NAs.
> Unfortunately, in neither case does reformatting the column help."
>
> So I think I have to use "gdata" to be sure to read all datas.
>
> regards
> Christof
>
>
> Am 09-01-2012 19:29, schrieb Enrico Schumann:
>>
>> Hi Christof,
>>
>> have a look at the manual of RODBC, and in particular the section on
>> Excel drivers.
>>
>> RShowDoc("RODBC", package="RODBC")
>>
>> Regards,
>> Enrico
>>
>>
>> Am 09.01.2012 19:02, schrieb Christof Kluß:
>>> Hi
>>>
>>> one col in my Excel file contains many numbers. But on line 3000 and
>>> some other lines are strings like "FG 1". "RODBS" seems to omit this
>>> lines. "gdata" works, but is much slower.
>>>
>>> Is this a bug of RODBC or do I apply it wrong?
>>>
>>> Example with the same "file.xlsx"
>>>
>>>
>>> library(RODBC); excel<- odbcConnectExcel2007("file.xlsx")
>>> tab<- sqlQuery(excel, 'select * from "Table 1$"'); str(tab)
>>>
>>> col1: num  1 2 3 4 5 6 7 8 9 10 ...
>>>
>>> library(gdata); tab<- read.xls("file.xlsx", sheet=1); str(tab)
>>>
>>> col1: Factor w/ 51 levels "1","10","11",..: 1 12 23 34 41 42 43...
>>>
>>>
>>> greeting
>>> Christof
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/



More information about the R-help mailing list