[R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

Maciej Hoffman-Wecker Maciej.Hoffman-Wecker at bioskin.de
Tue Aug 14 13:33:27 CEST 2007


Dear Professor Ripley,

Thank you very much for your response. I send the problem, as I didn't have any more ideas were to search for the reason. I didn't say this is a R bug, knowing the responses on such mails.-) 

But I succeeded in developing a tiny example, that reproduces the bug (wherever it is).

I generated a small Access data base "test2.mdb" with one table "Tab1" and following columns:

"Field name"	"Field type"
          F1            Number
          NO            Number
          F2            Number

(sorry if the Access identifiers are not the standard ones, as I have a german Access version)

The content of the "Tab1" table is:

F1	NO	F2
1	1	1
2	2	2
0		1
1	0	0

(The column "NO" contains one missing)

Now if I import the data into R, I get the following results:

> library(RODBC)
> .con <- odbcConnectAccess("./test2.mdb")
> (.d <- try(sqlQuery(.con, "select * from Tab1")))
  F1 NO F2
1  1  1  1
2  2  2  2
3  0 NA  1
4  1  0  0
> (.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))
  F1 Expr1001 F2
1  1        0  1
2  2        0  2
3  0        0  1
4  1        0  0
> close(.con)

So the problem occurs if the column names are specified within the query.
Is the query "select F1 , NO , F2 from Tab1" invalid?

Regarding the memory issue, I _knew_ that there must be a reason for the running out of memory space. Sorry for not being more specific. My question than is:

Is there a way to 'reset' the environment without quitting R and restarting it?

Thank you for your help. 

Kind regards,
Maciej


-----Ursprüngliche Nachricht-----
Von: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
Gesendet: Dienstag, 14. August 2007 11:51
An: Maciej Hoffman-Wecker
Cc: r-help at stat.math.ethz.ch
Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

>
> Dear all,
>
> I have some problems with importing data from an Access data base via 
> RODBC to R. The data base contains several tables, which all are 
> imported consecutively. One table has a column with column name "NO". 
> If I run the code attached on the bottom of the mail I get no 
> complain, but the column name (name of the respective vector of the 
> data.frame) is "Expr1014" instead of "NO". Additionally the original 
> column (type
> "text") containes "0"s and missings, but the imported column contains 
> "0"s only (type "int"). If I change the column name in the Access data 
> base to "NOx", the import works fine with the right name and the same 
> data.
>
> Previously I generated a tiny Access data base which reproduced the 
> problem. To be on the safe site I installed the latest version (2.5.1) 
> and now the example works fine, but within my production process the 
> error still remaines. An import into excel via ODBC works fine.
>
> So there is no way to figure it out whether this is a bug or a
> feature.-)

It's most likely an ODBC issue, but you have not provided a reproducible example.

> The second problem I have is that when I rerun "rm(list = ls(all = 
> T)); gc()" and the import several times I get the following error:
>
> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1) 
> memory In addition: Warning messages:
> 1: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 2: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 3: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
> 4: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
>
> which is surprising to me, as the first two statements should delete 
> all

How do you _know _what they 'should' do?  That only deletes all objects in the workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy some more.


> objects and recover the memory. Is this only a matter of memory? Is
> there any logging that reduces the memory? Or is this issue connected to
> the upper problem?
>
> I added the code on the bottom - maybe there is some kind of misuse I
> lost sight of. Any hints are appreciated.
>
> Kind regards,
> Maciej
>
>> version
>               _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          5.1
> year           2007
> month          06
> day            27
> svn rev        42083
> language       R
> version.string R version 2.5.1 (2007-06-27)
>
>
> ## code
>
> get.table <- function(name, db, drop = NULL){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  ## exclude memo columns
>  .t <- try(sqlColumns(.con, name))
>  if(inherits(.t, "try-error")){close(.con); return(.t)}
>  .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
>  .t <- paste(.t, collapse = ",")
>  ## get table
>  .t <- paste("select", .t, "from", name)
>  .d <- try(sqlQuery(.con, .t), silent = T)
>  if(inherits(.d, "try-error")){close(.con); return(.d)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .d <- .d[!names(.d) %in% drop]
>  return(.d)
> }
>
> get.alltables <- function(db){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
>  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .tbls <- .tbls[-grep("^MSys", .tbls)]
>  .d <- lapply(seq(along = .tbls), function(.i){
>    .d <-
>      try(get.table(.tbls[.i], db = db))
>    return(invisible(.d))
>  })
>  names(.d) <- .tbls
>  .ok <- !sapply(.d, inherits, "try-error")
>  return(list(notdone = .d[!.ok], data = .d[.ok]))
> }
>
> library(RODBC)
>
> alldata <- get.alltables(db = "./myaccessdb.MDB")
>
> ## code end
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list