[R] exporting tables from an access database using parallel foreach

Rainer Hurling rhurlin at gwdg.de
Sun Nov 22 07:43:01 CET 2015


Am 22.11.15 um 02:38 schrieb John McKown:
> On Sat, Nov 21, 2015 at 11:55 AM, Vivek Sutradhara <viveksutra at gmail.com>
> wrote:
> 
>> Hi John and Jeff,
>> Thanks a lot for your help. I agree that row numbers are not a standard
>> feature in SQL. What I am looking for is some kind of a hack. After all,
>> the sqlFetch command is able to return a specific number of rows. And the
>> sqlFetchMore command is able to take up the baton from that row onwards to
>> futher return rows corresponding to the max parameter.
>>
>> I wonder if it is possible to straight away hop to a certain row number
>> (without going through sqlfetch and sqlFetchMore and without loading any
>> data into memory) and then return the contents corresponding to a certain
>> number of rows. The question is : is there a "catch" for accessing a row
>> location, and what could be the "hook" for that? I am interested in the the
>> recent updated rows to a table after a certain date. Is it possible to
>> identify them in a quick way? Running sql queries on such large tables
>> appears to take too long a time.
>>
>>  I understand that there is no provision to do this by available methods.
>> But, is it possible to get under the hood and find some hack?
>>
> 
> ​Now you're talking about the internals of Microsoft Access. And you're
> _way_ beyond my knowledge. Is there such knowledge? I sure there is. But,
> unfortunately, once you get into that depth, you can get into real trouble
> when (not if) MS decides to change the internals out from under you without
> any warning at all. If you are really needing this, try looking the the
> "MDB Tools" software at either https://github.com/brianb/mdbtools or
> http://mdbtools.sourceforge.net/​ I don't think this does exactly what you
> want, but it may give you the information you need to read the MDB file
> yourself directly in R code. <shudder/>

The mdb-tools give direct access to the physical mdb files, lying around
in a filesystem. A database file xxx.mdb has not to be 'active' within a
MS Access 'Server', to read in its contents via mdb-tools. The idea
behind is, that one should be able to read the contents of mdb files,
even when there is no MS Access you can connect to and/or when no
Windows installation is running.

In my knowledge, mdb-tools is not available for Windows platforms, only
for Unix alikes and Linux, maybe OSX.

The R package 'Hmisc' is able to use mdb-tools, if they are also present
on that system.

Unfortunately, it seems, that mdb-tools also has no direct way to select
rows by their number.

> 
> ​What you would really want is something like the ROWID in SQLite. That is
> a "system" maintained column in every table in SQLite. It is a 64-bit
> unique number. Basically, it starts at 1 and increments every time you add
> a new row.
> 
> What would be "best", IMO, would be if you could alter your Access database
> to have a "serial" column which would be your "row number" You could then
> get "directly" there by using a SELECT similar to:
> 
> SELECT * FROM table WHERE serial BETWEEN (first-row,last-row)
> 
> 
> 
>>
>> Jeff, I will take your suggestion and try my luck at the R-sig-db mailing
>> list.
>> Thanks,
>> Vivek
>>
>>
>>



More information about the R-help mailing list