[R] RODBC and NULL values

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Oct 17 12:18:00 CEST 2006


What sqltype(s) are your variables?

For numeric types, RODBC merely maps values the ODBC driver says are NULL 
to NA.  Since you appear not to have character data,

nullstring: character string to be used when reading 'SQL_NULL_DATA'
           character items from the database.

na.strings: character string(s) to be mapped to 'NA' when reading
           character data.

are not relevant to you.

At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works 
correctly, and NULLs in numeric columns are mapped to NAs in R.  (There is 
an example in my test suite.)

On Tue, 17 Oct 2006, Mark Wardle wrote:

> Dear All,
>
> Writing sooner than I thought I'd need to.
>
> I'm using R 2.4 on Mac OS X, with RODBC, PostgreSQL 8.1 and Actual's
> ODBC driver. I have all my data in Filemaker 8.5, but it is
> automatically exported into PostgreSQL for analysis as Filemaker's ODBC
> and JDBC access is awful, slow and has a tendency to crash.
>
> I have disability data where for each patient there is a survival time
> in years from disease onset to a particular disease stage, namely
> unilateral support, bilateral support, wheelchair use, and death. Valid
> values may include NULL (patient hasn't reached that stage), 0 (for
> example, patient needed support immediately at disease onset), and any
> positive integer.
>
> When I query the database manually using psql, it is clear there are
> NULL values.
>        3 |         3 |         18 |       |       27 |        1
>          |           |            |       |       13 |        1
>        1 |         5 |            |       |       10 |        0
>       10 |        13 |         13 |       |       22 |        0

No, it is not clear.  It is clear that there are values which are printed 
as blank or empty strings.

> However, these are all converted to zeros when I use RODBC's sqlQuery(),
> making interpretation impossible. I have tried using the nullstring and
> na.strings options, but these don't seem to have any effect. I have
> tried various combinations of NULL, NA and "". Forgive my awkward SQL.
>
>> channel = odbcConnect("ataxia", uid="mark")
>> disease = sqlQuery(channel, "select calc_survival_unilateral_support
> as unlateral, calc_survival_bilateral_support as bilateral,
> calc_survival_wheelchair as wheelchair,calc_survival_death as death,
> calc_follow_up as followup, has_family_history_ataxia as familial from
> clinical, patient where clinical.patient_fk = patient_id and excluded=0
> and calc_walking_disability_valid=1")
>> disease   # and show results
>
> 127        3         3         18     0       27        1
> 128        0         0          0     0       13        1
> 129        1         5          0     0       10        0
> 130       10        13         13     0       22        0
>
> It doesn't seem to be the old repeating rows NULL bug talked about <a
> href="http://tolstoy.newcastle.edu.au/R/help/04/07/0803.html">here</a>.

That was about R 1.9.1, about a problem solved long before then.  Let's 
not drag up ancient history ....

> Is this because my ODBC driver is not returning the correct values for
> RODBC to parse? Is there anyway of debugging this (the intricacies of
> ODBC are beyond my skill) and is my only alternative to store a
> non-valid number in the database (999?) and use my query or R to remove
> those datapoints afterwards?

Find out what the types involved are.  Perhaps try as.is=FALSE?

> Looking in the archives, there are lots of people asking about how to
> convert NAs to numeric, but I want the NAs passed through unaltered!

Since the mapping of NULLs to NAs works in other examples, I find it hard 
to see how this can be an RODBC issue.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list