[R] RODBC, sqlQuery with NA:s

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Apr 3 10:39:18 CEST 2007


On Tue, 3 Apr 2007, Lauri Nikkinen wrote:

> Thanks Prof Ripley for your answer. In fact, there is no NULL's in these
> varchar variables, that is because one of the software's we use do not
> accept NULL's (that is strange, isn't it?).

It's not uncommon.  (Be grateful you are not on MacOS, where it seems 
quite common to accept NULLs but not treat them as different from "".)

> So, for instance, when I type
>
> select * from table1
> where var1=''
>
> I get all those rows where the cells of var1 are empty. So, is there any
> possibility to tell R that it should convert these empty cells to NA:s (when
> importing)?

Yes, via na.strings="".

>
> Thanks for your help
> Lauri
>
>
> 2007/4/3, Prof Brian Ripley <ripley at stats.ox.ac.uk>:
>> 
>> In my attempts to reproduce this both approaches worked.
>> 
>> There seems to be something going on we cannot see (and I don't believe
>> NULL in the database is being mapped to " ").  For example, might there be
>> non-breaking spaces (character A0 in Windows 1252, your most likely
>> character set) in the database?
>> 
>> On Mon, 2 Apr 2007, Lauri Nikkinen wrote:
>> 
>> > Hi R-users,
>> >
>> > I'm trying to retrieve data from MS SQL database with RODBC's
>> > sqlQuery-function:
>> >
>> > temp <- sqlQuery(channel,"select *, (select text from table1 where
>> > koodi='paa' and koodi2=paa) as tempor from table2")
>> >
>> > str(temp)
>> >
>> > ?
>> > $ var0        : num  NA NA 1.6 NA NA 1.4 NA 1 NA NA ...
>> > $ var1          : Factor w/ 45 levels " ","01","01;07",..: 1 1 1 4 38 1
>> 38 7
>> > 1 1 ...
>> > $ var2        : Factor w/ 95 levels " ","01","01;02;09",..: 1 1 1 9 72 1
>> 78
>> > 13 1 1 ...
>> > $ var3      : Factor w/ 104 levels " ","01","01;02;09",..: 1 1 1 44 82 1
>> 87
>> > 14 1 1 ...
>> > ?
>> >
>> > Data types of these variables in the MS SQL database are:
>> >
>> > var0: float, NULL
>> > var1: varchar, NULL
>> > var2: varchar, NULL
>> > var3: varchar, NULL
>> >
>> > As you can see from above, I get variable var0 with NA:s because of the
>> data
>> > type (there is NULL in the database if there is no observation in the
>> > specific cell). How can I get NA:s to these Factor variables also? I
>> have
>> > tried the following with no positive results:
>> >
>> > temp <- sqlQuery(channel,"select *, (select text from table1 where
>> > koodi='paa' and koodi2=paa) as tempor from table2", na.strings=" ")
>> >
>> > and also afterwards
>> >
>> > temp$var1[temp$var1 == " "] <- "NA"
>> >
>> > Regards,
>> > Lauri
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> >
>> 
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list