[R] Data type problem when extract data from SQLite to R by using RSQLite

chen jia chen_1002 at fisher.osu.edu
Mon Feb 28 21:37:42 CET 2011


Hi there,

When I extract data from SQLite to R, the data types (or modes) of the
extracted data seems to be determined by the value of the first row.
Please see the following example.

When I put the missing values first, the column extracted is of the
mode character.
> str(dbGetQuery(sql.industry,
+                "select pya_var from annual_data3
+                 order by pya_var"))
'data.frame':   155465 obs. of  1 variable:
 $ pya_var: chr  NA NA NA NA ...

When I put the non-missing values first, the column extracted is of
the mode numeric.
> str(dbGetQuery(sql.industry,
+                "select pya_var from annual_data3
+                 order by pya_var desc"))
'data.frame':   155465 obs. of  1 variable:
 $ pya_var: num  2.23 2.08 2.04 2 1.86 ...

And, if the missing value happens to be the first observation, the
whole column is converted to the mode character.
> str(dbGetQuery(sql.industry,
+                "select pya_var from annual_data3"))
'data.frame':   155465 obs. of  1 variable:
 $ pya_var: chr  NA "0.0" "0.000532137017747267" "0.00655147489334259" ...

This creates a problem because this column "pya_var" is supposed to be
numeric. If it is converted to character, I will encounter errors in
subsequent computation. Although I can convert the data back to
numeric, but it will be a hassle to check whether it is necessary and
which columns to convert every time I extract data from SQLite into R.

Do you have any idea how I can make sure the data type stays numeric
regardless what value the first observation is? Thanks.

Best,
Jia

-- 
700 Fisher Hall
2100 Neil Ave.
Columbus, Ohio  43210
http://www.fisher.osu.edu/~chen_1002/



More information about the R-help mailing list