[R] Workaround for RODBC asymmetric numeric data treatment

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Oct 3 22:02:13 CEST 2014


On 03/10/2014 20:18, Rui Barradas wrote:
> Hello,
>
> Inline
>
> Em 03-10-2014 19:04, Bos, Roger escreveu:
>> Andrew,
>>
>> I ran your code using my SQL Server database and it seems like it
>> worked okay for me, in that I end up with "num" data types when I read
>> the data back in.  So it may be a setting on your database.  I don't
>> claim to know which one.
>>
>> BTW, I had to install 5 or 6 separate packages to get fPortfolio to
>> load.  Anyone know why
>> install.packages("fPortfolio",repos="http://R-Forge.R-project.org")
>> can't install all the dependencies automatically?
>
> Try
>
> install.packages(..., dependencies = TRUE)

You don't need that unless you want all the Suggests recursively.

The problem is that he set repos to just one repoository, rather than 
using setRepositories().  I guess some of the unstated dependencies are 
on CRAN not R-forge.  Let's see:

 > setRepositories(graphics=FALSE)
--- Please select repositories for use in this session ---


1: + CRAN
2:   BioC software
3:   BioC annotation
4:   BioC experiment
5:   BioC extra
6:   CRAN (extras)
7:   Omegahat
8:   R-Forge
9:   rforge.net

Enter one or more numbers separated by spaces, or an empty line to cancel
1: 1 8
 > install.packages('fPortfolio')
Installing package into ‘/Users/ripley/R/Library’
(as ‘lib’ is unspecified)
also installing the dependencies ‘cubature’, ‘mvtnorm’, ‘mnormt’, 
‘numDeriv’, ‘bitops’, ‘stabledist’, ‘gss’, ‘fMultivar’, ‘sn’, 
‘DEoptimR’, ‘truncnorm’, ‘XMLRPC’, ‘RCurl’, ‘XML’, ‘timeDate’, 
‘timeSeries’, ‘fBasics’, ‘fAssets’, ‘fCopulae’, ‘robustbase’, ‘Rglpk’, 
‘slam’, ‘Rsymphony’, ‘Rsolnp’, ‘kernlab’, ‘quadprog’, ‘rneos’
....

>
> Hope this helps,
>
> Rui Barradas
>>
>> Thanks,
>>
>> Roger
>>
>>
>>> library(RODBC)
>>
>>> library(fPortfolio)
>> Loading required package: timeSeries
>>
>> Attaching package: ‘timeSeries’
>>
>> The following object is masked from ‘package:zoo’:
>>
>>      time<-
>>
>> Loading required package: fBasics
>>
>>
>> Rmetrics Package fBasics
>> Analysing Markets and calculating Basic Statistics
>> Copyright (C) 2005-2014 Rmetrics Association Zurich
>> Educational Software for Financial Engineering and Computational Science
>> Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
>> https://www.rmetrics.org --- Mail to: info at rmetrics.org
>>
>> Attaching package: ‘fBasics’
>>
>> The following object is masked from ‘package:TTR’:
>>
>>      volatility
>>
>> Loading required package: fAssets
>> Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()),
>> versionCheck = vI[[j]]) :
>>    there is no package called ‘DEoptimR’
>> Error: package ‘fAssets’ could not be loaded
>>> source(.trPaths[5], echo=TRUE, max.deparse.length=150)
>>
>>> library(RODBC)
>>
>>> library(fPortfolio)
>> Loading required package: fAssets
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘sn’
>> Error: package ‘fAssets’ could not be loaded
>>> library(RODBC)
>>> library(fPortfolio)
>> Loading required package: fAssets
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘sn’
>> Error: package ‘fAssets’ could not be loaded
>>> library(timeSeries)
>>> head(SWX.RET$SBI)
>> Error in head(SWX.RET$SBI) :
>>    error in evaluating the argument 'x' in selecting a method for
>> function 'head': Error: object 'SWX.RET' not found
>>> library(fPortfolio)
>> Loading required package: fAssets
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘sn’
>> Error: package ‘fAssets’ could not be loaded
>>> library(fPortfolio)
>> Loading required package: fAssets
>>
>>
>> Rmetrics Package fAssets
>> Analysing and Modeling Financial Assets
>> Copyright (C) 2005-2014 Rmetrics Association Zurich
>> Educational Software for Financial Engineering and Computational Science
>> Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
>> https://www.rmetrics.org --- Mail to: info at rmetrics.org
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘slam’
>> Error: package or namespace load failed for ‘fPortfolio’
>>> library(fPortfolio)
>> Package Rsolnp (1.14) loaded.  To cite, see citation("Rsolnp")
>>
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘kernlab’
>> Error: package or namespace load failed for ‘fPortfolio’
>>> library(fPortfolio)
>> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
>> vI[[i]]) :
>>    there is no package called ‘rneos’
>> Error: package or namespace load failed for ‘fPortfolio’
>>> library(fPortfolio)
>>
>>
>> Rmetrics Package fPortfolio
>> Portfolio Optimization
>> Copyright (C) 2005-2014 Rmetrics Association Zurich
>> Educational Software for Financial Engineering and Computational Science
>> Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
>> https://www.rmetrics.org --- Mail to: info at rmetrics.org
>>> head(SWX.RET$SBI)
>> [1] -0.0020881194 -0.0001045205 -0.0013597617  0.0004185852
>> 0.0000000000 -0.0010467917
>>> str(SWX.RET$SBI)
>>   num [1:1916] -0.002088 -0.000105 -0.00136 0.000419 0 ...
>>> source(.trPaths[5], echo=TRUE, max.deparse.length=150)
>>
>>> mydata<-as.timeSeries(SWX.RET)
>>
>>> head(mydata)
>> GMT
>>                       SBI          SPI            SII
>> LP25         LP40         LP60
>> 2000-01-04 -0.0020881194 -0.034390059  0.00001367381 -0.011994298
>> -0.018013035 -0.026155259
>> 2000-01-05 -0.0001045205 -0.010408271 -0.00495530624 -0.003657054
>> -0.005837489 -0.009011403
>> 2000-01-06 -0.0013597617  0.012119128  0.00381289851 -0.001323897
>> -0.001644737 -0.002395959
>> 2000-01-07  0.0004185852  0.022461656 -0.00061621046  0.007714991
>> 0.011660151  0.017062613
>> 2000-01-10  0.0000000000  0.002107677  0.00238057889  0.003029081
>> 0.004565523  0.006948020
>> 2000-01-11 -0.0010467917 -0.002773654 -0.00029384531 -0.002422531
>> -0.003142903 -0.004183466
>>> source(.trPaths[5], echo=TRUE, max.deparse.length=150)
>>
>>> df2beSavedByRODBC =as.data.frame(mydata)
>>
>>> str(df2beSavedByRODBC)
>> 'data.frame':   1916 obs. of  6 variables:
>>   $ SBI : num  -0.002088 -0.000105 -0.00136 0.000419 0 ...
>>   $ SPI : num  -0.03439 -0.01041 0.01212 0.02246 0.00211 ...
>>   $ SII : num  0.0000137 -0.0049553 0.0038129 -0.0006162 0.0023806 ...
>>   $ LP25: num  -0.01199 -0.00366 -0.00132 0.00771 0.00303 ...
>>   $ LP40: num  -0.01801 -0.00584 -0.00164 0.01166 0.00457 ...
>>   $ LP60: num  -0.02616 -0.00901 -0.0024 0.01706 0.00695 ...
>>> sqlSave(xf,
>>> dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE)
>>>
>>> sqlString = "select * from testTable"
>>> dataFrameFromDB = sqlQuery(xf, sqlString,errors=TRUE);
>>> str(dataFrameFromDB)
>> 'data.frame':   1916 obs. of  7 variables:
>>   $ rownames: chr  "2000-01-04" "2000-01-05" "2000-01-06" "2000-01-07"
>> ...
>>   $ SBI     : num  -0.002088 -0.000105 -0.00136 0.000419 0 ...
>>   $ SPI     : num  -0.03439 -0.01041 0.01212 0.02246 0.00211 ...
>>   $ SII     : num  0.0000137 -0.0049553 0.0038129 -0.0006162 0.0023806
>> ...
>>   $ LP25    : num  -0.01199 -0.00366 -0.00132 0.00771 0.00303 ...
>>   $ LP40    : num  -0.01801 -0.00584 -0.00164 0.01166 0.00457 ...
>>   $ LP60    : num  -0.02616 -0.00901 -0.0024 0.01706 0.00695 ...
>>
>>
>>
>>
>> ***************************************************************
>> This message and any attachments are for the named person's use only.
>> This message may contain confidential, proprietary or legally privileged
>> information. No right to confidential or privileged treatment
>> of this message is waived or lost by an error in transmission.
>> If you have received this message in error, please immediately
>> notify the sender by e-mail, delete the message, any attachments and all
>> copies from your system and destroy any hard copies. You must
>> not, directly or indirectly, use, disclose, distribute,
>> print or copy any part of this message or any attachments if you are not
>> the intended recipient.
>>
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Andrew
>> Sent: Friday, October 03, 2014 1:40 PM
>> To: r-help at r-project.org
>> Subject: [R] Workaround for RODBC asymmetric numeric data treatment
>>
>> Note: I did raise report the issue below to   r-sig-db at r-project.org,
>> but didn't see any reply.
>> I'm hoping somebody on r-help can help me devise a workaround for a
>> problem I'm having with RODB:
>>
>>
>> I use RODBC to read and write a good deal of data to SQL Server and
>> I'd be extremely grateful
>> if anyone has found a workaround in order to be able to write
>> dataframes to SQL Server
>> using RODBC dynamically created SQL tables and read the data from
>> those tables, or indeed any
>> arbitrary SQL Server table with "float" datatypes and end up with
>> numeric columns instead of "factor" columns
>> in a dataframe in R.
>>
>>
>> I have found that when RODBC creates a Microsoft SQL Server data table
>> from a dataFrame using sqlSave(....append=FALSE),
>> RODBC uses the SQL "float" datatype to store R numeric data in a
>> dynamically-created table on the server.
>>
>> However, when RODBC reads any SQL Server "float" datatype from SQL
>> Server via sqlQuery it interprets float columns as "factor" data.
>>
>>
>> I created a standalone sample below to demonstrate the odd behavior of
>> RODBC that I hope to overcome:
>>
>> # Assuming the reader has access to SQL Server the code below is
>> self-contained and repeatable
>>
>> # I believe it demonstrates unexpected and undesirable behavior in RODBC
>>
>>
>> library(RODBC)
>> library(fPortfolio)
>> library(timeSeries)
>> head(SWX.RET$SBI)
>> str(SWX.RET$SBI)
>> mydata<-as.timeSeries(SWX.RET)
>> head(mydata)
>>
>> df2beSavedByRODBC =as.data.frame(mydata)
>>
>> str(df2beSavedByRODBC)
>>
>> # shows the numeric data in the dataframe
>> #
>> # data.frame':  1916 obs. of  6 variables:
>> #   $ SBI : num  -0.002088 -0.000105 -0.00136 0.000419 0 ...
>> # $ SPI : num  -0.03439 -0.01041 0.01212 0.02246 0.00211 ...
>> ...
>>
>>
>> # Let's save the dataframe to SQL Server:
>>
>> dbconn<-odbcDriverConnect(connection="Driver={SQL
>> Server};server=_YOURSERVERNAMEHER_;database=_YOURDBNAME_;Trusted_Connection=True;");
>>
>> sqlSave(channel=dbconn,dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE)
>>
>>
>> # The sqlSave above works very well.  The new table is create in the
>> Microsoft SQL database and the ddl for the table is:
>> #
>> #     [dbo].[testTable](
>> #       [rownames] [varchar](255) NULL,
>> #     [SBI] [float] NULL,
>> #     [SPI] [float] NULL,
>> #     [SII] [float] NULL,
>> #     [LP25] [float] NULL,
>> #     [LP40] [float] NULL,
>> #     [LP60] [float] NULL
>> #     )
>>
>>
>> # The numeric values from the dataframe are stored as float (i.e.
>> numeric) in SQL server -- good!
>>
>> ## now let's read back the data RODBC stored in SQL server from a SQL
>> table RODBC created:
>>
>>
>> sqlString = "select * from testTable"
>>
>>
>> dataFrameFromDB = sqlQuery(dbconn,sqlString,errors=TRUE);
>>
>> str(dataFrameFromDB)
>>
>> #
>> # 'data.frame':  1916 obs. of  7 variables:
>> # $ rownames: Factor w/ 1916 levels "2000-01-04","2000-01-05",..: 1 2
>> 3 4 5 6 7 8 9 10 ...
>> # $ SBI     : Factor w/ 1742 levels "-0.00041080415489958",..: 349 42
>> 161 1418 828 48 49 1419 1024 135 ...
>> # $ SPI     : Factor w/ 1848 levels "-0.0020169904194276",..: 445 48
>> 970 883 1187 377 1157 1065 951 1840 ...
>> ...
>>
>> #*********   RODBC wrote numeric data to SQL Server as float, but read
>> the same data back as Factor !  ********
>>
>>
>> I could use some help to create a robust and flexible workaround for
>> RODBC's asymmetric treatment of numeric data.
>> If there were some way to force RODBC sqlQuery to interpret all SQL
>> Server float datatypes as numeric my problem would be solved.
>> FWIW:  RODBC does interpret the SQL Server "real" datatype as numeric.
>>
>>
>> Thank you,
>>
>> Andrew
>>
>>          [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK



More information about the R-help mailing list