[R] R Function question, (repost to fix the messy work format)

Eric Berger er|cjberger @end|ng |rom gm@||@com
Fri Jul 2 21:56:44 CEST 2021


Definitely doable and you are on the right path and maybe even close.
The error message you got showed your query as having the wrong info after
the 'FROM'
keyword

<SQL> ' SELECT * FROM c("BIODBX.MECCUNIQUE2", "BIODBX.QDATA_HTML_DUMMY",
"BIODBX.SET_ITEMS", "BIODBX.SET_NAMES", "dbo.sysdiagrams",
"GEMD.ASSAY_DEFINITIONS",

etc

i.e you are passing the full vector of table names and not a single one
(the first being the one specified in your working example)

You have to figure out how to correctly set up your loop on this vector of
table names.
The debugger - browser() - is your friend! Learn how to use it, as I
suggested before.



...



On Fri, Jul 2, 2021 at 10:47 PM Kai Yang <yangkai9999 using yahoo.com> wrote:

> Hi Eric,
>
> Thank you spent time to help me for this.
>
>
> Here is the thing: I was requested to manage a sql server for my group.
> the server has many schemas and the tables (>200). I use ODBC to connect
> the server and get the schema name + table name into a data frame.
>
>
> For each of schema + table on server, I need to run a summary report. So I
> wrote a summary script like this:
>
>
> res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
>
> view(dfSummary(res), file =
> "W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
>
> rm(res)
>
>
> the script works well. but I don't want to write 200+ times of the script
> to summary each table. So, I'm trying to write the function to do this.
> this is my goal.
>
>
> First of all, I'm not sure if this is the right way to do the summary
> report, because I'm a new R user. So please correct me if my idea is doable.
>
>
> Second, would you please tell me what is "more detail" information do you
> need?
>
>
> Thank you,
>
> Kai
>
>
>
> On Friday, July 2, 2021, 12:31:17 PM PDT, Eric Berger <
> ericjberger using gmail.com> wrote:
>
>
> Hard for me to tell without more details but it looks like the following
> has several bugs
>
> for (i in dbtable$Tot_table)
> {
>   Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable",
> i)))
>   summ(Tabname)
> }
>
> Your sprintf() statement seems to use 'i' but actually does not.
> You probably want to rewrite/rearrange this code. More like
>
> x <- sqldf("SELECT Tot_table FROM dbtable")
> for ( Tabname in x )
> summ(Tabname)
>
> no doubt this is wrong but put a browser() call after the x <- sqldf(...)
> line and inspect x and go from there
>
>
>
>
> On Fri, Jul 2, 2021 at 10:20 PM Kai Yang <yangkai9999 using yahoo.com> wrote:
>
> Hello Eric,
>
> Following your suggestion, I modified the code as:
>
> summ <- function(Tabname){
>
>   query <- sprintf(" SELECT * FROM %s",Tabname)
>
>   res <- dbGetQuery(con, query)
>
>   view(dfSummary(res), file =
> "W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
>
>   rm(res)
>
> }
>
>
> for (i in dbtable$Tot_table)
>
> {
>
>   Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable",
> i)))
>
>   summ(Tabname)
>
> }
>
> after submitted the work, I got the error message below:
>
>
>  Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for
> SQL Server][SQL Server]Invalid object name 'c'.  [Microsoft][ODBC Driver 17
> for SQL Server][SQL Server]Statement(s) could not be prepared.
> <SQL> ' SELECT * FROM c("BIODBX.MECCUNIQUE2", "BIODBX.QDATA_HTML_DUMMY",
> "BIODBX.SET_ITEMS", "BIODBX.SET_NAMES", "dbo.sysdiagrams",
> "GEMD.ASSAY_DEFINITIONS", "GEMD.ASSAY_DISCRETE_VALUES",
> "GEMD.ASSAY_QUESTIONS", "GEMD.ASSAY_RUNS", "GEMD.BIODBX_DATABASE_SEED",
> "GEMD.BIODBX_USER_SEEDS", "GEMD.BIODBX_USERS", "GEMD.DATA_ENTRY_PAGES",
> "GEMD.DISC_SESSION_QID", "GEMD.DISC_SESSION_STATUS",
> "GEMD.DISC_SESSION_TYPE", "GEMD.DISCREPANCIES",
> "GEMD.DISCREPANCY_QUERY_TEMP", "GEMD.DISCRETE_VALUES",
> "GEMD.ENTERED_DATA_ENTRY_PAGES", "GEMD.ENTRY_GROUPS",
> "GEMD.ExportSampleListNames", "GEMD.FORM_STATUS_BY_SUBJECT",
> "GEMD.GEMD_CODELIST_GROUPS", "GEMD.GEMD_CODELIST_VALUES",
> "GEMD.GEMD_LOT_DEFINITIONS", "GEMD.GEMD_SAMPLES", "GEMD.GEMD_STUDIES",
> "GEMD.MECCUNIQUE", "GEMD.MECCUNIQUE2", "GEMD.MISSING_DI
>
>
> One more question,  in the code of "*view(dfSummary(res), file =
> "W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")*",
>
> can Tabname part be replacted automatic also?
>
> Thank you,
>
> Kai
> On Friday, July 2, 2021, 12:06:12 PM PDT, Eric Berger <
> ericjberger using gmail.com> wrote:
>
>
> Modify the summ() function to start like this
>
> summ <- function(Tabname){
>    query <- sprintf(" SELECT * FROM %s",Tabname)
>   res <- dbGetQuery(con, query)
>
> etc
>
> HTH,
> Eric
>
> On Fri, Jul 2, 2021 at 9:39 PM Kai Yang via R-help <r-help using r-project.org>
> wrote:
>
> Hello List,
>
> The previous post look massy. I repost my question. Sorry,
>
>
> I need to generate summary report for many tables (>200 tables). For each
> table, I can use the script to generate report:
> res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
> view(dfSummary(res), file =
> "W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
> rm(res)
> BIODBX.MECCUNIQUE2 is the name of table.
>
> I have all of tables' name in a data frame. So, I'm trying to write a
> function to do this:
> summ <- function(Tabname){
>   res <- dbGetQuery(con, "SELECT * FROM Tabname")
>   view(dfSummary(res), file =
> "W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
>   rm(res)
> }
> for (i in dbtable$Tot_table)
> {
>   Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable",
> i)))
>   summ(Tabname)
> }
>
> 1. I created  a function summ, the argument is Tabname. I put the Tabname
> in the function. I hope it can be replaced one by one
> 2. the table dbtable contents all tables' name (>200 rows), the field name
> is Tot_table
> 3. I want use "for" to establish a loop, which can automatic generate a
> summary report for each table
>
> but I got error message below:
>  Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for
> SQL Server][SQL Server]Invalid object name 'Tabname'.
> [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) could
> not be prepared.
>
> <SQL> 'SELECT * FROM Tabname'
> 10. stop(structure(list(message = "nanodbc/nanodbc.cpp:1655: 42000:
> [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid
> object name 'Tabname'.  [Microsoft][ODBC Driver 17 for SQL Server][SQL
> Server]Statement(s) could not be prepared. \n<SQL>
> 'SELECT * FROM Tabname'",
>     call = NULL, cppstack = NULL), class = c("odbc::odbc_error",
> "C++Error", "error", "condition")))
> 9.new_result(connection using ptr, statement, immediate)
> 8.OdbcResult(connection = conn, statement = statement, params = params,
>  immediate = immediate)
> 7..local(conn, statement, ...)
> 6.dbSendQuery(conn, statement, params = params, ...)
> 5.dbSendQuery(conn, statement, params = params, ...)
> 4..local(conn, statement, ...)
> 3.dbGetQuery(con, "SELECT * FROM Tabname")
> 2.dbGetQuery(con, "SELECT * FROM Tabname")
> 1.summ(Tabname)
>
> it seems the tables' name is not successfully pass into query. can someone
> give me an instruction for this?
> many thanks,
> Kai
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list