[R] Lahman Baseball Data Using R DBI Package

Bill Dunlap w||||@mwdun|@p @end|ng |rom gm@||@com
Thu Oct 8 18:26:29 CEST 2020


This is really a feature of SQL, not R.  SQL requires that you double quote
column names that start with numbers, include spaces, etc., or that are SQL
key words.  E.g.,

> d <- data.frame(Order=c("sit","stay","heel"),
Where=c("here","there","there"), From=c("me","me","you"))
> sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"")
Error: near "Order": syntax error
> sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where
\"From\"=\"me\"")
  Order Where From
1   sit  here   me
2  stay there   me

You may as well double quote all column names in SQL queries.

-Bill

On Wed, Oct 7, 2020 at 9:57 PM William Michels <wjm1 using caa.columbia.edu>
wrote:

> Hi Philip,
>
> You've probably realized by now that R doesn't like column names that
> start with a number. If you try to access an R-dataframe column named
> 2B or 3B with the familiar "$" notation, you'll get an error:
>
> > library(DBI)
> > library(RSQLite)
> > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> > Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID =
> 2018 AND AB >600 ORDER BY AB DESC")
> > Hack12Batting$AB
>  [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
> > Hack12Batting$3B
> Error: unexpected numeric constant in "Hack12Batting$3"
>
> How to handle? You can rename columns on-the-fly by piping. See
> reference [1] and use either library(magrittr) or library(dplyr) or a
> combination thereof:
>
> library(magrittr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))
>
> #OR one of the following:
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))
>
> Best, Bill.
>
> W. Michels, Ph.D.
>
> [1]
> https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames
>
>
>
>
>
>
>
>
>
>
> On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap <williamwdunlap using gmail.com>
> wrote:
> >
> > The double quotes are required by SQL if a name is not of the form
> > letter-followed-by-any-number-of-letters-or-numbers or if the name is a
> SQL
> > keyword like 'where' or 'select'.  If you are doing this from a function,
> > you may as well quote all the names.
> >
> > -Bill
> >
> > On Fri, Oct 2, 2020 at 6:18 PM Philip <herd_dog using cox.net> wrote:
> >
> > > The \”2B\” worked.  Have no idea why.  Can you point me somewhere that
> can
> > > explain this to me.
> > >
> > > Thanks,
> > > Philip
> > >
> > > *From:* Bill Dunlap
> > > *Sent:* Friday, October 2, 2020 3:54 PM
> > > *To:* Philip
> > > *Cc:* r-help
> > > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
> > >
> > > Have you tried putting double quotes around 2B and 3B:  "...2B, 3B,
> ..."
> > > -> "...\"2B\",\"3B\",..."?
> > >
> > > -Bill
> > >
> > > On Fri, Oct 2, 2020 at 3:49 PM Philip <herd_dog using cox.net> wrote:
> > >
> > >> I’m trying to pull data from one table (batting) in the Lahman
> Baseball
> > >> database.  Notice X2B for doubles and X3B for triples – fourth and
> fifth
> > >> from the right.
> > >>
> > >> The dbGetQuery function runs fine when I leave there two out but I get
> > >> error messages (in red) when I include 2B/3B or X2B/X3B.
> > >>
> > >> Can anyone give me some direction?
> > >>
> > >> Thanks,
> > >> Philip Heinrich
> > >>
> > >>
> ***************************************************************************************************************************************************
> > >> tail(dbReadTable(Lahman,"batting"))
> > >>
> > >> ID                       playerID      yearID    stint teamID team_ID
> > >> lgID   G        G_batting   AB     R     H       X2B    X3B   HR
>  RBI   SB
> > >> 107414 107414 yastrmi01      2019       1       SFN       2920
> > >> NL     107        NA          371   64  101      22       3     21
> > >> 55    2
> > >> 107416 107416 yelicch01      2019        1       MIL       2911
> > >> NL     130        NA          489 100  161      29       3     44
> 97   30
> > >> 107419 107419 youngal01     2019       1       ARI        2896
> > >> NL       17        NA            25     1      1        0       0
> > >> 0      0     0
> > >> 107420 107420 zagunma01   2019        1      CHN       2901      NL
> > >> 30        NA             36     2      9        3       0      0
> > >> 5     0
> > >> 107422 107422 zavalse01      2019        1      CHA       2900
> > >> AL        5        NA             12     1      1        0       0
> > >> 0       0     0
> > >> 107427 107427 zimmery01     2019        1      WAS      2925      NL
> > >> 52        NA           171   20    44        9      0      6      27
>    0
> > >> 107428 107428 zobribe01       2019        1      CHN      2901
> > >> NL      47        NA           150   24   39         5      0      1
> > >> 17     0
> > >> 107429 107429 zuninmi01       2019        1      TBA       2922
> > >> AL      90        NA           266    30  44       10      1      9
> > >> 32     0
> > >>
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,2B,3B,HR,
> > >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >>                             WHERE yearID = 2018 AND AB >99")
> > >> Error: unrecognized token: "2B"
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,X2B,X3B,HR,
> > >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >>                             WHERE yearID = 2018 AND AB >99")
> > >> Error: no such column: X2B
> > >>
> > >>         [[alternative HTML version deleted]]
> > >>
> > >> ______________________________________________
> > >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list