[Rd] should `data` respect default.stringsAsFactors()?

Joshua Ulrich josh.m.ulrich at gmail.com
Fri Feb 19 01:39:32 CET 2016


On Thu, Feb 18, 2016 at 6:03 PM, Cook, Malcolm <MEC at stowers.org> wrote:
> Hi Peter,
>
> Sorry if I was not clear.  Perhaps an example will make my point:
>
>> data(iris)
>> class(iris$Species)
> [1] "factor"
>> write.table(iris,'data/myiris.tab')
>> data(myiris)
>> class(myiris$Species)
> [1] "factor"
>> rm(myiris)
>> options(stringsAsFactors = FALSE)
>> data(myiris)
>> class(myiris$Species)
> [1] "factor"
>> myiris<-read.table("data/myiris.tab",header=TRUE)
>> class(myiris$Species)
> [1] "character"
>
> I am surprised to find that in the above
>           setting the global option stringsAsFactors = FALSE does NOT affect how Species is being read in by the `data` function
> whereas
>         setting the global option stringsAsFactors = FALSE DOES affect how Species is being read in by read.table
>
> especially since data is documented as calling read.table.
>
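To be explicit, it's documented as calling read.table(..., header =
TRUE) in this case, but it actually calls read.table(..., header =
TRUE, as.is = FALSE). An explicit as.is = FALSE replaces read.table's
default of as.is = !stringsAsFactors, so the stringsAsFactors setting
has no effect and class(myiris$Species) is "factor".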

R> myiris<-read.table("data/myiris.tab",header=TRUE,as.is=FALSE)
R> class(myiris$Species)
[1] "factor"

So it seems like adding as.is = FALSE to the call in the documentation
would clear this up.
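
A minimal sketch of the difference, assuming a temporary file instead
of the data/ directory used above (the path and the names by_arg and
forced are illustrative only, not taken from data()'s source). It
passes stringsAsFactors explicitly so the result does not depend on
the session's global option:

```r
# Write iris to a temporary file (hypothetical location for this sketch).
tab <- tempfile(fileext = ".tab")
write.table(iris, tab)

# as.is is left at its default (!stringsAsFactors), so
# stringsAsFactors = FALSE keeps Species as character.
by_arg <- read.table(tab, header = TRUE, stringsAsFactors = FALSE)
class(by_arg$Species)

# as.is = FALSE is given explicitly, mirroring the call described above,
# so Species is converted to a factor despite stringsAsFactors = FALSE.
forced <- read.table(tab, header = TRUE, as.is = FALSE,
                     stringsAsFactors = FALSE)
class(forced$Species)
```

The second call should report "factor", which matches what data()
returns for myiris regardless of the option.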

> In my opinion, one or the other should change (the behavior of data, or the documentation).
>
> <bleep> <bleep>,
>
> ~ Malcolm
>
>
>  > -----Original Message-----
>  > From: peter dalgaard [mailto:pdalgd at gmail.com]
>  > Sent: Thursday, February 18, 2016 3:32 PM
>  > To: Cook, Malcolm <MEC at stowers.org>
>  > Cc: r-devel at stat.math.ethz.ch
>  > Subject: Re: [Rd] should `data` respect default.stringsAsFactors()?
>  >
>  > What the <bleep> are you on about? data() does many things, only some of
>  > which call read.table() et al., and the ones that do have no special treatment
>  > of stringsAsFactors.
>  >
>  > -pd
>  >
>  > > On 18 Feb 2016, at 21:25 , Cook, Malcolm <MEC at stowers.org> wrote:
>  > >
>  > > Hiya,
>  > >
>  > > Probably been debated elsewhere....
>  > >
>  > > I note that R's `data` function does not respect default.stringsAsFactors
>  > >
>  > > By my lights, it should, especially as it is documented to call read.table, which DOES respect.
>  > >
>  > > Oh, but:  http://r.789695.n4.nabble.com/stringsAsFactors-FALSE-tp921891p921893.html
>  > >
>  > > Compelling.  I have to agree.
>  > >
>  > > So, I change my mind.
>  > >
>  > > By my lights, `data` should then be documented to NOT respect default.stringsAsFactors.
>  > >
>  > > Else?
>  > >
>  > > ~Malcolm Cook
>  > >
>  > > ______________________________________________
>  > > R-devel at r-project.org mailing list
>  > > https://stat.ethz.ch/mailman/listinfo/r-devel
>  >
>  > --
>  > Peter Dalgaard, Professor,
>  > Center for Statistics, Copenhagen Business School
>  > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>  > Phone: (+45)38153501
>  > Office: A 4.23
>  > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>  >
>



-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
R/Finance 2016 | www.rinfinance.com


