[Rd] should `data` respect default.stringsAsFactors()?

Michael Nelson michael.nelson at sydney.edu.au
Fri Feb 19 05:58:56 CET 2016


As Peter pointed out, data() does several things. It loads data sets from packages, and various file formats are supported. The package author(s) decide how best to ship (and load) any such data.


When you call `data(iris)`, it loads iris as it is defined in the datasets package.

The definition can be seen here:

https://github.com/wch/r-source/blob/trunk/src/library/datasets/data/iris.R

You will note that Species is explicitly defined as a factor there. Because iris.R is a .R file, it is not read in by read.table at all; it is "source()d" instead, so stringsAsFactors never comes into play.
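
To illustrate the difference, here is a small sketch. The names toy, rfile and tabfile are made up for the example and are not anything shipped with R. It assumes that data() evaluates a .R file with something like sys.source(), and that (as I read ?data) .tab files are passed to read.table(..., header = TRUE, as.is = FALSE), which would explain why the option has no effect in that case either.

```r
# Set the option under discussion, remembering the previous value.
old <- options(stringsAsFactors = FALSE)

# A hypothetical stand-in for a packaged .R data file such as iris.R.
rfile <- tempfile(fileext = ".R")
writeLines(c(
  "toy <- data.frame(",
  "  Sepal.Length = c(5.1, 7.0, 6.3),",
  "  Species = factor(c('setosa', 'versicolor', 'virginica')))"
), rfile)

# Evaluate the file in its own environment, much as data() would.
env <- new.env()
sys.source(rfile, envir = env)
class(env$toy$Species)   # "factor", because the file itself calls factor()

# The same data written out as text and read back two ways.
tabfile <- tempfile(fileext = ".tab")
write.table(env$toy, tabfile)
class(read.table(tabfile, header = TRUE)$Species)                 # "character"
class(read.table(tabfile, header = TRUE, as.is = FALSE)$Species)  # "factor"

# Restore the previous setting.
options(old)
```

So a plain read.table() call follows the option, while a .R file (or a call with as.is = FALSE) does not.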


Michael 




________________________________________
From: R-devel [r-devel-bounces at r-project.org] on behalf of Cook, Malcolm [MEC at stowers.org]
Sent: Friday, 19 February 2016 11:03 AM
To: 'peter dalgaard'
Cc: r-devel at stat.math.ethz.ch
Subject: Re: [Rd] should `data` respect default.stringsAsFactors()?

Hi Peter,

Sorry if I was not clear.  Perhaps an example will make my point:

> data(iris)
> class(iris$Species)
[1] "factor"
> write.table(iris,'data/myiris.tab')
> data(myiris)
> class(myiris$Species)
[1] "factor"
> rm(myiris)
> options(stringsAsFactors = FALSE)
> data(myiris)
> class(myiris$Species)
[1] "factor"
> myiris<-read.table("data/myiris.tab",header=TRUE)
> class(myiris$Species)
[1] "character"

I am surprised to find that in the above
        setting the global option stringsAsFactors = FALSE does NOT affect how Species is read in by the `data` function
whereas
        setting the global option stringsAsFactors = FALSE DOES affect how Species is read in by read.table

especially since `data` is documented as calling read.table.

In my opinion, one or the other should change (the behavior of data, or the documentation).

<bleep> <bleep>,

~ Malcolm


 > -----Original Message-----
 > From: peter dalgaard [mailto:pdalgd at gmail.com]
 > Sent: Thursday, February 18, 2016 3:32 PM
 > To: Cook, Malcolm <MEC at stowers.org>
 > Cc: r-devel at stat.math.ethz.ch
 > Subject: Re: [Rd] should `data` respect default.stringsAsFactors()?
 >
 > What the <bleep> are you on about? data() does many things, only some of
 > which call read.table() et al., and the ones that do have no special treatment
 > of stringsAsFactors.
 >
 > -pd
 >
 > > On 18 Feb 2016, at 21:25 , Cook, Malcolm <MEC at stowers.org> wrote:
 > >
 > > Hiya,
 > >
 > > Probably been debated elsewhere....
 > >
 > > I note that R's `data` function does not respect default.stringsAsFactors
 > >
 > > By my lights, it should, especially as it is documented to call read.table, which DOES respect.
 > >
 > > Oh, but:  http://r.789695.n4.nabble.com/stringsAsFactors-FALSE-tp921891p921893.html
 > >
 > > Compelling.  I have to agree.
 > >
 > > So, I change my mind.
 > >
 > > By my lights, `data` should then be documented to NOT respect default.stringsAsFactors.
 > >
 > > Else?
 > >
 > > ~Malcolm Cook
 > >
 > > ______________________________________________
 > > R-devel at r-project.org mailing list
 > > https://stat.ethz.ch/mailman/listinfo/r-devel
 >
 > --
 > Peter Dalgaard, Professor,
 > Center for Statistics, Copenhagen Business School
 > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 > Phone: (+45)38153501
 > Office: A 4.23
 > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
 >
