[R] Factor vs character in a data.frame vs vector

John Kane jrkrideau at yahoo.ca
Sat Jul 8 15:46:58 CEST 2017


Clearly I have been doing something weird. Thanks.

> It is possible that somewhere along the way, you set options(stringsAsFactors = FALSE)

No, not in a long time. I found that it was great for any personal work, but any time I tried to use someone else's raw data, say from a text file, it would mess up the data, so I regretfully removed it.
I remember reading about the options(stringsAsFactors = TRUE) decision for read.table() but somehow missed or forgot that it also applied to data.frame(). It also still strikes me as a bit perverse.
Thanks again.



On Friday, July 7, 2017, 10:25:45 PM EDT, Marc Schwartz <marc_schwartz at me.com> wrote:


> On Jul 7, 2017, at 7:03 PM, John Kane <jrkrideau at yahoo.ca> wrote:
> 
> Thanks Marc.
> It never occurred to me that I would need a "stringsAsFactors" argument in data.frame().  I could have sworn I never did before when mocking up some data, but clearly I was wrong, or there has been a change in R v. 3.4.1, which seems unlikely.


Welcome John.

Going back to the old NEWS files, the 'stringsAsFactors' argument for data.frame() appears in version 2.4.0, which was released on 2006-10-03.

It is possible that somewhere along the way, you set options(stringsAsFactors = FALSE) in your .Rprofile, which would change the default behavior. I know that some folks do that, as they do not like the default coercion to factors, both for data.frame() and for the read.table() family.
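
As a minimal sketch of the point above (the object names d_fac and d_chr are made up for illustration): passing 'stringsAsFactors' explicitly makes the result independent of whatever the session option happens to be. The default has differed across R releases (it was changed to FALSE in R 4.0.0), so the explicit form is the safer one to rely on.

```r
# Explicit argument: the result does not depend on any session-wide option
d_fac <- data.frame(aa = letters[1:10], stringsAsFactors = TRUE)
d_chr <- data.frame(aa = letters[1:10], stringsAsFactors = FALSE)

class(d_fac$aa)  # factor column
class(d_chr$aa)  # character column

# Inspect the session-wide setting, e.g. to see whether .Rprofile changed it.
# getOption() returns NULL rather than failing if the option is not set.
getOption("stringsAsFactors")
```
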

Other alternatives would be to use the 'colClasses' argument of the read.table() family to explicitly set such columns to character, or to use I(...) in data.frame() to create AsIs class columns.
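
A rough illustration of these two alternatives, assuming a small inline text sample in place of a real file (dat2, dat3 and txt are hypothetical names):

```r
# I() marks the vector as "AsIs", so data.frame() does not convert it
# to a factor, whatever the stringsAsFactors setting is
dat2 <- data.frame(aa = I(letters[1:10]))
class(dat2$aa)         # carries the "AsIs" class
is.character(dat2$aa)  # the underlying data are still character

# colClasses is an argument of read.table() and friends, not of data.frame();
# here a short text string stands in for a file on disk
txt <- "aa\na\nb\nc"
dat3 <- read.table(text = txt, header = TRUE, colClasses = "character")
str(dat3)
```

Note that an I() column keeps the "AsIs" class, which some functions treat differently from a plain character vector, so stringsAsFactors = FALSE may be the simpler choice when a plain character column is wanted.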

Regards,

Marc


> 
> 
> 
> On Friday, July 7, 2017, 10:37:29 AM EDT, Marc Schwartz <marc_schwartz at me.com> wrote:
> 
> 
> 
> > On Jul 7, 2017, at 6:03 AM, John Kane via R-help <r-help at r-project.org> wrote:
> > 
> > This is not a serious problem but I just wonder if someone can explain what is happening.
> > The same command within a data frame is giving me a factor and as a plain vector is giving me a character.  It's probably something simple that I have read and forgotten but I thought I'd ask.
> > Thanks
> > 
> > #================================================
> > dat1 <- data.frame(aa = letters[1:10])
> > str(dat1)
> > 'data.frame':    10 obs. of  1 variable:
> > $ aa....letters.1.10.: Factor w/ 10 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10
> > #=============================================================
> > bb = letters[1:10]
> > str(bb)
> > chr [1:10] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> > #==============================================================
> > 
> 
> 
> See the 'stringsAsFactors' argument in ?data.frame.
> 
> dat1 <- data.frame(aa = letters[1:10], stringsAsFactors = FALSE)
> 
> 
> > str(dat1)
> 'data.frame':    10 obs. of  1 variable:
> $ aa: chr  "a" "b" "c" "d" ...
> 
> 
> Regards,
> 
> Marc Schwartz
> 
