[R] factor always have type integer

Gabor Grothendieck ggrothendieck at myway.com
Thu Sep 9 06:51:04 CEST 2004


Note that I(v2) keeps v2 stored as type character, but its class
becomes "AsIs" rather than "character".  For example,

R> DF <- data.frame(x = c("a", "b"), y = I(c("a", "b")), z = I(c("a", "b")))
R> class(DF$z) <- "character"

R> sapply(DF, typeof) # y and z do have the same type
          x           y           z 
  "integer" "character" "character" 

R> sapply(DF, class)  # but the 3 columns have different classes
          x           y           z 
   "factor"      "AsIs" "character" 

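To make the type/class distinction concrete, here is a small sketch of
what a factor looks like underneath.  It passes stringsAsFactors = TRUE
explicitly, which is an assumption about the reader's R version: the
example above relies on the default conversion of character columns to
factors, and from R 4.0.0 on that default is FALSE, so x would otherwise
stay character.  The names DF2 and codes are only for illustration.

```r
# Ask for the factor conversion explicitly so the result does not
# depend on the version-specific default of stringsAsFactors.
DF2 <- data.frame(x = c("a", "b"), y = I(c("a", "b")),
                  stringsAsFactors = TRUE)

# A factor is an integer vector of codes plus a "levels" attribute.
codes <- unclass(DF2$x)
typeof(codes)          # storage type of the codes
attr(codes, "levels")  # the labels the codes index into

# I() only adds the class "AsIs"; the underlying type is unchanged.
typeof(DF2$y)
class(DF2$y)
inherits(DF2$y, "AsIs")
```

So typeof() describes how the values are stored, while class() describes
how R dispatches on them, and the two need not agree.
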
 



Roger D. Peng <rpeng <at> jhsph.edu> writes:

: 
: In some cases it makes sense to store "character" variables as factors 
: (integers with labels) since this can take up much less memory.  If 
: you really want to store `v2' as character, just do
: 
: data.frame(v1, I(v2))
: 
: -roger
: 
: Erich Neuwirth wrote:
: > typeof applied to a factor always seems to return "integer",
: > independently of the type of the levels.
: > This has a strange side effect.
: > When a variable is "imported" into a data frame,
: > its type changes.
: > character variables automatically are converted
: > to factors when imported into data frames.
: > 
: > Here is an example:
: > 
: >  > v1<-1:3
: >  > v2<-c("a","b","c")
: >  > df<-data.frame(v1,v2)
: >  > typeof(v2)
: > [1] "character"
: >  > typeof(df$v2)
: > [1] "integer"
: > 
: > It is somewhat surprising that
: > the types of v2 and df$v2 are different.
: > 
: > One way to get the character values back is
: > levels(df$v2)[df$v2]
: > but that is somewhat involved.
: > 
: > Should the types not be identical, and typeof applied to factors
: > return the type of the levels?

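For the question of getting the character values back, the following is
a minimal sketch using the v1 and v2 from the quoted example.  It
compares the levels(...)[...] idiom with as.character(), and then shows
one way to avoid the conversion altogether.  The names f, by_levels,
by_conv and df_chr are illustrative, and the last step assumes an R
version in which data.frame() accepts the stringsAsFactors argument.

```r
v1 <- 1:3
v2 <- c("a", "b", "c")
f  <- factor(v2)

# Indexing the levels by the factor (used as its integer codes),
# as in the quoted message ...
by_levels <- levels(f)[f]
# ... matches the usual conversion, and both recover v2.
by_conv <- as.character(f)
identical(by_levels, by_conv)
identical(by_conv, v2)

# typeof() describes the codes; the labels have their own type.
typeof(f)
typeof(levels(f))

# Assumption: data.frame() supports stringsAsFactors in this R version.
# FALSE keeps v2 as a plain character column.
df_chr <- data.frame(v1, v2, stringsAsFactors = FALSE)
typeof(df_chr$v2)
class(df_chr$v2)
```

Unlike I(v2), this leaves the column with class "character" rather than
"AsIs".
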


