[R] understanding how R determines numbers and characters when creating a data frame

Domenico Vistocco vistocco at unicas.it
Wed Feb 18 22:32:50 CET 2009


Alan Smith wrote:
> Hello R Users and Developers,
>
> I have a basic question about how R works.  Over the past few years I have
> struggled when I try to generate a new data frame that I believe should
> contain numeric data in some columns and character data in others only to
> find everything converted to character data. Is there a general method to
> create data frames that contain the data in the desired format:  numbers as
> numeric and character as a factor etc?  I often have this problem and in the
> worst case I have to export the file and read it back in.    I have
> emulated a simple example of the problem.  It often happens while using
> "for" loops.  Could someone explain how to avoid this problem by properly
> creating data frames in for loops that can contain both numeric and
> character data.
>
>  
>
> ********Question for example 1.
>
> Why does the cbind command convert the numeric data to character data?  Why
> can't the character data be converted to numeric data using the fix command?
>   
See ?cbind for a detailed explanation.
In short, when cbind/rbind is applied to vectors or matrices it returns a
matrix. A matrix can hold only one type of data (see An Introduction to
R), so combining character and numeric data implicitly coerces the
less general type (numeric) to the more general one (character).
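
A minimal sketch of that coercion, using made-up values ("setosa" and 15 
are only illustrative, not taken from the loop below):

```r
# cbind() on plain vectors builds a matrix, and a matrix has a single type.
m <- cbind(species = "setosa", obsnum = 15)

is.matrix(m)     # TRUE: the result is a matrix, not a data frame
is.character(m)  # TRUE: the number 15 was coerced to the string "15"

# The value can be converted back explicitly, column by column.
n <- as.numeric(m[, "obsnum"])
is.numeric(n)    # TRUE
```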
>
> ### Example 1  #############
>
> data(iris)
>
> obsnum<-NULL
>
> results<-NULL
>
> for(s in unique(as.character(iris$Species))){
>
> temp1<-iris[iris$Species==s,]
>
> obsnum<-length(unique(temp1$Sepal.Length))  # a number
>
>   
Instead of using cbind here:
> out1<-cbind(species=as.character(paste(s)),obsnum)  # number converted to character
>   
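use data.frame:
out1 <- data.frame(species=as.character(paste(s)),obsnum)

This tells R to convert the character column to a factor (the default 
before R 4.0.0; since then stringsAsFactors defaults to FALSE) and to 
keep the numeric column numeric. Once the loop has finished: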
c(class(results$species),mode(results$species))
c(class(results$obsnum),mode(results$obsnum))

You can keep the column as character with the stringsAsFactors argument 
of data.frame():
out1 <- data.frame(species=as.character(paste(s)),obsnum, 
stringsAsFactors=FALSE)

And then:
class(results$species)
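
To see the two behaviours side by side, here is a small sketch with 
illustrative values (s and obsnum are set by hand here rather than by the 
loop). stringsAsFactors is given explicitly in both calls so the result 
does not depend on the R version:

```r
s <- "setosa"   # illustrative value
obsnum <- 15L   # illustrative value

# Character column stored as a factor
as_factor <- data.frame(species = s, obsnum, stringsAsFactors = TRUE)
# Character column kept as character
as_char <- data.frame(species = s, obsnum, stringsAsFactors = FALSE)

class(as_factor$species)  # "factor"
class(as_char$species)    # "character"
class(as_char$obsnum)     # "integer": the number is preserved either way
```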

The message is: if you want to mix different data types you need a list 
(and a data.frame is a special type of list in which every component has 
the same number of elements).
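
Putting this together, one possible way to write the whole loop is 
sketched below. The name "pieces" and the use of do.call(rbind, ...) are 
my own choices for illustration, not the only way to do it: each pass 
stores a one-row data.frame in a list, and the rows are bound together 
once at the end.

```r
data(iris)

pieces <- list()  # one small data.frame per species
for (s in unique(as.character(iris$Species))) {
  temp1 <- iris[iris$Species == s, ]
  obsnum <- length(unique(temp1$Sepal.Length))  # a number
  # data.frame() keeps each column's own type; no cbind() on raw vectors
  pieces[[s]] <- data.frame(species = s, obsnum = obsnum,
                            stringsAsFactors = FALSE)
}

# rbind() on data frames returns a data frame, so the types survive
results <- do.call(rbind, pieces)
rownames(results) <- NULL

is.list(results)  # TRUE: a data.frame is a list of equal-length columns
str(results)      # species is character, obsnum is integer
```

Binding once at the end also avoids growing "results" on every pass, 
which matters little for three species but helps with longer loops.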

Ciao,
domenico
> results<-rbind(out1,results)
>
> }
>
> results
>
> #fix(results)  # cannot convert obsnum to numeric using fix
>
> ####################################
>
>  
>
> ******Question for example 2
>
> Why does adding the data.frame command allow the character data to be
> converted to numeric data using fix command?
>
> ### Example 2  #############
>
> data(iris)
>
> obsnum<-NULL
>
> results<-NULL
>
> for(s in unique(as.character(iris$Species))){
>
> temp1<-iris[iris$Species==s,]
>
> obsnum<-length(unique(temp1$Sepal.Length))
>
> out1<-data.frame(cbind(species=as.character(paste(s)),obsnum)) # number converted to character
>
> results<-rbind(out1,results)
>
> }
>
> results
>
> #fix(results)  # can now convert obsnum to numeric using fix
>
>  
>
> ######
>
>  
>
>  
>
> Thank you,
>
> Alan Smith