[R] concatenate values of two columns

kMan kchamberln at gmail.com
Thu May 6 05:49:58 CEST 2010


Dear n.vialma,

Good question! Your columns are of type factor(), so watch out for surprises
when coercing them (and so much for the 3 minute reply)! In this solution,
you pre-allocate a vector to store the results, and the approach differs
depending on the data type you want the resulting vector to have.

Say your data.frame() is defined as below:
df<-data.frame(x=c(1:6), var1=c("",1,2,"","",4),var2=c(2,"","",3,4,""),
stringsAsFactors=TRUE) # var1 and var2 are stored as factors

The coercion gets fishy if you try to jump straight to numeric():
as.numeric(df$var1) # didn't work: returns the factor's integer level codes, not the values
[1] 1 2 3 1 1 4
as.character(df$var1) # works
[1] ""  "1" "2" ""  ""  "4"
as.numeric(as.character(df$var1)) # works
[1] NA  1  2 NA NA  4
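
The reason for the odd result above is that a factor stores integer codes that index into its levels. A minimal sketch of this, using a stand-alone factor `f` (a hypothetical name, built to mirror df$var1):

```r
# Hypothetical factor mirroring df$var1 above
f <- factor(c("", 1, 2, "", "", 4))

levels(f)                              # the distinct labels, sorted: "" "1" "2" "4"
codes <- as.integer(f)                 # position of each value within levels(f)
vals  <- as.numeric(levels(f))[f]      # convert the labels once, then index by code

codes                                  # 1 2 3 1 1 4
vals                                   # NA 1 2 NA NA 4
```

Converting the levels and then indexing gives the same values as as.numeric(as.character(f)), but converts each distinct label only once.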

Starting with factors, the result can be either character() or numeric():
dfm<-vector("character", 6) # pre-allocate
index1<-df$var1!="" # rows where var1 has a value
index2<-df$var2!="" # rows where var2 has a value
dfm[index1]<-as.character(df$var1[index1])
dfm[index2]<-as.character(df$var2[index2])
dfm<-as.numeric(dfm) # skip this step to keep the result as character()
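
The same merge can probably be written without pre-allocation by using ifelse(). This is only a sketch of an alternative, not the approach above; the names `v1`, `v2` and the new column `var3` are assumptions introduced for illustration:

```r
# Same example data as above, with var1 and var2 stored as factors
df <- data.frame(x = 1:6,
                 var1 = c("", 1, 2, "", "", 4),
                 var2 = c(2, "", "", 3, 4, ""),
                 stringsAsFactors = TRUE)

# Convert the factors to their labels first, never directly to numeric
v1 <- as.character(df$var1)
v2 <- as.character(df$var2)

# Take var1 where it is non-empty, otherwise fall back to var2
df$var3 <- as.numeric(ifelse(v1 != "", v1, v2))
df$var3   # 2 1 2 3 4 4
```

One difference to be aware of: if a row had values in both columns, this sketch keeps var1, whereas the indexed assignment above would keep var2 because it is assigned last.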

Starting with character data (stringsAsFactors=FALSE), coercion is more
predictable:
df<-data.frame(x=c(1:6), var1=c("",1,2,"","",4),var2=c(2,"","",3,4,""),
stringsAsFactors=FALSE)
as.numeric(df$var1) # works
[1] NA  1  2 NA NA  4

If the data are numeric, the empty fields are NA rather than "", so
test with is.na() instead:
df<-data.frame(x=c(1:6), var1=c(NA,1,2,NA,NA,4),var2=c(2,NA,NA,3,4,NA))
dfm<-vector("numeric",6)
index1<-!is.na(df$var1)
index2<-!is.na(df$var2)
dfm[index1]<-df$var1[index1]
dfm[index2]<-df$var2[index2]
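
For the numeric case, a one-line variant with ifelse() might look like the sketch below. The column name `var3` and the variable `pasted` are assumptions for illustration. The last line also shows why paste() produced values such as "NA2" in the original question:

```r
df <- data.frame(x = 1:6,
                 var1 = c(NA, 1, 2, NA, NA, 4),
                 var2 = c(2, NA, NA, 3, 4, NA))

# Take var2 wherever var1 is missing, otherwise keep var1
df$var3 <- ifelse(is.na(df$var1), df$var2, df$var1)
df$var3                            # 2 1 2 3 4 4

# paste() turns NA into the string "NA" before joining the pieces
pasted <- paste(NA, 2, sep = "")   # "NA2"
```

Note that vector("numeric",6) pre-allocates zeros, so with the indexed assignment a row that is NA in both columns would end up as 0; with ifelse() such a row stays NA.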

Sincerely,
KeithC.

-----Original Message-----
From: n.vialma at libero.it [mailto:n.vialma at libero.it] 
Sent: Wednesday, May 05, 2010 3:47 AM
To: r-help at r-project.org
Subject: [R] concatenate values of two columns


Dear list,
I'm trying to concatenate the values of two columns but I'm not able to do
it:

I have a data frame with the following two columns:

X    VAR1    VAR2
1            2
2    1
3    2
4            3
5            4
6    4


What I would like to obtain is:
X    VAR3
1    2
2    1
3    2
4    3
5    4
6    4

I tried with paste but what I obtain is:
X    VAR3
1    NA2
2    1NA
3    2NA
4    NA3
5    NA4
6    4NA

 Thanks a lot!!
