[R] Creating new vectors from other dataFrames

arun smartpink111 at yahoo.com
Thu Aug 8 20:43:52 CEST 2013



Also, a more compact solution would be:

library(plyr)
# Creating a different data frame (data3), because data2's columns were almost the same as data1's
set.seed(24)
 data3<- as.data.frame(matrix(sample(1:40,6*4,replace=TRUE),ncol=6))
 colnames(data3)<- colnames(data2)
 join(data3,data1)
#Joining by: a, b, c, z
 #  a  b  c  f  g  z  d  e
#1 12 27 33 27  8  4 NA NA
#2  9 37 11 27  2 23 NA NA
#3 29 12 25 13 21 30 NA NA
#4 21 31 15 37  6  6 NA NA
 join(data1,data3)
#Joining by: a, b, c, z
#  a b  c  d  e  z  f  g
#1 1 5  9 13 17 21 NA NA
#2 2 6 10 14 18 22 NA NA
#3 3 7 11 15 19 23 NA NA
#4 4 8 12 16 20 24 NA NA
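
For comparison, the same NA-filled columns can also be produced in base R without plyr. The following is only a sketch, assuming data1 and data2 are defined as in the original question below; the names missing_in_1 and missing_in_2 are illustrative and do not appear in the thread. It relies on the fact that a data frame can be indexed by a character vector of column names, so there is no need to paste together strings such as "data2$d".

```r
# Data frames as given in the original question
data1 <- as.data.frame(matrix(data = c(1:4, 5:8, 9:12, 13:24), nrow = 4, ncol = 6,
                              byrow = FALSE,
                              dimnames = list(1:4, c("a", "b", "c", "d", "e", "z"))))
data2 <- as.data.frame(matrix(data = c(1:4, 5:8, 9:12, 37:48), nrow = 4, ncol = 6,
                              byrow = FALSE,
                              dimnames = list(1:4, c("a", "b", "c", "f", "g", "z"))))

# Column names present in one data frame but absent from the other
# (illustrative variable names, not from the thread)
missing_in_2 <- setdiff(names(data1), names(data2))  # "d" "e"
missing_in_1 <- setdiff(names(data2), names(data1))  # "f" "g"

# Indexing with a character vector creates all missing columns at once,
# each filled with NA
data1[missing_in_1] <- NA
data2[missing_in_2] <- NA

names(data1)  # "a" "b" "c" "d" "e" "z" "f" "g"
names(data2)  # "a" "b" "c" "f" "g" "z" "d" "e"
```

The new columns are appended on the right, which gives the same column order as the join() results shown above.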

A.K.

________________________________
From: Steven Ranney <steven.ranney at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Cc: R help <r-help at r-project.org> 
Sent: Thursday, August 8, 2013 2:21 PM
Subject: Re: [R] Creating new vectors from other dataFrames



This is exactly what I'm looking for.  Each data frame will have the columns that are unique to the other filled with NA.
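
Once both data frames carry the full set of columns, the rbind() step mentioned in the original question can be wrapped in a small helper. The sketch below is one possible base R version; rbind_fill_base is a hypothetical name, not a function from base R or plyr, and it assumes that columns sharing a name hold compatible types.

```r
# Hypothetical helper (not part of base R or plyr): pad every data frame
# with the columns it lacks, then stack them row-wise.
rbind_fill_base <- function(...) {
  frames <- list(...)
  # Union of all column names, in order of first appearance
  all_cols <- unique(unlist(lapply(frames, names)))
  padded <- lapply(frames, function(df) {
    missing_cols <- setdiff(all_cols, names(df))
    # Guard against the case where nothing is missing
    if (length(missing_cols) > 0) {
      df[missing_cols] <- NA
    }
    # Put the columns in a common order
    df[all_cols]
  })
  do.call(rbind, padded)
}

# Data frames as given in the original question
data1 <- as.data.frame(matrix(data = c(1:4, 5:8, 9:12, 13:24), nrow = 4, ncol = 6,
                              dimnames = list(1:4, c("a", "b", "c", "d", "e", "z"))))
data2 <- as.data.frame(matrix(data = c(1:4, 5:8, 9:12, 37:48), nrow = 4, ncol = 6,
                              dimnames = list(1:4, c("a", "b", "c", "f", "g", "z"))))

combined <- rbind_fill_base(data1, data2)
dim(combined)  # 8 rows, 8 columns
```

plyr also provides rbind.fill(), which should perform the same fill-and-bind in a single call if loading the package is acceptable.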

Thanks.


Steven H. Ranney


On Thu, Aug 8, 2013 at 12:17 PM, arun <smartpink111 at yahoo.com> wrote:

>Hi,
>
>Not sure about your expected result.
>
>library(plyr)
>data2New<-join_all(lapply(setdiff(names(data1), names(data2)),function(x) {data2[,x]<-NA; data2}))
>
>data1New<-join_all(lapply(setdiff(names(data2), names(data1)),function(x){data1[,x]<-NA;data1}))
> data1New
>#  a b  c  d  e  z  f  g
>#1 1 5  9 13 17 21 NA NA
>#2 2 6 10 14 18 22 NA NA
>#3 3 7 11 15 19 23 NA NA
>#4 4 8 12 16 20 24 NA NA
>A.K.
>
>
>
>
>----- Original Message -----
>From: Steven Ranney <steven.ranney at gmail.com>
>To: "r-help at r-project.org" <r-help at r-project.org>
>Cc:
>Sent: Thursday, August 8, 2013 2:01 PM
>Subject: [R] Creating new vectors from other dataFrames
>
>I have two data frames
>
>data1 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,13:24), nrow=4, ncol=6,
>byrow=FALSE, dimnames=list(c(1:4),c("a","b","c","d","e","z"))))
>data2 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,37:48), nrow=4, ncol=6,
>byrow=FALSE, dimnames=list(c(1:4),c("a","b","c","f","g","z"))))
>
>that have some common column names.
>
>Comparing the names of the columns within each data frame to the other
>
>setdiff(names(data1), names(data2))
>setdiff(names(data2), names(data1))
>
>provides which columns are different.
>
>For each column that appears in data1 that DOES NOT appear in data2, I need
>to create those columns and fill them with NA values.  The same is true for
>the reverse.  So, I can create a vector of new column names that need to be
>filled with NA values, but here is where I'm stuck.  I don't know how to
>get the names from inside the vector into the respective dataFrame.
>
>tmp1 <- as.factor(paste("data2$", setdiff(names(data1), names(data2)),
>sep=""))
>tmp2 <- as.factor(paste("data1$", setdiff(names(data2), names(data1)),
>sep=""))
>
>Of course, if it were as simple as only a few columns, I could do all of
>this by hand, but in my original data frames, I have 60 different columns
>that need to be created and filled with NA values for both data1 and data2.
>
>Eventually, the point of this exercise is so that I can rbind(data1, data2)
>and create a SQL table out of the merged dataFrames.  Unfortunately, I
>can't rbind() everything until the column names are common across both
>data1 and data2.
>
>Thoughts?
>
>Thanks -
>
>SR
>
>
>
>Steven H. Ranney
>


