[R] Data Step: Outer Join Rows, Inner Join Columns

schimpanski at gmx.de schimpanski at gmx.de
Tue Sep 29 09:04:34 CEST 2009


Hi,

how can I do a data step in R which can be best described in SQL terminology as an outer join of rows (eg cases) and inner join of columns (variables)? In case of conflicting non-missing values the first dataset has higher always priority. Missing values should always be replaced by valid values (cf. coalesce function in SQL):

INPUT DATASET 1:

  A  B  C
1 11 NA 31
2 12 22 32
3 NA 23 33

INPUT DATASET 2:

  C  D  E
2 NA 42 52   
3 NA 43 53
4 34 44 54

OUTPUT DATASET:
   A  B  C  D  E
1  11 NA 31 NA NA
2  12 22 32 42 52
3  NA 23 33 43 53
4  NA NA 34 44 54

Thanks in advance,

Will




More information about the R-help mailing list