[R] merging to data.frames whose columns are different but follow a pattern.

Dennis Murphy djmuser at gmail.com
Tue Oct 25 21:15:05 CEST 2011


Hi:

Try this:

INDIVIDUAL <- transform(INDIVIDUAL, IDHOUS = IDPERS %/% 100)
merge(INDIVIDUAL, HOUSHOLD, by = c('IDHOUS', 'T'))

   IDHOUS T IDPERS SEX PC02 GHS Single COM2 NBPERS NBKID
1      41 1   4101   1    9   1     NA    2      5     3
2      41 1   4102   2    0   0     NA    2      5     3
3      41 2   4101   1   10   1     NA    2      5     3
4      41 2   4102   2    5   1     NA    2      5     3
5      41 3   4101   1   10   1     NA    2      5     3
6      41 3   4102   2    3   0     NA    2      5     3
7      41 4   4101   1   10   1     NA    2      5     3
8      41 4   4102   2    9   1     NA    2      5     3
9      51 1   5101   1    8   1     NA    2      1     0
10     51 2   5101   1    8   1     NA    2      1     0
11     51 3   5101   1    8   1     NA    2      1     0
12     51 4   5101   1    8   1     NA    2      1     0

HTH,
Dennis

On Tue, Oct 25, 2011 at 9:43 AM, ManuelS <steiner-manuel at web.de> wrote:
> Hi,
>
> I'm working with panel data from the Swiss Houshold Panel (SHP). The data i
> got came in the following way:
> 1.) 12 *different* /individual/ files - one for each year .
> 2.) 12 *different* /houshold/ files - again: one for each year
>
> Each file came in the SPSS format (.sav). I implemented all the files in R
> an managed (via rename, cbind, rbind, merge etc.) to get *two* files.
> The first file is a combination of all 12 /individual/ files orderd by
> IDPERS (unique individual identification) and T (=Time 1...12)
> The second file contains all houshold waves (1-12) and is orderd by IDHOUS
> (unique houshold identification) and T (=1...12).
>
> Now, to the problem:
> As the following except shows: the individual id used in the individual file
> (IDPERS) is derived from the houshold id (IDHOUS).
>
>>head(INDIVIDUAL,20)
>   IDPERS  T SEX PC02 GHS Single
> 1    4101  1   1    9   1     NA
> 2    4101  2   1   10   1     NA
> 3    4101  3   1   10   1     NA
> 4    4101  4   1   10   1     NA
> .
> .
> .
> 13   4102  1   2    0   0     NA
> 14   4102  2   2    5   1     NA
> 15   4102  3   2    3   0     NA
> 16   4102  4   2    9   1     NA
> .
> .
> .
> 26   5101  1   1    8   1     NA
> 27   5101  2   1    8   1     NA
> 28   5101  3   1    8   1     NA
> 29   5101  4   1    8   1     NA
>
>
>> head(HOUSHOLD,20)
>   IDHOUS  T COM2 NBPERS NBKID
> 1      41  1    2      5     3
> 2      41  2    2      5     3
> 3      41  3    2      5     3
> 4      41  4    2      5     3
> .
> .
> .
> 13     51  1    2      1     0
> 14     51  2    2      1     0
> 15     51  3    2      1     0
> 16     51  4    2      1     0
>
> I want to merge the two files (data.frames) according to their ID's and the
> time dimension T. Since IDHOUS and IDPERS are not the same I cannot use
> /merge() /. Doing the task "by hand" in a excel spread sheet or any
> equivalent is possible but would probably take ages.
> So my question is:
>
> Does anyone know a way, a command, a trick to merge two files whose columns
> are different but follow a pattern as described above?
>
> I am gratful for any kind of input.
>
> Manuel Steiner
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/merging-to-data-frames-whose-columns-are-different-but-follow-a-pattern-tp3937376p3937376.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list