[R] Reading hierarchical data

jim holtman jholtman at gmail.com
Sun Feb 7 18:05:07 CET 2010


Will this do it for you?

> input <- readLines(textConnection("06470 1 1
+     1 232 0
+     2 230 1
+ 07470 1 0
+     1 240 1
+ 08470 1 0
+     1 227 0
+ 09470 1 0
+     1 213 1
+     2 222 0
+     3 224 1
+ 10470 1 1
+     1 220 0
+     2 211 1
+ 11470 1 0
+     1 217 0
+     2 210 1
+     3 226 1"))
> closeAllConnections()
> fid <- NULL
> dwell <- NULL
> result <- do.call(rbind, lapply(input, function(.line){
+     # parse as a family record first: id (1-5), record type (7), dwelling (9)
+     values <- as.integer(substring(.line, c(1, 7, 9), c(5, 7, 9)))
+     if (values[2] == 1L){
+         # family record: remember its fields for the person records that follow
+         fid <<- values[1]
+         dwell <<- values[3]
+         return(NULL)
+     } else {
+         # person record: id (1-5), record type (7), age (8-9), sex (11)
+         values <- as.integer(substring(.line, c(1, 7, 8, 11), c(5, 7, 9, 11)))
+         return(c(fid=fid, dwell=dwell, pid=values[1], age=values[3],
+                  sex=values[4]))
+     }
+ }))
>
> result
        fid dwell pid age sex
 [1,]  6470     1   1  32   0
 [2,]  6470     1   2  30   1
 [3,]  7470     0   1  40   1
 [4,]  8470     0   1  27   0
 [5,]  9470     0   1  13   1
 [6,]  9470     0   2  22   0
 [7,]  9470     0   3  24   1
 [8,] 10470     1   1  20   0
 [9,] 10470     1   2  11   1
[10,] 11470     0   1  17   0
[11,] 11470     0   2  10   1
[12,] 11470     0   3  26   1
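
If the file is large, a vectorised variant may be easier to work with than the `<<-` bookkeeping above. The following is only a sketch under a few assumptions: the first line is a family record, every line follows the column layout described in the question, and a data frame (rather than a matrix) is acceptable as the result. The names `is_family`, `family_lines`, `person_lines` and `owner` are made up for the illustration, and only the first three families of the sample are used.

```r
# First three families from the sample data, one string per record.
input <- c(
  "06470 1 1",
  "    1 232 0",
  "    2 230 1",
  "07470 1 0",
  "    1 240 1",
  "08470 1 0",
  "    1 227 0"
)

# Column 7 holds the record type: "1" = family, "2" = person.
is_family <- substr(input, 7, 7) == "1"

family_lines <- input[is_family]
person_lines <- input[!is_family]

# cumsum() over the family flags numbers the families in file order, so
# each person line gets the index of the family line that precedes it.
# (Assumption: the file starts with a family record.)
owner <- cumsum(is_family)[!is_family]

result <- data.frame(
  fid   = as.integer(substr(family_lines, 1, 5))[owner],
  dwell = as.integer(substr(family_lines, 9, 9))[owner],
  pid   = as.integer(substr(person_lines, 1, 5)),
  age   = as.integer(substr(person_lines, 8, 9)),
  sex   = as.integer(substr(person_lines, 11, 11))
)
```

With real data, `input` would come from `readLines()` on the file instead of the inline vector.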


On Sun, Feb 7, 2010 at 10:57 AM, Saba(Home) <sabaric at charter.net> wrote:
>
> I would like to read the following hierarchical data set. There is a family
> record followed by one or more personal records.
> If col. 7 is "1" it is a family record. If it is "2" it is a personal
> record.
> The family record is formatted as follows:
> col. 1-5   family id
> col. 7     "1"
> col. 9     dwelling type code
> The personal record is formatted as follows:
> col. 1-5   personal id
> col. 7     "2"
> col. 8-9   age
> col. 11    sex code
>
> The first six family and accompanying personal records look like this:
> 06470 1 1
>    1 232 0
>    2 230 1
> 07470 1 0
>    1 240 1
> 08470 1 0
>    1 227 0
> 09470 1 0
>    1 213 1
>    2 222 0
>    3 224 1
> 10470 1 1
>    1 220 0
>    2 211 1
> 11470 1 0
>    1 217 0
>    2 210 1
>    3 226 1
>
> I want to create a dataset containing
> . family ID
> . dwelling code
> . person ID
> . age
> . sex code
> The dataset will contain one observation per person, with the family
> information repeated for people in the same family.
> Can anyone help?
> Thanks,
> Richard Saba
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


