[R] a question about data manipulation

jim holtman jholtman at gmail.com
Tue Aug 2 17:56:18 CEST 2005


Use 'split' to break the data frame into a list with one data frame per id:

> x.1 <- data.frame(COL1=1:50, COL2=50:1, id=sample(1:4, 50, replace=TRUE))
> x.2 <- split(x.1, x.1$id)
> str(x.2)
List of 4
 $ 1:`data.frame':      10 obs. of  3 variables:
  ..$ COL1: int [1:10] 5 10 11 12 22 24 27 34 38 47
  ..$ COL2: int [1:10] 46 41 40 39 29 27 24 17 13 4
  ..$ id  : int [1:10] 1 1 1 1 1 1 1 1 1 1
 $ 2:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 1 2 14 16 19 25 26 28 30 31 ...
  ..$ COL2: int [1:13] 50 49 37 35 32 26 25 23 21 20 ...
  ..$ id  : int [1:13] 2 2 2 2 2 2 2 2 2 2 ...
 $ 3:`data.frame':      14 obs. of  3 variables:
  ..$ COL1: int [1:14] 3 8 9 13 17 23 32 36 39 42 ...
  ..$ COL2: int [1:14] 48 43 42 38 34 28 19 15 12 9 ...
  ..$ id  : int [1:14] 3 3 3 3 3 3 3 3 3 3 ...
 $ 4:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 4 6 7 15 18 20 21 29 35 37 ...
  ..$ COL2: int [1:13] 47 45 44 36 33 31 30 22 16 14 ...
  ..$ id  : int [1:13] 4 4 4 4 4 4 4 4 4 4 ...
> names(x.2)
[1] "1" "2" "3" "4"
> x.2[['1']]
   COL1 COL2 id
5     5   46  1
10   10   41  1
11   11   40  1
12   12   39  1
22   22   29  1
24   24   27  1
27   27   24  1
34   34   17  1
38   38   13  1
47   47    4  1
> x.2[['3']]
   COL1 COL2 id
3     3   48  3
8     8   43  3
9     9   42  3
13   13   38  3
17   17   34  3
23   23   28  3
32   32   19  3
36   36   15  3
39   39   12  3
42   42    9  3
44   44    7  3
45   45    6  3
49   49    2  3
50   50    1  3
> 
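
The list returned by split() can be reshaped into the list-of-lists layout
shown in the question by passing it through lapply(). The following is a
minimal sketch under stated assumptions, not a tested solution for the real
data: it rebuilds the six sample rows from the question, and it copies N=2 and
n=c(100,150) unchanged for every id because the question does not say how
they vary. The name 'result' is introduced only for this illustration.

```r
# Sample rows copied from the question (two ids, three rows each)
x <- data.frame(COL1 = c(12, 70, 58, 14, 88, 90),
                COL2 = c(49, 120, 124, 13, 100, 134),
                id   = c(1, 1, 1, 2, 2, 2))

# One list element per id; N and n are assumed constant, as in the question
result <- lapply(split(x, x$id), function(g) {
    list(N = 2,
         n = c(100, 150),
         # Same column-wise fill as the question's matrix(c(...), nr=2, nc=3)
         matrix(c(g$COL1, g$COL2), nrow = 2))
})

# Matrix for the first id
result[["1"]][[3]]
```

The elements of 'result' are named by id ("1", "2", ...) and can also be
reached by position, e.g. result[[2]]. Because the matrix is built with
nrow = 2 and no fixed column count, ids with different numbers of rows give
matrices of different widths.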


On 8/2/05, qi zhang <shellyzhang77 at gmail.com> wrote:
> Dear R-user,
>  I have a simple question, but I can't figure out an easy way to handle it.
>  My imported data x looks like this:
>    COL1 COL2 id
> 1    12   49  1
> 2    70  120  1
> 3    58  124  1
> 51   14   13  2
> 52   88  100  2
> 53   90  134  2
>  I want to change the format of the data: I want to group the data into
> different parts according to id, so that x[1] refers to the information
> about the first id. I use the command:
> 
> list(list(N=2, n=c(100,150),
>           matrix(c(x[x$id==1,][,1], x[x$id==1,][,2]), nr=2, nc=3)),
>      list(N=2, n=c(100,150),
>           matrix(c(x[x$id==2,][,1], x[x$id==2,][,2]), nr=2, nc=3)))
> 
> so the data becomes :
> 
> [[1]]
> [[1]]$N
> [1] 2
> 
> [[1]]$n
> [1] 100 150
> 
> [[1]][[3]]
>      [,1] [,2] [,3]
> [1,]   12   58  120
> [2,]   70   49  124
> 
> 
> [[2]]
> [[2]]$N
> [1] 2
> 
> [[2]]$n
> [1] 100 150
> 
> [[2]][[3]]
>      [,1] [,2] [,3]
> [1,]   14   90  100
> [2,]   88   13  134
> 
> This is the format I want, but the problem is that in my data id runs not
> from 1 to 2 but from 1 to 100, so my code is not efficient. Could you help
> me find an efficient way? Thanks.
> 
>  Qi Zhang
> 
> PhD student,
> 
> University of Cincinnati


-- 
Jim Holtman
Convergys
+1 513 723 2929

What is the problem you are trying to solve?



