[R] subsetting data by rows (every 69 rows)

Mon Aug 26 16:46:50 CEST 2013

Hi R.L.,

No problem.

You may try:
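set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:10, 2000*3, replace=TRUE), ncol=3))
# split into consecutive blocks of 69 rows
lst1 <- split(dat1, ((seq_len(nrow(dat1))-1) %/% 69) + 1)
# rename the columns of each list element to a, b, c
lst2 <- lapply(lst1, function(x) {colnames(x) <- letters[1:3]; x})
# add z = (a-b)/c as a new column in each list element
res <- lapply(lst2, function(x) {x$z <- with(x, (a-b)/c); x})
head(res[[1]], 3)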
#  a b c        z
#1 3 1 8 0.250000
#2 3 2 2 0.500000
#3 8 3 3 1.666667
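
On the loop quoted below: `z` is overwritten on every pass, so only the values from the last list element survive. A minimal sketch of collecting all the values instead, assuming 1566 rows and columns named a, b, c as described in the question (the object names `zlist`, `z` and `z_all` are only illustrative):

```r
set.seed(24)
# Assumption: 1566 rows and 3 columns, mirroring the sizes in the question
dat1 <- as.data.frame(matrix(sample(1:10, 1566 * 3, replace = TRUE), ncol = 3))
colnames(dat1) <- letters[1:3]
lst1 <- split(dat1, ((seq_len(nrow(dat1)) - 1) %/% 69) + 1)

# One vector of z values per list element, rather than overwriting z in a loop
zlist <- lapply(lst1, function(x) (x$a - x$b) / x$c)

# Flatten back to a single vector in the original row order
z <- unlist(zlist, use.names = FALSE)
length(z)  # 1566

# The calculation is row-wise, so the unsplit data gives the same values
z_all <- with(dat1, (a - b) / c)
```

If only z is needed, the last line may be enough on its own; the split matters when later steps work on each block of 69 rows separately.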

A.K.

Thank you very much for your help, A.K. The code works efficiently!

Just a follow-up question: do you happen to know how to
select certain columns in each element (since I need to apply a
calculation to multiple columns for each element of the list)?
For example, list[1] looks like:
$`1` 
        a       b        c 
1    2.1    1.4    3.4 
2    4.4    2.6    5.5 
3    2.6    0.4    3.0 
... 

$`2` 
          a       b        c 
70    5.1    4.9    5.1 
71    4.4    7.6    8.5 
72    2.8    3.5    6.8 
... 

What I wish to do is something like
z = (a-b) / c 
for each element ($`1`,$`2`...) 

I tried the following code: 
## there are 23 elements in the list (sorry, in fact I have 1566 rows in
## total in my sample)
for (i in 1:23) {
  z = (list[[i]]$a - list[[i]]$b) / list[[i]]$c
}
which gave me only 49 values, rather than 1566 values. 

Thank you very much! 

Kind regards, 
R.L 

----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Sunday, August 25, 2013 1:28 PM
Subject: Re: subsetting data by rows (every 69 rows)

#or you could try:

lst2 <- split(dat1, as.numeric(gl(69, 69, 2000)))
# identical(lst1,lst2)
#[1] TRUE
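
The gl() call above works here because 69 levels of 69 repeats cover more than 2000 rows; as.numeric() keeps the unused levels from turning into empty list elements. A sketch of a version that derives the number of levels from the data, assuming the same 69-row chunk size (the names `k`, `n` and `grp` are illustrative):

```r
set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:10, 2000 * 3, replace = TRUE), ncol = 3))

k <- 69            # chunk size
n <- nrow(dat1)    # 2000

# Just enough levels to cover n rows, each repeated k times, cut to length n
grp <- gl(ceiling(n / k), k, length = n)
lst2 <- split(dat1, grp)

# Same grouping via integer division, for comparison
lst1 <- split(dat1, ((seq_len(n) - 1) %/% k) + 1)

length(lst2)       # 29
```

Because every level of `grp` is used, no as.numeric() conversion is needed before split().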
A.K.

----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Sunday, August 25, 2013 1:17 PM
Subject: Re: subsetting data by rows (every 69 rows)

Hi,
Try:
set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:400, 2000*16, replace=TRUE), ncol=16))
lst1 <- split(dat1, ((seq_len(nrow(dat1))-1) %/% 69) + 1)
sapply(lst1, function(x) range(as.numeric(row.names(x))))
#    1   2   3   4   5   6   7   8   9  10  11  12  13  14   15   16   17   18
#[1,]  1  70 139 208 277 346 415 484 553 622 691 760 829 898  967 1036 1105 1174
#[2,] 69 138 207 276 345 414 483 552 621 690 759 828 897 966 1035 1104 1173 1242
#       19   20   21   22   23   24   25   26   27   28   29
#[1,] 1243 1312 1381 1450 1519 1588 1657 1726 1795 1864 1933
#[2,] 1311 1380 1449 1518 1587 1656 1725 1794 1863 1932 2000
 str(lst1[[1]])
#'data.frame':    69 obs. of  16 variables:
# $ V1 : int  118 90 282 208 266 369 112 306 321 102 ...
# $ V2 : int  6 50 115 247 355 109 39 297 35 209 ...
# $ V3 : int  313 67 102 298 367 23 376 91 5 38 ...
# $ V4 : int  207 351 212 342 255 399 239 57 234 79 ...
# $ V5 : int  74 80 289 165 231 193 310 255 98 218 ...
# $ V6 : int  99 91 325 143 398 66 201 337 66 382 ...
# $ V7 : int  339 327 325 274 22 105 106 75 400 167 ...
# $ V8 : int  135 233 91 306 230 140 233 166 210 351 ...
# $ V9 : int  204 203 256 337 25 295 214 288 63 388 ...
# $ V10: int  370 328 161 227 381 164 300 313 303 375 ...
# $ V11: int  171 373 133 345 60 119 215 48 55 367 ...
# $ V12: int  118 309 67 250 286 127 171 248 46 20 ...
# $ V13: int  385 15 282 276 130 166 160 214 58 74 ...
# $ V14: int  90 165 39 154 294 84 106 367 359 145 ...
# $ V15: int  392 290 103 14 111 148 200 331 302 88 ...
# $ V16: int  323 210 167 345 249 325 217 171 150 223 ...
 sapply(lst1,nrow)
# 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
#69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 
#27 28 29 
#69 69 68 
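
The grouping index does the work here: (row number - 1) %/% 69 is 0 for rows 1 to 69, 1 for rows 70 to 138, and so on, so adding 1 gives the chunk number. A sketch that wraps this in a small helper; `split_every` is a hypothetical name for illustration, not a base R function:

```r
# Hypothetical helper: split a data frame into consecutive blocks of k rows
split_every <- function(df, k) {
  stopifnot(is.data.frame(df), k >= 1)
  # Rows 1..k get index 1, rows (k+1)..2k get index 2, and so on
  split(df, (seq_len(nrow(df)) - 1) %/% k + 1)
}

set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:400, 2000 * 16, replace = TRUE), ncol = 16))

chunks <- split_every(dat1, 69)
length(chunks)         # 29
nrow(chunks[[29]])     # 68
```

The last block is simply shorter when the row count is not a multiple of k.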

A.K.

Hi There, 

It might be a simple problem but I didn't find a clear solution online. 
The task is quite straightforward: I have a large data frame with
more than 2000 rows and 16 columns. For further analysis, I need to
split it into new data frames of 69 rows each.

I tried to use a "for" loop (code shown below):

n = nrow(data) 
w = 69 
for(i in 1:(n-w)){ 
  data= data[i:(i+w),] 
  } 

But it only gave me a subset with the last 69 rows.
So my question is how to split the whole data frame into blocks of 69 rows (1st to 69th rows, 70th to 138th rows, etc.).

Any help will be appreciated.