[R] subset a data set on two conditions

arun smartpink111 at yahoo.com
Sun Dec 8 18:49:26 CET 2013


Hi,
Try:
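#Either
indx <- which(df$pvalue < 0.05)
# partner of an even row is the row above it; of an odd row, the row below it
# unique() avoids repeated rows when both members of a pair are < 0.05
indx1 <- sort(unique(c(indx, ifelse(indx %% 2 == 0, indx - 1, indx + 1))))
df[indx1,]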

#or
# group consecutive rows into pairs by seq; keep a pair if any pvalue in it is < 0.05
df[!!with(df, ave(pvalue, ((seq - 1) %/% 2) + 1, FUN = function(x) any(x < 0.05))), ]
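
The grouping idea can also be spelled out step by step. Below is a minimal sketch on a made-up data frame (toy, grp, keep_grp and res are illustrative names and values, not taken from the posted df), assuming the rows are ordered so that seq 1-2, 3-4, ... form the pairs:

```r
# Toy data (hypothetical values): rows 1-2, 3-4 and 5-6 form the pairs
toy <- data.frame(
  pvalue = c(0.20, 0.01, 0.30, 0.40, 0.03, 0.02),
  seq    = 1:6
)

# Pair id: seq 1,2 -> 1; seq 3,4 -> 2; seq 5,6 -> 3
grp <- (toy$seq - 1) %/% 2 + 1

# Ids of the pairs that contain at least one row with pvalue < 0.05
keep_grp <- unique(grp[toy$pvalue < 0.05])

# Keep every row whose pair id is in that set
res <- toy[grp %in% keep_grp, ]
res
```

Here rows 5 and 6 are both below 0.05, and each still appears only once in res, because rows are selected by pair id rather than by row index.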
A.K.


How can I subset the example data set based on pvalue (< 0.05) and also include the other member of each pair?
I could subset with this code, a <- subset(df, pvalue < 0.05), which would give me this output:

    Estimate      pvalue seq pairs 
10 0.01133065 0.004946311  10     2 
12 0.02026090 0.039022875  12     2 
17 0.01621716 0.022891429  17     1 
19 0.01555321 0.033382339  19     1 

But I also want the output to include seq 9, 11, 18 and 20, which are the other members of those pairs (see the variable pairs).




> dput(df) 
structure(list(Estimate = c(0.00485470080131958, 0.0017750187497085, 
0.00335445588953967, -0.000584531421758813, 0.00606953408663915, 
-0.00528701750277387, 0.00566389678093939, -0.0157431826077494, 
0.00797445327627353, 0.0113306462560471, 0.00458009238873928, 
0.0202609029566437, 0.000973530938029486, -0.00183247733386492, 
0.00115028173291761, -0.00743448971374577, 0.016217161692567, 
-0.000945376803907414, 0.0155532095509903, -0.00617109741106529 
), pvalue = c(0.171288761250697, 0.507252376337703, 0.328418897915535, 
0.924674871720598, 0.254431502614107, 0.212506044108723, 0.274117055540994, 
0.0963539806017105, 0.156704628343227, 0.00494631086965616, 0.401874172161139, 
0.0390228749596093, 0.817093606803661, 0.581289013427265, 0.776977123239984, 
0.318257277798551, 0.0228914288352906, 0.86659585959993, 0.0333823392712699, 
0.639843703507484), seq = 1:20, pairs = c(1L, 2L, 1L, 2L, 1L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L)), 
datalabel = "", time.stamp = " 8 Dec 2013 12:55", .Names = c("Estimate", 
"pvalue", "seq", "pairs"), formats = c("%9.0g", "%9.0g", "%9.0g", 
"%9.0g"), types = c(255L, 255L, 253L, 253L), val.labels = c("", 
"", "", ""), var.labels = c("Estimate", "pvalue", "seq", "pairs" 
), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20" 
), version = 12L, class = "data.frame")


