[R] How to subset my dataframe? (a bit tricky)

markleeds at verizon.net markleeds at verizon.net
Tue Jun 16 22:24:47 CEST 2009


   Hi Bill: I was trying to do the below myself but was having problems, so I
   took your solution and made another one. Yours was behaving a little oddly:
   I don't think the poster wants to keep rows where the "dnv" is preceded by
   another "dnv", and he/she also wanted to keep the row if the second column
   has a "dnv". So, below is essentially plagiarism with a minor fix. Thanks.
   # 'd' is the data frame, as in Bill's message
   d[unique(unlist(sapply(3:ncol(d), function(.col) {
       which((d[,.col] == "dnv" & d[,.col-1] != "0" & d[,.col-1] != "dnv") |
             (d[,2] == "dnv"))
   }))),]
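
   For anyone who wants to try this on the sample rows from the original
   post, here is a minimal, self-contained sketch of the same rule written
   with a logical mask instead of row indices. It is not from the thread:
   the names vals, is_dnv, prev, keep and subset_d are made up for
   illustration, and it assumes every date column is stored as character
   with zeros written exactly as "0".

```r
# Sample data retyped from the original post (both example tables combined).
# Assumption: the date columns are character, with zeros stored exactly as "0".
d <- data.frame(
  POND_ID      = c(101, 102, 103, 104, 105, 106),
  `2009-05-07` = c("0.15", "0", "0.15", "dnv", "0.15", "0"),
  `2009-05-15` = c("0", "dnv", "dnv", "0.25", "dnv", "dnv"),
  `2009-05-21` = c("dnv", "dnv", "1", "1", "1", "1"),
  `2009-05-28` = c("dnv", "dnv", "1", "1", "0", "0.15"),
  `2009-06-04` = c("dnv", "dnv", "1", "0.75", "dnv", "dnv"),
  row.names = c("4", "7", "87", "99", "100", "101"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Date columns only, as a character matrix.
vals <- as.matrix(d[, -1])
is_dnv <- vals == "dnv"

# Value one column to the left of each cell. The first date column has no
# predecessor, so it gets "", which is neither "0" nor "dnv".
prev <- cbind("", vals[, -ncol(vals), drop = FALSE])

# Keep a row if any "dnv" is preceded by something other than "0" or "dnv".
keep <- rowSums(is_dnv & prev != "0" & prev != "dnv") > 0
subset_d <- d[keep, ]
```

   Unlike the unique(unlist(...)) form, a logical mask keeps the rows in
   their original order, and the "" padding handles a "dnv" in the first
   date column without a separate d[,2] test.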

   On Jun 16, 2009, William Dunlap <wdunlap at tibco.com> wrote:

     > -----Original Message-----
     > From: [1]r-help-bounces at r-project.org
     > [mailto:[2]r-help-bounces at r-project.org] On Behalf Of Mark Na
     > Sent: Tuesday, June 16, 2009 11:27 AM
     > To: [3]r-help at r-project.org
     > Subject: [R] How to subset my dataframe? (a bit tricky)
     >
     > Hi R-helpers,
     >
     > I would like to subset my dataframe, keeping only those rows which
     > satisfy the following conditions:
     >
     > 1) the string "dnv" is found in at least one column;
     > 2) the value in the column previous to the one "dnv" is found
     > in is not "0"
     Suppose your data.frame is called 'd'. Then try looping over
     its columns:
        keep <- rep(FALSE, nrow(d))
        if (ncol(d) > 2) for (i in 3:ncol(d)) keep <- keep | (d[,i] == "dnv" & d[,i-1] != "0")
     so
        d[keep,]
     is the subset you want.
     Bill Dunlap
     TIBCO Software Inc - Spotfire Division
     wdunlap tibco.com
     >
     > Here's what my data look like:
     >
     >     POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
     >
     > 4       101       0.15          0        dnv        dnv        dnv
     > 7       102          0        dnv        dnv        dnv        dnv
     > 87      103       0.15        dnv          1          1          1
     > 99      104        dnv       0.25          1          1       0.75
     >
     > So, for above example, the new dataframe would not contain POND_ID 101
     > or 102 (because there is a 0 before the dnv) but it WOULD contain
     > POND_ID 103 (because there is a 0.15 before the dnv) and 104 (because
     > dnv occurs in the first column, so cannot be preceded by a 0).
     >
     > One extra twist: I would like to retain rows in the new dataframe
     > which satisfy the above conditions even if they also have a "0" then
     > "dnv" sequence preceding or following the "problem", e.g., the
     > following rows would be retained in the new dataframe:
     >
     >     POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
     >
     > 100     105       0.15        dnv          1          0        dnv
     > 101     106          0        dnv          1       0.15        dnv
     >
     > Thanks in advance for any help you might provide.
     >
     > (I hope I've provided enough of an example; I could also provide a
     > .csv file if that would help.)
     >
     > Mark Na
     >
     > ______________________________________________
     > [4]R-help at r-project.org mailing list
     > [5]https://stat.ethz.ch/mailman/listinfo/r-help
     > PLEASE do read the posting guide
     > [6]http://www.R-project.org/posting-guide.html
     > and provide commented, minimal, self-contained, reproducible code.
     >

References

   1. mailto:r-help-bounces at r-project.org
   2. mailto:r-help-bounces at r-project.org
   3. mailto:r-help at r-project.org
   4. mailto:R-help at r-project.org
   5. https://stat.ethz.ch/mailman/listinfo/r-help
   6. http://www.R-project.org/posting-guide.html
   7. mailto:R-help at r-project.org
   8. https://stat.ethz.ch/mailman/listinfo/r-help
   9. http://www.R-project.org/posting-guide.html


