[Rd] invert argument in grep

Romain Francois rfrancois at mango-solutions.com
Sun Nov 12 11:31:42 CET 2006


Duncan Murdoch wrote:
> On 11/10/2006 12:52 PM, Romain Francois wrote:
>> Duncan Murdoch wrote:
>>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>>> Hello,
>>>>
>>>> What about an `invert` argument in grep, to return elements that 
>>>> are *not* matching a regular expression :
>>>>
>>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>>
>>>> would essentially return the same as :
>>>>
>>>> R> colors() [ - grep("pink", colors()) ]
>>>>
>>>>
>>>> I'm attaching the files that I modified (against today's tarball) 
>>>> for that purpose.
>>>
>>> I think a more generally useful change would be to be able to return 
>>> a logical vector with TRUE for a match and FALSE for a non-match, so 
>>> a simple !grep(...) does the inversion.  (This is motivated by the 
>>> recent R-help discussion of the fact that x[-selection] doesn't 
>>> always invert the selection when it's a vector of indices.)
>>>
>>> A way to do that without expanding the argument list would be to allow
>>>
>>> value="logical"
>>>
>>> as well as value=TRUE and value=FALSE.
>>>
>>> This would make boolean operations easy, e.g.
>>>
>>> colors()[grep("dark", colors(), value="logical")
>>>       & !grep("blue", colors(), value="logical")]
>>>
>>> to select the colors that contain "dark" but not "blue". (In this 
>>> case the RE to select that subset is rather simple because "dark" 
>>> always precedes "blue", but if that wasn't true, it would be a lot 
>>> messier.)
>>>
>>> Duncan Murdoch
>> Hi,
>>
>> It sounds like a nice thing to have. I would still prefer to type :
>>
>> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE), 
>> value = TRUE )  
>
> That's good for intersecting two searches, but not for other boolean 
> combinations.
>
> My main point was that inversion isn't the only boolean operation you 
> may want, but R has perfectly good powerful boolean operators, so 
> installing a limited subset of boolean algebra into grep() is probably 
> the wrong approach.

Hi,

Yes, good point. I agree that value = "logical" is probably worth 
having, to take advantage of these logical operators.
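
For illustration, a logical result can be emulated without changing grep 
itself. This is only a sketch: lgrep is a hypothetical helper name, not 
part of R, and it assumes that value = TRUE is not passed through the 
ellipsis.

```r
# Hypothetical helper (not part of R): TRUE where the pattern matches,
# FALSE otherwise. Assumes value = TRUE is not passed in '...'.
lgrep <- function(pattern, x, ...) {
  seq_along(x) %in% grep(pattern, x, ...)
}

cols <- colors()

# Colors whose name contains "dark" but not "blue", combined with the
# ordinary logical operators.
dark_not_blue <- cols[lgrep("dark", cols) & !lgrep("blue", cols)]
```

Because the result is an ordinary logical vector, any combination of &, | 
and ! can be applied to it, which is the point of the value = "logical" 
suggestion.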

But what about all the functions that call grep and pass arguments 
through the ellipsis? With an invert argument, we could do:

R> history(pattern = "grid\\..*\\(", invert = TRUE)

BTW, why not use ... in ls? Then someone could use Perl regular 
expressions with ls or, to come back to this thread, issue commands like:

R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
 [1] "absolute.size"       "applyEdit"           "applyEdits"
 [4] "arcCurvature"        "arrow"               "childNames"
 [7] "convertHeight"       "convertNative"       "convertUnit"
[10] "convertWidth"        "convertX"            "convertY"
[13] "current.transform"   "current.viewport"    "current.vpPath"
[16] "current.vpTree"      "dataViewport"        "downViewport"
[19] "draw.details"        "drawDetails"         "editDetails"
[22] "engine.display.list" "gEdit"               "gEditList"
[25] "get.gpar"            "getNames"            "gList"
[28] "gpar"                "gPath"               "grob"
[31] "grobHeight"          "grobName"            "grobWidth"
[34] "grobX"               "grobY"               "gTree"
[37] "heightDetails"       "is.unit"             "layout.heights"
[40] "layoutRegion"        "layout.torture"      "layout.widths"
[43] "plotViewport"        "pop.viewport"        "popViewport"
[46] "postDrawDetails"     "preDrawDetails"      "push.viewport"
[49] "pushViewport"        "seekViewport"        "setChildren"
[52] "stringHeight"        "stringWidth"         "unit"
[55] "unit.c"              "unit.length"         "unit.pmax"
[58] "unit.pmin"           "unit.rep"            "upViewport"
[61] "validDetails"        "viewport"            "viewport.layout"
[64] "viewport.transform"  "vpList"              "vpPath"
[67] "vpStack"             "vpTree"              "widthDetails"
[70] "xDetails"            "yDetails"
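
A similar listing can be sketched without modifying ls. The helper name 
ls_not is hypothetical and only illustrates the idea. It guards against 
the empty-match case, where objs[-hits] would drop everything instead of 
nothing.

```r
library(grid)

# Hypothetical helper (not part of R): list the objects in an attached
# package or environment whose names do NOT match 'pattern'.
ls_not <- function(name, pattern, ...) {
  objs <- ls(envir = as.environment(name))
  hits <- grep(pattern, objs, ...)
  # objs[-integer(0)] is empty, so handle "no match" explicitly.
  if (length(hits) > 0) objs[-hits] else objs
}

res <- ls_not("package:grid", "^grid\\.|Grob$")
```

Extra arguments such as perl = TRUE go through the ellipsis to grep, which 
is what passing ... from ls would allow directly.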

Then, what about ... in apropos ?

Regards,

Romain


>>
>>
>> What about a way to pass more than one regular expression then be 
>> able to call :
>>
>> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, 
>> FALSE) )
>
> Again, it covers & and !, but it misses other boolean operators.
>
>> I usually use that kind of shortcuts that are easy to remember.
>>
>> vgrep <- function(...) grep(..., value = TRUE)
>> igrep <- function(...) grep(..., invert = TRUE)
>> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
>
> If you're willing to write these, then it's easy to write igrep 
> without an invert arg to grep:
>
> igrep <- function(pat, x, ...)
>    setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
>
> ivgrep would also be easy, except for the weird semantics of 
> value=TRUE pointed out by Brian:  but it could still be written with a 
> little bit of care.
>
> Duncan Murdoch
>
>>
>> What about things like the arguments `after` and `before` in unix 
>> grep. That could be used when grepping inside a function :
>>
>> R> grep("plot\\.", body(plot.default) , value= TRUE)
>> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
>> plot.window(...)"
>> [2] "plot.new()"
>> [3] "plot.xy(xy, type, ...)"
>>
>>
>> when this could be useful  (possibly).
>>
>> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
>> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink(); 
>> system( paste( "grep -A2 plot\\. ", tmp) )
>>     localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
>> plot.window(...)
>>     localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
>>     xlabel <- if (!missing(x))
>> -- 
>>     plot.new()
>>     localWindow(xlim, ylim, log, asp, ...)
>>     panel.first
>>     plot.xy(xy, type, ...)
>>     panel.last
>>     if (axes) {
>> -- 
>>     if (frame.plot)
>>         localBox(...)
>>     if (ann)
>>
>>
>> BTW, if I call :
>>
>> R> grep("plot\\.", plot.default)
>> Error in as.character(x) : cannot coerce to vector
>>
>> What about adding that line at the beginning of grep, or something 
>> else to be able to do as.character on a function ?
>>
>> if(is.function(x)) x <- body(x)
>>
>>
>> Cheers,
>>
>> Romain
>>>>
>>>> Cheers,
>>>>
>>>> Romain
>>
>>
>
>


-- 
*mangosolutions*
/data analysis that delivers/

Tel   +44 1249 467 467
Fax   +44 1249 467 468


