[R] string pattern matching

William Dunlap wdunlap at tibco.com
Thu Mar 23 16:48:29 CET 2017


If you are trying to see if one model is nested in another then I think
looking at the 'term.labels' attribute of terms(formula) is the way to
go.  Most formula-based modelling functions store the output of
terms(formula) in their output and many supply a method for the terms
function that extracts that from their output.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
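
The suggestion above can be sketched in a few lines of R. This is a minimal illustration, not code from the thread: the helper names `is_nested` and `extra_terms` are hypothetical, and the check compares term labels only, so it assumes both models share the same response, data, and family. Interaction labels can also be spelled differently depending on the order of variables in each formula (e.g. "x1:x2" versus "x2:x1"), which this sketch does not handle.

```r
# Hypothetical helper: TRUE if every term label of `small` also appears in `big`.
# Accepts formulas or fitted models that have a terms() method.
is_nested <- function(small, big) {
  small_terms <- attr(terms(small), "term.labels")
  big_terms   <- attr(terms(big), "term.labels")
  all(small_terms %in% big_terms)
}

# Hypothetical helper: term labels present in `big` but missing from `small`.
extra_terms <- function(small, big) {
  setdiff(attr(terms(big), "term.labels"), attr(terms(small), "term.labels"))
}

# On formulas:
is_nested(y ~ x1 + x3, y ~ x1 + x2 + x3)    # TRUE
is_nested(y ~ x1 + x4, y ~ x1 + x2 + x3)    # FALSE
extra_terms(y ~ x1 + x3, y ~ x1 + x2 + x3)  # "x2"

# On fitted models (lm objects store their terms), using the built-in mtcars data:
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_big   <- lm(mpg ~ wt + hp, data = mtcars)
is_nested(fit_small, fit_big)                 # TRUE
length(extra_terms(fit_small, fit_big)) == 1  # TRUE: the models differ by one term
```

For a pair that passes both checks, something like `AIC(fit_big) - AIC(fit_small)` and `anova(fit_small, fit_big)` would be one way to carry out the comparison described in the quoted message below.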


On Thu, Mar 23, 2017 at 6:37 AM, Joe Ceradini <joeceradini at gmail.com> wrote:
> Thanks for the additional response, Bill. I did not want to bog down
> the question with the full context of the function. Briefly, given a
> set of nested and non-nested regression models, I want to compare AIC
> (bigger model - smaller model) and do an LRT for all the nested models
> that differ by a single predictor. All models, nested or not, would
> also have an AIC value (I am aware of the critiques of mixing p-value
> hypothesis testing and information criteria). So, not quite
> MuMIn::dredge. The tricky part, for me, has been doing the comparisons
> for only the nested models in a set that contains nested and
> non-nested. I made some progress with the function, so I'll refrain
> from bugging the list with the whole thing unless (when) I'm stuck
> again.
>
> For those interested in the motivation, I'm running with the idea of
> trying to flag uninformative parameters which "steal" AIC model
> weight, and potentially result in a misleading model set, depending
> how the reader interprets the set.
> Arnold, T. W. 2010. Uninformative parameters and model selection using
> Akaike’s information criterion. Journal of Wildlife Management
> 74:1175–1178.
> Murtaugh, P. 2014. In defense of P values. Ecology 95:611–617.
>
> Joe
>
> On Wed, Mar 22, 2017 at 9:11 AM, William Dunlap <wdunlap at tibco.com> wrote:
>> You did not describe the goal of your pattern matching.  Were you trying
>> to match any string that could be interpreted as an R expression containing
>> X1 and X3 as additive terms?   If so, you could turn the string into a one-sided
>> formula and use the terms() function.  E.g.,
>>
>> f <- function(string) {
>>     fmla <- as.formula(paste("~", string))
>>     term.labels <- attr(terms(fmla), "term.labels")
>>     all(c("X1","X3") %in% term.labels)
>> }
>>
>>> f("X3 + X2 + X1")
>> [1] TRUE
>>> f("- X3 + X2 + X1")
>> [1] FALSE
>>> f("X3 + X2 + log(X1)")
>> [1] FALSE
>>> f("X3 + X2 + log(X1) + X1")
>> [1] TRUE
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Wed, Mar 22, 2017 at 6:39 AM, Joe Ceradini <joeceradini at gmail.com> wrote:
>>> Wow. Thanks to everyone (Jim, Ng Bo Lin, Bert, David, and Ulrik) for
>>> all the quick and helpful responses. They have given me a better
>>> understanding of regular expressions, and certainly answered my
>>> question.
>>>
>>> Joe
>>>
>>> On Wed, Mar 22, 2017 at 12:22 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>>>> Hi Joe,
>>>>
>>>> you could also rethink your pattern:
>>>>
>>>> grep("x1 \\+ x2", test, value = TRUE)
>>>>
>>>> grep("x1 \\+ x", test, value = TRUE)
>>>>
>>>> grep("x1 \\+ x[0-9]", test, value = TRUE)
>>>>
>>>> HTH
>>>> Ulrik
>>>>
>>>> On Wed, 22 Mar 2017 at 02:10 Jim Lemon <drjimlemon at gmail.com> wrote:
>>>>>
>>>>> Hi Joe,
>>>>> This may help you:
>>>>>
>>>>> test <- c("x1", "x2", "x3", "x1 + x2 + x3")
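>>>>> # TRUE if every space-separated piece of x1 occurs somewhere in x2
>>>>> multigrep <- function(x1, x2) {
>>>>>   xbits <- unlist(strsplit(x1, " "))
>>>>>   xans <- logical(length(xbits))
>>>>>   # fixed = TRUE so that "+" is matched literally, not as a regex operator
>>>>>   for (i in seq_along(xbits)) xans[i] <- length(grep(xbits[i], x2, fixed = TRUE)) > 0
>>>>>   all(xans)
>>>>> }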
>>>>> multigrep("x1 + x3","x1 + x2 + x3")
>>>>> [1] TRUE
>>>>> multigrep("x1 + x4","x1 + x2 + x3")
>>>>> [1] FALSE
>>>>>
>>>>> Jim
>>>>>
>>>>> On Wed, Mar 22, 2017 at 10:50 AM, Joe Ceradini <joeceradini at gmail.com> wrote:
>>>>> > Hi Folks,
>>>>> >
>>>>> > Is there a way to find "x1 + x2 + x3" given "x1 + x3" as the pattern?
>>>>> > Or is that a ridiculous question, since I'm trying to find something
>>>>> > based on a pattern that doesn't exist?
>>>>> >
>>>>> > test <- c("x1", "x2", "x3", "x1 + x2 + x3")
>>>>> > test
>>>>> > [1] "x1"           "x2"           "x3"           "x1 + x2 + x3"
>>>>> >
>>>>> > grep("x1 + x2", test, fixed=TRUE, value = TRUE)
>>>>> > [1] "x1 + x2 + x3"
>>>>> >
>>>>> >
>>>>> > But what if I only have "x1 + x3" as the pattern and still want to
>>>>> > return "x1 + x2 + x3"?
>>>>> >
>>>>> > grep("x1 + x3", test, fixed=TRUE, value = TRUE)
>>>>> > character(0)
>>>>> >
>>>>> > I'm sure this looks like an odd question. I'm trying to build a
>>>>> > function and stuck on this. Rather than dropping the whole function on
>>>>> > the list, I thought I'd try one piece I needed help with...although I
>>>>> > suspect that this question itself probably does not bode well for my
>>>>> > function :)
>>>>> >
>>>>> > Thanks!
>>>>> > Joe
>>>>> >
>>>>> > ______________________________________________
>>>>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> > PLEASE do read the posting guide
>>>>> > http://www.R-project.org/posting-guide.html
>>>>> > and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>
>>>
>>>
>>> --
>>> Cooperative Fish and Wildlife Research Unit
>>> Zoology and Physiology Dept.
>>> University of Wyoming
>>> JoeCeradini at gmail.com / 914.707.8506
>>> wyocoopunit.org


