[R] greatest common substring

Tue Nov 14 13:19:07 CET 2006

Try:

V1 <- c("Welfare_Group_1024", "Welfare_Group_1536", "Welfare_Group_160")
V2 <- c("xxxWelfare_Group_1024", "yWelfare_Group_1536",
  "zzzzzWelfare_Group_160")
# Fold lcs2 over the vector; L is the substring common to all elements so far.
lcs <- function(ff) {
  L <- ff[1]
  for (f in ff) L <- lcs2(f, L)
  L
}
lcs(V1)
lcs(V2)

where lcs2, the two-string version, was posted here:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/68013.html
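
In case that link is not reachable, here is a rough stand-in for lcs2.
It is NOT the code from the linked post, only a minimal sketch that
assumes lcs2 takes two strings and returns their longest common
substring. It uses a plain dynamic-programming table and keeps the
first longest match when there are ties.

```r
# Assumed stand-in for lcs2 (not the linked post's code):
# longest common substring of two strings a and b.
lcs2 <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  if (length(x) == 0 || length(y) == 0) return("")
  # len[i + 1, j + 1] is the length of the common run ending at x[i] and y[j];
  # the extra first row and column stay 0 so that len[i, j] is always defined.
  len <- matrix(0L, length(x) + 1, length(y) + 1)
  best <- 0L  # length of the longest run found so far
  end <- 0L   # position in x where that run ends
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      if (x[i] == y[j]) {
        len[i + 1, j + 1] <- len[i, j] + 1L
        if (len[i + 1, j + 1] > best) {
          best <- len[i + 1, j + 1]
          end <- i
        }
      }
    }
  }
  if (best == 0L) "" else paste(x[(end - best + 1):end], collapse = "")
}

lcs2("Welfare_Group_1024", "Welfare_Group_1536")  # "Welfare_Group_1"
```

Note that folding a two-string function over the vector always gives a
substring shared by every element, but it is not guaranteed to be the
longest such substring, which should be fine for "something close to".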

There are some obvious optimizations (for example, the first pass of the
loop compares ff[1] with itself), but just for y-axis labelling
you probably don't need them.
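
The brute-force idea from the original question (try every substring of
the first element and check it against the others) can also be sketched
directly. The names common_substring and the clean-up regular expression
are my own assumptions, not existing R functions or anything from the
linked post; this is a sketch, not a tuned implementation.

```r
# Hypothetical helper: longest substring of ff[1] that occurs in every
# element of ff. Candidates are tried longest first, so the first hit wins.
common_substring <- function(ff) {
  first <- ff[1]
  n <- nchar(first)
  for (len in rev(seq_len(n))) {
    for (start in seq_len(n - len + 1)) {
      cand <- substr(first, start, start + len - 1)
      # fixed = TRUE: treat cand as a literal string, not a regular expression
      if (all(grepl(cand, ff, fixed = TRUE))) return(cand)
    }
  }
  ""  # nothing in common
}

V <- c("xxxWelfare_Group_1024", "yWelfare_Group_1536",
  "zzzzzWelfare_Group_160")
s <- common_substring(V)

# Assumed clean-up for a label: turn runs of underscores and digits into
# single spaces, then trim the ends.
label <- trimws(gsub("[_0-9]+", " ", s))
```

This is cubic or worse in the length of the first string, which does not
matter for a handful of short labels.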

On 11/14/06, Jonne Zutt <j.zutt at tudelft.nl> wrote:
> Dear R-members,
>
> Suppose I have a vector with the following strings:
> V = c("Welfare_Group_1024",
>      "Welfare_Group_1536",
>      "Welfare_Group_160")
>
> I want to automatically generate a nice y-axis label for this data.
> A good candidate is something close to "Welfare Group".
>
> Is there an easy way to compute something close to the greatest
> common substring?
> It would be nice if it also works in this case:
> V = c("xxxWelfare_Group_1024",
>      "yWelfare_Group_1536",
>      "zzzzzWelfare_Group_160")
>
> Should I iterate through all possible substrings in the first element,
> to see whether this substring is part of all other strings?
> I was hoping for some existing R function :)
>
> Thanks in advance,
> JeeBee.