[R] Making a table: collapsing across sub-strings

Dieter Vanderelst dieter_vanderelst at emailengine.org
Wed Oct 3 17:25:10 CEST 2007


Hi list,

I'm currently processing textual data and I would really appreciate some
help with one of my problems.

I have a set of strings and I want to count how often each of these
strings appears in the set.

This is not very difficult and can be done as:

TB<-table(my_set)
plot(TB)

However, I also want to collapse across sub-strings. That is, I want a
sub-string ss of string S to be counted as an occurrence of string S.

So, 'abab' should be included in the count of 'ababaaa' and should not
be listed as a separate entry in the frequency table.
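
To make the intended result concrete, here is a minimal sketch of the
collapsing in base R. The example data and the names u, target and
collapsed are made up for illustration. It also assumes that a string
contained in several longer strings is credited to the longest one,
which is only one possible rule.

```r
# Hypothetical example data standing in for the real character vector.
my_set <- c("abab", "ababaaa", "ababaaa", "cd", "xcdx", "ef")

u <- unique(my_set)

# For each unique string, pick the longest other string that contains it.
# fixed = TRUE matches the pattern literally, not as a regular expression.
# Assumption: ties in length go to the first match; a string with no
# containing string maps to itself.
target <- vapply(u, function(s) {
  hits <- u[u != s & grepl(s, u, fixed = TRUE)]
  if (length(hits) == 0) s else hits[which.max(nchar(hits))]
}, character(1))

# Relabel every element by its target, then count as before.
collapsed <- table(unname(target[my_set]))
print(collapsed)
```

With this data, 'abab' is counted under 'ababaaa' and 'cd' under 'xcdx',
so neither gets its own entry. The result is an ordinary table, so
plot(collapsed) should work the same way as plot(TB).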

Does somebody have a pointer to a way to do this? I have been checking
out the CRAN packages for handling DNA sequences, but this has not
really brought me closer to a solution.

Thanks,
Dieter Vanderelst

------------------------------------------
Dieter Vanderelst
Eindhoven University of Technology
Faculty of Industrial Design
Designed Intelligence Group
Den Dolech 2
5612 AZ Eindhoven
The Netherlands
Tel +31 40 247 91 11


