A suffix tree is a data structure that stores all suffixes of a string. It is the basis for many fast algorithms on strings.
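As a rough illustration of "stores all suffixes", here is a minimal sketch of an uncompressed suffix trie built from nested dictionaries. It is not a real suffix tree: a suffix tree additionally collapses single-child paths into one edge and can be built in linear time (for example with Ukkonen's or McCreight's algorithm), while this version takes quadratic time and space. The function names `build_suffix_trie` and `contains` are made up for the example.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie of nested dicts (O(n^2))."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Follow the existing child for ch, or create it.
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern is a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))  # True
print(contains(trie, "nab"))  # False
```

The lookup works because every substring of a string is a prefix of one of its suffixes, so walking the pattern from the root either succeeds or falls off the trie.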
Questions tagged [suffix-tree]
228 questions
8
votes
1 answer
substring finding from a string
Input: string S = AAGATATGATAGGAT.
Output: Maximal repeats such as GATA (at positions 3 and 8), GAT (at positions 3, 8 and 13) and so on...
A maximal repeat is a substring t that occurs k > 1 times in S, and if t is extended to left or right, it will…

rock
- 153
- 8
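The example in this question can be checked with a brute-force sketch. The excerpt's definition is cut off, so the code assumes the usual reading: t occurs more than once, and every one-character extension to the left or right occurs strictly fewer times than t. The helper names are hypothetical, and the approach is far slower than a suffix-tree solution; it is only meant to make the definition concrete.

```python
from collections import defaultdict


def occurrences(s):
    """Map every substring of s to the list of its 1-based start positions."""
    positions = defaultdict(list)
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            positions[s[i:j]].append(i + 1)
    return positions


def maximal_repeats(s):
    """Return {substring: positions} for repeats that no extension preserves."""
    positions = occurrences(s)
    alphabet = set(s)
    result = {}
    for t, pos in positions.items():
        k = len(pos)
        if k < 2:
            continue
        # Assumed definition: each extension must occur strictly fewer times.
        left_ok = all(len(positions.get(c + t, ())) < k for c in alphabet)
        right_ok = all(len(positions.get(t + c, ())) < k for c in alphabet)
        if left_ok and right_ok:
            result[t] = pos
    return result


repeats = maximal_repeats("AAGATATGATAGGAT")
print(repeats["GATA"])  # [3, 8]
print(repeats["GAT"])   # [3, 8, 13]
```

Under that assumed definition, GA is not reported even though it occurs three times, because extending it to GAT keeps all three occurrences.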
8
votes
4 answers
Suffix trees in javascript?
Is there a nice implementation of suffix trees in JavaScript? Something that will take a string (and a separator) and make the appropriate suffix tree?

silverasm
- 501
- 5
- 10
8
votes
1 answer
Matches overlapping lookahead on LZ77/LZSS with suffix trees
Background: I have an implementation of a generic LZSS backend in C++ (available here). The matching algorithm I use in this version is exceedingly simple, because it was originally meant to compress relatively small files (at most 64kB) for…

flamewing
- 89
- 7
8
votes
3 answers
Short, Java implementation of a suffix tree and usage?
I'm looking for a short, simple suffix tree building/usage algorithm in Java. The best I've found so far lies within the Semantic Discovery Toolkit, but the implementation is several thousand lines long and spans several classes. Ideally, the…

Stefan Kendall
- 66,414
- 68
- 253
- 406
8
votes
9 answers
Efficient String/Pattern Matching in C++ (suffixarray, trie, suffixtree?)
I'm looking for an efficient data structure to do string/pattern matching on a really huge set of strings. I've found out about tries, suffix trees and suffix arrays. But I couldn't find a ready-to-use implementation in C/C++ so far (and…

Constantin
- 8,721
- 13
- 75
- 126
7
votes
4 answers
Optimizing construction of a trie over all substrings
I am solving a trie-related problem. There is a set of strings S. I have to create a trie over all substrings of each string in S. I am using the following routine:
String strings[] = { ... }; // array containing all strings
for(int i = 0; i <…

Bhoot
- 2,614
- 1
- 19
- 36
7
votes
2 answers
How to remove substring from suffix tree?
I have reviewed a lot of literature, but I haven't found any information about deleting substrings from, or inserting substrings into, a suffix tree. There are only Ukkonen's and McCreight's algorithms for building the tree.
The poorest way is to rebuild the tree after deleting or…

user2386656
- 71
- 3
6
votes
0 answers
Stream variant of the Longest palindromic substring
Suppose I have a character stream as my input. What is the most efficient way to find the longest palindromic substring after each new character is added, without reprocessing the whole string all over again?
After each new character comes in, I…
user78706
6
votes
0 answers
Haskell Data Type With References
I'm implementing Ukkonen's algorithm, which requires that all leaves of a tree contain a reference to the same integer, and I'm doing it in Haskell to learn more about the language. However, I'm having a hard time writing out a data type that does…

Craig
- 255
- 1
- 6
6
votes
1 answer
Find longest common substring of multiple strings using factor oracle enhanced with LRS array
Can we use a factor oracle with suffix links (paper here) to compute the longest common substring of multiple strings? Here, substring means any contiguous part of the original string. For example, "abc" is a substring of "ffabcgg", while "abg" is not.
I've…

Ray
- 1,647
- 13
- 16
6
votes
1 answer
Can I generate all substrings in complexity < O(n^2)
Currently I am using two nested for loops to generate all the substrings of a string. I heard about suffix trees, but AFAIK a suffix tree generates suffixes, not substrings. The following is the code I am currently using:
String s =…

ravi
- 6,140
- 18
- 77
- 154
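On the suffix-versus-substring point in this question: every substring is a prefix of some suffix, which is why a structure over suffixes can still answer substring queries. The sketch below only illustrates that relationship; it is still a pair of nested loops. A string of length n has n(n+1)/2 substrings counted with repeats, so writing them all out cannot take less than quadratic time. The function name is hypothetical.

```python
def substrings_via_suffixes(s):
    """Collect the distinct substrings of s as prefixes of its suffixes."""
    seen = set()
    for start in range(len(s)):
        suffix = s[start:]
        for end in range(1, len(suffix) + 1):
            # Each prefix of a suffix is a substring of s, and vice versa.
            seen.add(suffix[:end])
    return seen


subs = substrings_via_suffixes("abab")
print(sorted(subs, key=lambda t: (len(t), t)))
# ['a', 'b', 'ab', 'ba', 'aba', 'bab', 'abab']
```

A suffix tree represents the same set implicitly in linear space, which helps when the goal is to count or query substrings instead of listing every one.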
6
votes
4 answers
Working with suffix trees in python
I'm relatively new to Python and am starting to work with suffix trees. I can build them, but I'm running into a memory issue when the string gets large. I know that they can be used to work with DNA strings of size 4^10 or 4^12, but whenever I…

doggysaywhat
- 177
- 1
- 2
- 6
5
votes
1 answer
Finding all common, non-overlapping substrings
Given two strings, I would like to identify all common sub-strings from longest to shortest.
I want to remove any "sub-"sub-strings. As an example, any substrings of '1234' would not be included in the match between '12345' and '51234'.
string1 =…

mrmagicfluffyman
- 365
- 1
- 2
- 7
5
votes
1 answer
How is worst case time complexity of constructing suffix tree linear?
I have trouble understanding how the worst case time complexity of constructing a suffix tree is linear, particularly when we need to build a suffix tree for a string that may be composed of a single repeated character such as "aaaaa".
Even if I…

sia831
- 51
- 3
5
votes
4 answers
How to speed up calculation of length of longest common substring?
I have two very large strings and I am trying to find out their Longest Common Substring.
One way is to use suffix trees (supposed to have very good complexity, though a complex implementation), and another is the dynamic programming method…

Lazer
- 90,700
- 113
- 281
- 364
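For reference, the dynamic programming method mentioned in this question can be sketched as follows. This is one common formulation, not necessarily the asker's code: `curr[j]` holds the length of the longest common suffix of `a[:i]` and `b[:j]`, and only two rows are kept so memory stays proportional to `len(b)`. Running time is O(len(a) * len(b)), which is the cost a suffix-tree approach is meant to avoid on very large inputs. The function name is an assumption.

```python
def longest_common_substring(a, b):
    """Return one longest common substring of a and b using row-wise DP."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("xabcdy", "zzabcdq"))  # abcd
```

If only the length is needed, returning `best_len` avoids building the result string.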