A suffix tree is a data structure that stores all suffixes of a string. It is the basis for many fast algorithms on strings.
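To make "stores all suffixes" concrete, here is a minimal sketch of a naive suffix trie in Python. It is not a compressed suffix tree and not a linear-time construction: it inserts every suffix character by character, so it needs quadratic space in the worst case. The class name `SuffixTrie`, its methods, and the `$` terminator are assumptions chosen for illustration.

```python
class SuffixTrie:
    """Naive suffix trie: one dict node per character (hypothetical helper)."""

    def __init__(self, text, terminator="$"):
        # Assumption: the terminator character does not occur in the text.
        self.terminator = terminator
        self.root = {}
        text += terminator
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.setdefault(ch, {})

    def contains(self, pattern):
        """Return True if pattern is a substring of the indexed text."""
        node = self.root
        for ch in pattern:
            if ch not in node:
                return False
            node = node[ch]
        return True

    def suffixes(self):
        """Return every suffix of the text, including the empty one, sorted."""
        found = []
        stack = [(self.root, "")]
        while stack:
            node, prefix = stack.pop()
            for ch, child in node.items():
                if ch == self.terminator:
                    found.append(prefix)
                else:
                    stack.append((child, prefix + ch))
        return sorted(found)


trie = SuffixTrie("banana")
print(trie.contains("nan"))
print(trie.suffixes())
```

A real suffix tree merges chains of single-child nodes into edges labelled with substrings, which is what brings the space down to linear.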
Questions tagged [suffix-tree]
228 questions
1 vote, 1 answer
Generating suffixes from a Suffix Tree
I've built a suffix tree in Java based on the site here http://marknelson.us/1996/08/01/suffix-trees/ but I've run into a problem. I can build a suffix tree fine, but I am trying to build a set of all suffixes from the tree. I basically find all the…

Justin
0 votes, 1 answer
Concurrent insertions in a Suffix Tree
Some time ago I posted a question about saving/retrieving a Suffix Tree from disk. That's finally working fine, but now the construction is extremely slow, and I don't want to mess with Ukkonen's algorithm (linear construction) right now.
So, I…

juliomalegria
0 votes, 1 answer
What should I read to understand suffix trees?
I've come to understand that suffix trees are excellent and useful structures for a multitude of string related tasks, and I would like to learn more about them. Can anyone suggest a good starting point for UNDERSTANDING these things? That is, I…

Svein Bringsli
0 votes, 1 answer
Practical implementation of suffix array
Looking for a practical implementation of suffix arrays, I came across this paper. It outlines an O(n (log n * log n)) approach, where n is the length of the string. While there are faster algorithms available, IMO, none is suitable in a programming…

Abhijit Sarkar
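One common way to reach the O(n (log n * log n)) bound mentioned in the question above is prefix doubling with a comparison sort. The sketch below shows that general technique in Python. The excerpt does not identify the paper, so this is not necessarily the paper's method, and the function name `suffix_array` is an assumption.

```python
def suffix_array(text):
    """Build a suffix array by prefix doubling (hypothetical helper).

    Each round sorts suffixes by their first 2k characters using the
    ranks computed for the first k characters in the previous round.
    """
    n = len(text)
    if n == 0:
        return []
    order = list(range(n))
    rank = [ord(ch) for ch in text]
    k = 1
    while True:
        # Key: (rank of first k chars, rank of the next k chars or -1 past the end).
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        order.sort(key=key)
        # Re-rank: equal keys share a rank, a different key bumps it by one.
        new_rank = [0] * n
        for prev, cur in zip(order, order[1:]):
            new_rank[cur] = new_rank[prev] + (key(prev) != key(cur))
        rank = new_rank
        if rank[order[-1]] == n - 1:  # all ranks distinct: fully sorted
            break
        k *= 2
    return order


print(suffix_array("banana"))
```

There are O(log n) rounds and each round is dominated by an O(n log n) sort, which gives the stated bound. Replacing the comparison sort with a radix sort would remove one log factor at the cost of more code.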
0 votes, 0 answers
Finding FIRST occurrence of a substring using suffix trees
A suffix tree is an efficient data structure containing all suffixes of a given string.
Suffix trees support operations such as checking if a given substring exists in the string and returning all occurrences of a substring. I was wondering if it is…

Shaharg
0 votes, 0 answers
How to solve longest already present substring in O(n)?
Given a string a, I need to find, for every position i in a, the length of the longest substring b that starts at position i and was already present in a, which means that there exists i'…

quicker
0 votes, 0 answers
Data structure for determining all strings that contain a given substring
Let's suppose I have a dynamic list of strings, and I have a substring s. What data structure would be best for determining all possible strings in my list that contain the substring s?
I was thinking of using a suffix tree/array but those don't…

Shishir Oneal
0 votes, 0 answers
Serialize SuffixTree python
I am using the suffix-tree library:
https://pypi.org/project/suffix-tree/
tree = Tree()
for item_id, item in tqdm.tqdm(enumerate(items)):
    tree.add(item_id, item.lower())
I want to save a tree into a file
pickle.dump(tree, open('test.pkl',…

Not Found
0 votes, 1 answer
Debugging a pattern-matching algorithm
The user provides a text file to be searched and a pattern to search for. The program builds a suffix tree and uses it to find all occurrences of the pattern in text, then prints their indexes.
class Node:
    def __init__(self, start, substr):
…

Tsidia
0 votes, 0 answers
Suffix Tree - All common substrings
The problem is as follows:
Given 2 strings X and Y, I want to find all (longest) common substrings, that is, all substrings that appear in both X and Y and are maximal. For instance, if X = gttcatwg and Y = twgacgtt,
return gtt and twg, not…

Abdullah Garra
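As a baseline for the common-substring question above, the following Python sketch finds every longest common substring of two strings with the classic dynamic-programming table rather than a suffix tree. It covers only the "longest" reading of the question; the excerpt is truncated, so the exact maximality requirement is unknown. The function name `longest_common_substrings` is an assumption.

```python
def longest_common_substrings(x, y):
    """Return the set of longest substrings shared by x and y (hypothetical helper)."""
    best_len = 0
    best = set()
    # prev[j] = length of the longest common suffix of x[:i-1] and y[:j]
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    # Strictly longer match: restart the result set.
                    best_len = cur[j]
                    best = {x[i - cur[j]:i]}
                elif cur[j] == best_len:
                    best.add(x[i - cur[j]:i])
        prev = cur
    return best


print(sorted(longest_common_substrings("gttcatwg", "twgacgtt")))  # ['gtt', 'twg']
```

This runs in O(|X| * |Y|) time. A generalized suffix tree over both strings can solve the same problem in linear time, which is the approach the question title refers to.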
0 votes, 0 answers
How to Create Suffix Tree From String?
I want to create a suffix tree from a given string. This is what I have come up with so far. Although some of the nodes are correctly added, some are missing. I suspect that my add_node_list is not working correctly, but I can't find the reason…

Steven
0 votes, 1 answer
Heftiest repeated substring
I am looking for naming/literature/implementations for a variation on the longest repeated substring problem. In the cited problem you find the longest (consecutive) substring with at least 2 (non-overlapping) repetitions:
max len(s) | rep(s) > 1
In…

o17t H1H' S'k
0 votes, 2 answers
Algorithm to find all duplicate sequences of tokens in a long string
Let's say I have a really long string consisting of 10^6 tokens (for simplicity, a token is a space-separated word, so this string is split into a list of tokens).
Now I need to find all possible duplicated sequences and the start of the duplication…

Izik
0 votes, 2 answers
Java Suffix Trie exceeding heap space
I am implementing a suffix trie (this is different from a suffix tree) that stores the character suffixes of strings as nodes in a tree structure, where a string is made up by traversing the tree until you hit a '$' or you hit the end of…

Jonno_FTW