Questions tagged [suffix-tree]

A suffix tree is a data structure that stores all suffixes of a string. It is the basis for many fast algorithms on strings.
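
As a rough illustration of the idea, here is a minimal sketch of a naive suffix trie, which is the uncompressed relative of a suffix tree. It is not a true suffix tree and not Ukkonen's algorithm: it takes quadratic time and space, and the names `build_suffix_trie` and `contains` are made up for this example. It only shows why storing every suffix allows a substring check in time proportional to the pattern length.

```python
def build_suffix_trie(text):
    """Build a naive suffix trie as nested dicts (assumption: O(n^2) is acceptable)."""
    root = {}
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one character at a time.
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern is a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))  # True
print(contains(trie, "nab"))  # False
```

Every substring of a text is a prefix of one of its suffixes, so walking the pattern from the root either succeeds or fails at the first missing character. A real suffix tree compresses single-child chains into labeled edges to bring the space down to linear.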

228 questions
3
votes
1 answer

Search in implicit suffix tree constructed by Ukkonen algorithm

I encountered a problem which requires a data structure that will hold a string S and allow me to: check if word W is a subword of S in O(|W|) time; find the longest suffix of S that is also a prefix of a given word U in O(|U|) time; append string K at the…
KCH
3
votes
4 answers

Stuck finding deepest path in general tree traversal trying to find largest common substring

I am trying to solve the problem of largest common substring between 2 Strings. I will reduce my problem to the following: I created a general suffix tree and as per my understanding the largest common substring is the deepest path consisting of…
Cratylus
3
votes
3 answers

Can we use circular strings with Suffix Trees?

Can we use circular strings with Suffix Trees? So the last character is followed by the first in the list. If so, how is the representation of this suffix tree different from a normal suffix tree?
user1819636
2
votes
2 answers

suffix tree construction

I am going to implement a suffix tree for a given string. I think it should be declared like this: struct suffix { char letter; suffix *left, *right; }; suffix *insert(suffix *node, char *s) { } // here I am going to construct tree with all…
user466534
2
votes
2 answers

Ukkonen algorithm in C++

Is there an implementation of Ukkonen's algorithm for building a suffix tree in C++? An implementation in any high-level language would be good too.
shreyasva
2
votes
2 answers

Where would a suffix array be preferable to a suffix tree?

Two closely-related data structures are the suffix tree and suffix array. From what I've read, the suffix tree is faster, more powerful, more flexible, and more memory-efficient than a suffix array. However, in this earlier question, one of the…
templatetypedef
2
votes
0 answers

Given a list of strings find each of its string's closest match (edit distance) in another big list of strings

I have a list of strings small_list = ['string1', 'this is string 2', ...] and a larger list of strings big_list = ['is string 2', 'some other string 3', 'string 1', ...]. I want to find the string that is closest by edit distance for all of the…
user281989
2
votes
2 answers

Find K-most longest common suffix in a set of strings

I want to find the longest common suffixes in a set of strings to detect potentially important morphemes in my natural language processing project. Given frequency K>=2, find the K-most common longest suffix in a list of strings S1, S2, S3...SN To…
ken wang
2
votes
1 answer

String matching with an implicit representation of a suffix tree

From Data Structures and Algorithm Analysis in Java, Weiss writes: In the leaves, we use the index where the suffix begins (as in the suffix array). In the internal nodes, we store the number of common characters matched from the root until…
ivme
2
votes
0 answers

Printing Suffix Trees Edges in Python

I was looking through the code written by Ben Langmead on SuffixTrees. I am having a hard time figuring out how to print all the edges of the suffix tree. What is a way to store them in a set and save it in the object class? class…
oidanioi
2
votes
3 answers

Find all repeating patterns in paragraph

I have a problem at hand where I have to find all repeating patterns that exist inside a sentence. Example : 'camel horse game camel horse gym camel horse game' # This is the sanitized string as I will cleanup anything other than words before…
2
votes
2 answers

Finding the longest double suffix in linear time

Given a string s, find the longest double suffix in O(|s|) time. Example: for the string banana, the LDS is na. For abaabaa it's baa. Obviously I thought about using a suffix tree, but I'm having trouble finding the double suffix in it.
Xtreme Joe
2
votes
1 answer

Inserting new string to generalized suffix tree in linear time

If I have a generalized suffix tree and I want to insert a new string of length m, is it possible to do so in O(m) time, given that the total length of the strings already in the tree is M >> m?
Xtreme Joe
2
votes
2 answers

Suffix Trie in C++

I have been trying to write C++ code for a suffix trie; however, I want the code to keep a counter at each node of how often a character or substring appears during the suffix trie construction, bearing in mind that I am working with only 4…
perfecto
2
votes
0 answers

Find Substring of Trie Keys

This seems like it should be a common problem, yet I can't seem to find anything that exactly fits my needs. I have 2 sequence (FASTA) files. One is an assembly of the other, so I would like to make a trie or suffix tree out of the assembled…
Malonge