Questions tagged [suffix-tree]

A suffix tree is a compressed trie that stores all suffixes of a string. It is the basis for many fast string algorithms, such as substring search and finding the longest repeated substring.
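
To make the description above concrete, here is a minimal sketch of the underlying idea. Note the assumptions: it builds an uncompressed suffix trie with nested dicts in quadratic time and space, not a true linear-size suffix tree, and the names `build_suffix_trie` and `contains` as well as the `$` terminator are illustrative choices rather than part of any library.

```python
def build_suffix_trie(text, terminator="$"):
    """Naive suffix trie as nested dicts (O(n^2) time and space).

    Assumption: `terminator` does not occur in `text`.
    """
    text += terminator
    root = {}
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one character at a time.
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if `pattern` is a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))
print(contains(trie, "nab"))
```

Every substring of the text is a prefix of some suffix, which is why a substring query only has to walk down from the root. A real suffix tree additionally merges chains of single-child nodes into one edge so that the structure stays linear in size.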

228 questions
14
votes
8 answers

Longest Non-Overlapping Repeated Substring using Suffix Tree/Array (Algorithm Only)

I need to find the longest non-overlapping repeated substring in a string. I have the suffix tree and suffix array of the string available. When overlapping is allowed, the answer is trivial (deepest parent node in suffix tree). For example for…
14
votes
7 answers

Finding the longest repeated substring

What would be the best approach (performance-wise) in solving this problem? I was recommended to use suffix trees. Is this the best approach?
kukit
  • 307
  • 1
  • 3
  • 8
13
votes
2 answers

Ukkonen's algorithm for Generalized Suffix Trees

I am currently working on my own Suffix Tree implementation (using C++, but the question remains language agnostic). I studied the original paper from Ukkonen. The article is very clear so I got to work on my implementation and tried to tackle the…
Rerito
  • 5,886
  • 21
  • 47
11
votes
4 answers

Efficient way to find longest duplicate string for Python (From Programming Pearls)

From Section 15.2 of Programming Pearls. The C code can be viewed here: http://www.cs.bell-labs.com/cm/cs/pearls/longdup.c When I implement it in Python using suffix-array: example = open("iliad10.txt").read() def comlen(p, q): i = 0 for x…
Hanfei Sun
  • 45,281
  • 39
  • 129
  • 237
10
votes
1 answer

Suffix tree implementation in Python

Just wondering if you are aware of any C-based extension in Python that can help me construct suffix trees/arrays in linear time?
Abhi
  • 6,075
  • 10
  • 41
  • 55
10
votes
1 answer

Kasai Algorithm for Constructing LCP-Array Practical Example

I am attempting to complete the Algorithms on Strings course on Coursera and am stuck on the method to construct an LCP array described in this video: https://www.coursera.org/learn/algorithms-on-strings/lecture/HyUlH/computing-the-lcp-array I am…
10
votes
1 answer

Finding the Longest Common Substring in a Large Data Set

In the past few days I've researched this extensively, and I've read so many things that I am now more confused than ever. How does one find the longest common substring in a large data set? The idea is to remove duplicate content from this data set…
diffuse
  • 101
  • 1
  • 3
10
votes
1 answer

How to get longest repeating string in substring from suffix tree

I need to find the longest repeating string in a substring. Let's say I have the string "bannana". Wikipedia says the following: In computer science, the longest repeated substring problem is the problem of finding the longest substring of a string that…
Wakan Tanka
  • 7,542
  • 16
  • 69
  • 122
10
votes
1 answer

Generating suffix tree of string S[2..m] from suffix tree of string S[1..m]

Is there a fast (O(1) time complexity) way of generating a suffix tree of string S[2..m] from the suffix tree of string S[1..m]? I am familiar with Ukkonen's algorithm, so I know how to quickly build the suffix tree of string S[1..m+1] from the suffix tree of string…
9
votes
2 answers

How to use a Trie data structure to find the sum of LCPs for all possible substrings?

Problem Description: References: Fun With Strings Based on the problem description, a naive approach to find sum of length of LCP for all possible substrings (for a given string) is as follows : #include #include using…
Saurabh P Bhandari
  • 6,014
  • 1
  • 19
  • 50
9
votes
1 answer

Finding all repeated substrings in a string and how often they appear

Problem: I need all the sequences of characters that meet the following: Sequence of characters must be present more than once ((LE, 1) is thus invalid). Sequence of characters must be longer than one character ((M, 2) is thus invalid). Sequence…
alvitawa
  • 394
  • 1
  • 4
  • 12
9
votes
2 answers

Is it possible to count the number of distinct substrings in a string in O(n)?

Given a string s of length n, is it possible to count the number of distinct substrings in s in O(n)? Example Input: abb Output: 5 ('abb', 'ab', 'bb', 'a', 'b') I have done some research but I can't seem to find an algorithm that solves this problem…
donrondon
  • 103
  • 1
  • 1
  • 5
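
For reference, the count in the question above can be checked with a small sketch based on sorted suffixes. This is explicitly not the O(n) method the question asks about: it sorts the suffixes naively and compares neighbours character by character, so it is only meant to illustrate the counting identity (total substrings minus the prefixes shared by adjacent sorted suffixes). The function name `count_distinct_substrings` is a hypothetical one chosen for this example.

```python
def count_distinct_substrings(s):
    """Count distinct non-empty substrings of s.

    Uses: n(n+1)/2 minus the sum of longest common prefixes
    between adjacent suffixes in sorted order.
    Naive sorting makes this far slower than O(n).
    """
    n = len(s)
    suffixes = sorted(s[i:] for i in range(n))
    total = n * (n + 1) // 2
    shared = 0
    for a, b in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        shared += k
    return total - shared


print(count_distinct_substrings("abb"))  # 5, matching the example in the question
```

With a linear-time suffix array and LCP array (or a suffix tree or suffix automaton), the same identity gives the count in O(n) for a fixed alphabet.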
9
votes
3 answers

Successive adding of char to get the longest word in the dictionary

Given a dictionary of words and an initial character, find the longest possible word in the dictionary by successively adding a character to the word. At any given instance the word should be a valid word in the dictionary. Example: a -> at -> cat -> cart…
AlgoMan
  • 2,785
  • 6
  • 34
  • 40
9
votes
1 answer

Maximum and minimum number of nodes in a suffix tree

What are the maximum and minimum number of nodes in a suffix tree? And how can I prove it?
user1819636
8
votes
6 answers

Given a string, find all its permutations that are words in a dictionary

This is an interview question: Given a string, find all its permutations that are words in a dictionary. My solution: Put all words of the dictionary into a suffix tree and then search each permutation of the string in the tree. The search time…
user1002288
  • 4,860
  • 10
  • 50
  • 78