Questions tagged [string-matching]

String matching is the problem of finding occurrences of one string (“pattern”, “needle”) in another (“text”, “haystack”).

There are two types of string matching:

  • Exact
  • Approximate

Exact string matching is the problem of finding occurrence(s) of a pattern string within another string or body of text (NIST). For example, finding CGATCGATTA in CTAGATCCTGCGATCGATTAAGCCTGA.
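As a minimal sketch of the example above, Python's built-in `str.find` reports the position of the first exact occurrence, and a small loop can collect every occurrence, including overlapping ones. The helper name `find_all` is hypothetical and chosen only for illustration; it does not come from the NIST definition.

```python
def find_all(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text (overlaps included)."""
    if not pattern:
        return []  # assumption: an empty pattern is treated as having no matches
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping occurrences are not skipped.
        start = text.find(pattern, start + 1)
    return positions


text = "CTAGATCCTGCGATCGATTAAGCCTGA"
pattern = "CGATCGATTA"
print(text.find(pattern))       # 10 (0-based index of the first occurrence)
print(find_all(pattern, text))  # [10]
print(find_all("aa", "aaaa"))   # [0, 1, 2]
```

This relies on the language's built-in search rather than a specific named algorithm, which is usually enough for a single short pattern.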

A comprehensive online reference of string matching algorithms is Exact String Matching Algorithms by Christian Charras and Thierry Lecroq.

Approximate string matching, also called fuzzy string matching, searches for parts of the text that are close to the pattern rather than identical to it, with closeness measured by the edit distance between the two.
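To make the edit-distance idea concrete, here is a rough sketch under stated assumptions: the distance is taken to be Levenshtein distance (insertions, deletions, and substitutions each cost 1), and the search uses the standard dynamic-programming variant in which a match may start anywhere in the text. The names `edit_distance`, `fuzzy_find`, and `max_dist` are hypothetical, introduced for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_find(pattern: str, text: str, max_dist: int) -> list[tuple[int, int]]:
    """Return (end, distance) for each text position where some substring
    ending just before index `end` is within max_dist edits of pattern."""
    # Column for the empty text prefix: pattern[:i] against nothing costs i.
    prev = list(range(len(pattern) + 1))
    hits = []
    for end, ct in enumerate(text, start=1):
        curr = [0]  # a match may start anywhere in the text, so this row is free
        for i, cp in enumerate(pattern, start=1):
            cost = 0 if cp == ct else 1
            curr.append(min(prev[i] + 1, curr[i - 1] + 1, prev[i - 1] + cost))
        if curr[-1] <= max_dist:
            hits.append((end, curr[-1]))
        prev = curr
    return hits


print(edit_distance("kitten", "sitting"))   # 3
print(fuzzy_find("abc", "xxabdxx", 1))      # [(4, 1), (5, 1)]
print(fuzzy_find("CGATCGATTA", "CTAGATCCTGCGATCGATTAAGCCTGA", 0))  # [(20, 0)]
```

With `max_dist` set to 0 the search degenerates to exact matching, which is why the last call reports a single hit ending at index 20. Both functions run in time proportional to the product of the two string lengths, so this is a teaching sketch rather than a tuned implementation.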

2278 questions
17 votes, 5 answers

One of strings in array to match an expression

The Problem: I have an array of promises which is resolved to an array of strings. Now the test should pass if at least one of the strings matches a regular expression. Currently, I solve it using simple string…
alecxe
17 votes, 4 answers

Detect position of first difference in 2 strings

What is the cleanest way of finding the position of the first difference in any two strings in Javascript? var a = 'in the'; var b = 'in he'; findFirstDiffPos(a, b); // 3 var c = 'in the beginning'; findFirstDiffPos(a, c); // 6
galki
17 votes, 3 answers

Delete to end of line after a match, keep lines not matched

I have a file of the following form: interesting text-MIB blah blah blah VERY INTERESTING TEXT interesting text-MIB blah blah blah In each line containing the "-MIB" string, I would like to delete the text following this string, until the end of…
Youri_Margarine
17 votes, 2 answers

Using Rabin-Karp to search for multiple patterns in a string

According to the Wikipedia entry on the Rabin-Karp string matching algorithm, it can be used to look for several different patterns in a string at the same time while still maintaining linear complexity. It is clear that this is easily done when all the…
MAK
16 votes, 7 answers

Returning the lowest index for the first non whitespace character in a string in Python

What's the shortest way to do this in Python? string = "   xyz" must return index = 3
Pablo
16 votes, 2 answers

Approximate substring matching using a Suffix Tree

This article discusses approximate substring matching techniques that utilize a suffix tree to improve matching time. Each answer addresses a different algorithm. Approximate substring matching attempts to find a substring (pattern) P in a string T…
Pooven
15 votes, 1 answer

tsql last "occurrence of" inside a string

I have a field containing comma-separated values. I need to extract the last element in the list. I have tried this: select list_field, LTRIM(RTRIM(right(list_field, len(list_field) - CHARINDEX(',',list_field)))) But it returns the last part…
Alberto De Caro
15 votes, 2 answers

python - regex search and findall

I need to find all matches in a string for a given regex. I've been using findall() to do that until I came across a case where it wasn't doing what I expected. For example: regex = re.compile('(\d+,?)+') s = 'There are 9,000,000 bicycles in…
armandino
15 votes, 4 answers

Fastest way to test two strings for exact match in JavaScript

I want to compare two strings in JavaScript to test if they are exactly the same. Which would be the best (fastest) way to do this? Right now, I'm considering either if(string1.localeCompare(string2) == 0) {} or simply if(string1 == string2) Is…
atreju
14 votes, 5 answers

Check if a string is a possible abbreviation for a name

I'm trying to develop a Python algorithm to check if a string could be an abbreviation for another word. For example fck is a match for fc kopenhavn because it matches the first characters of the word. fhk would not match. fco should not match fc…
Björn Lindqvist
14 votes, 1 answer

Search a particular string in a vector (Octave)

I am trying to find a string in a vector. For example: query = "ab" in vector = ["ab", "cd", "abc", "cab"]. The problem is that strfind(vector,query) returns all the indices whose strings contain "ab". In this case "ab" including "abc"…
user3713665
14 votes, 2 answers

Check if a string ends with a suffix in Emacs Lisp

Is there a function that checks that a string ends with a certain substring? Python has endswith: >>> "victory".endswith("tory") True
Mirzhan Irkegulov
13 votes, 3 answers

String matching objective-c

I need to match my string in this way: *myString* where * means any substring. Which method should I use? Can you help me, please?
Dany
13 votes, 6 answers

R: Replacing foreign characters in a string

I'm dealing with a large amount of data, mostly names with non-English characters. My goal is to match these names against some information on them collected in the USA, i.e., I might want to match the name 'Sølvsten' (from some list of names) to…
krishnan