Questions tagged [string-matching]

String matching is the problem of finding occurrences of one string (“pattern”, “needle”) in another (“text”, “haystack”).

There are two types of string matching:

  • Exact
  • Approximate

Exact string matching is the problem of finding occurrence(s) of a pattern string within another string or body of text (NIST). For example, finding CGATCGATTA in CTAGATCCTGCGATCGATTAAGCCTGA.
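As a rough illustration of the example above, here is a minimal Python sketch that reports every start position of a pattern in a text. The helper name `find_all` is hypothetical (it does not come from NIST or any library); it simply wraps the built-in `str.find`.

```python
def find_all(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Overlapping occurrences are included, because the search resumes
    one character after the previous hit.
    """
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


# The DNA example from the tag description (indices are 0-based).
print(find_all("CGATCGATTA", "CTAGATCCTGCGATCGATTAAGCCTGA"))  # [10]
```

For long texts or many patterns, a dedicated algorithm such as those catalogued in the reference below may be a better fit than repeated calls to `str.find`.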

A comprehensive online reference of string matching algorithms is Exact String Matching Algorithms by Christian Charras and Thierry Lecroq.

Approximate string matching, also called fuzzy string matching, searches for matches based on the edit distance between the pattern and the text.
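To make "edit distance" concrete, the following is a minimal sketch of the Levenshtein distance (insertions, deletions, and substitutions each cost 1), computed with the usual row-by-row dynamic programme. The function name `edit_distance` is an assumption for illustration; other edit-distance variants weight operations differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit-cost edits)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

A fuzzy matcher would typically accept a candidate when this distance falls below some threshold chosen for the data at hand.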

2278 questions
0 votes, 2 answers

Checking if elements of an array are included in elements of another array

I'm trying to match 2 arrays and check if all elements of the first array are included in the second one. My first array looks like this and it contains a series of selected ingredients selectedOptions: Array [ "egg", "spaghetti", …
Butri
0 votes, 1 answer

Calculate % of matched records from two column using Pandas

I need pandas code to calculate % of matched records. Suppose I have two column Hotel_name and Property_name and total records is 100 and 30 records matched from both the column, then % matched records should be 30%.
0 votes, 1 answer

How to map display sizes to standard sizes in a pandas dataframe column?

I understand the title might not be very clear, but please hear me out. I have a pandas dataframe column of ~850 unique display sizes, e.g. 1 320x480 2 480x320 3 382x215 4 676x320 5 694x320 6 1080x2123 7 2094x1080 8 1080x2020 I…
lightyagami96
0 votes, 0 answers

How can I perform the following transformation to the given dataset?

I am a beginner in Python and I have no idea how to perform the following transformation and analysis. I have a file that looks similar to the following example: RecordID Comments name …
0 votes, 1 answer

How do I match elements of an array to a given string using regex?

I have this Java method which compares the elements of an array of strings with a string variable. This method requires two arguments: a string haystack and an array of strings needles. If the length of the needles array is…
Caleb Oki
0 votes, 3 answers

PHP Query - Check string character

if (substr('xcazasd123', 0, 2) === 'ax'){ } The code above works for checking whether the "variable" starts with 'ax', but what if I wanted to check "multiple" different values, for example 'ax', 'ab', 'ac'? Without creating…
greenboxgoolu
0 votes, 1 answer

Is there a way to check if two strings are almost identical

I am scraping 3 websites for their product data. The websites all belong to big supermarket chains in my region, and since the supermarkets are in the same region they usually sell the same products. I want to make one curated collection…
0 votes, 1 answer

Extract a dataframe from a list of dataframes containing a substring

I have the following dataframes in python that are part of a list dataframe_list= []## CREATE AN EMPTY LIST import pandas as pd A=pd.DataFrame() A["name"]=["A", "A", "A"] A["att"]=["New World", "Hello", "Big Day…
Raghavan vmvs
0 votes, 1 answer

How to use a variable result in a match/select-string - Powershell

I am looking through logfiles for which client holds an application session. This isn't logged clearly, so you have to first find the client number getting updates, then check the log again for which client was assigned the number. I cannot complete…
Hargaut
0 votes, 1 answer

ERROR: syntax error at or near "INTEGER" LINE 2: v_stage INTEGER:=0;

DECLARE v_stage INTEGER:=0; [...] RETURN QUERY SELECT 1.0::FLOAT, v_stage, sex, birthdate, place, district, subdistrict, village, race, complexion, eyecolor, haircolor, height, weight, hp, mother, father, picture, sidenote, …
0 votes, 1 answer

Regex - exclude word within string

I want to match everything except “eps1@” within this string: HelpMe_pleas3_eps1@ I tried [^eps1@] but this matched everything except every instance of “e”, “p”, “s”, “1”, “@”. I just want “eps1@” excluded.
TR79
0 votes, 1 answer

Fuzzywuzzy match 2 columns... script keeps running

I'm trying to match 2 columns of ~50.000 instances with Fuzzywuzzy. Column A (companies) contains company names, with some typos. Column B (correct) contains the correct company names. I'm trying to match the typo ones with correct ones. When…
user34624
0 votes, 2 answers

String matching and manipulating in R

I make progress in cleaning data like this: df1 <- data.frame(ID=(c("18.1010-2.570322","171114-238509","140808-3481906 ","18055656193","180625-378224","190903-2793831 / -9311442 / -6810125","190808-625-6692","190 807 -…
0 votes, 1 answer

Adding a dash to a string

I think I have a simple question, but I can't figure it out. I have something like this: df <- data.frame(identifier = c("9562231945200505501901190109-5405303 ", "190109-8731478", "1901098260031", " .9..43675190109-3690341",…
0 votes, 1 answer

Is there an alternative for Dart's splitMapJoin in JavaScript or TypeScript?

I have a function in Dart which uses a regular expression to process matched and non-matched parts of the string differently. Future> _getQuestionParts(String questionText) async { List parts =…