Questions tagged [fuzzy]

DO NOT USE - ambiguous: see fuzzy-search, fuzzy-logic, or image-processing for more appropriate tags.

363 questions
1
vote
1 answer

What is the simplest way to implement fuzzy relational composition of two matrices in R?

What is the simplest way to implement fuzzy relational composition of two matrices in R? I coded a version of it, but it's supposedly very slow, so I wonder whether there are vectorized operations that can make it faster. circ_prod <- function(R,S) { …
monotonic
1
vote
0 answers

Elasticsearch - best query and index for partial and fuzzy search

I thought this scenario must be quite common, but I was unable to find the best way to do it. I have a big dataset of products. All the products have this kind of schema: { "productID": 1, "productName": "Whatever", "productBoost": 1234 }…
user256173
1
vote
2 answers

Extract rows from a data frame in R based on fuzzy match string

My string is "Escherichia coli str Nissle 1917" and I want to extract from a df all the rows containing a similar string in a specific column (the organism name column). The result should be the following: # assembly_accession bioproject biosample…
Ste40
1
vote
2 answers

Snowflake - Joining two tables where one table's IDs are delimited by a semicolon

I am working within Snowflake and some of the IDs in a given table and column are delimited by semicolons. Despite this delimiter the tables should still be joined. Any attempt to join the table is usually met with an error of some sort. Below I…
tisaconundrum
1
vote
1 answer

Fuzzy matching two long character vectors in R

I have two vectors: Candidates$names, containing roughly 45,000 names of electoral candidates, and Incumbents$names, containing roughly 7,600 names of members of parliament. I want to check for each of the names in Candidates whether it exists in…
1
vote
0 answers

Faster stringdist_inner_join

I am trying to match company names from two large databases (4,000 and 23,000 entries). However, it is taking a lot of time. Is there any way to speed up this process, for example by parallelizing this code? I have been searching here but found nothing relevant…
1
vote
0 answers

Is there a way to fuzzy match or provide a score as an assumption of what ID or Group the row value should be associated with?

I have a dataset that looks like this structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), Date = c("2020-01- \n04", "2020-04-03", "2020-12-10", "2020-09-12", "2020-11-19", "2020-04- \n03", "2020-06-03", "2020-05-03", "2020-08-09",…
user35131
1
vote
0 answers

Fuzzy Logic: The tipping problem - output never goes past a threshold?

I am new to fuzzy logic with Python and have been working with the skfuzzy library. The first example that I have looked at is the tipping problem, here: https://pythonhosted.org/scikit-fuzzy/auto_examples/plot_tipping_problem_newapi.html I…
7- Alison
1
vote
2 answers

Fuzzy matching with long sentence(s)

Suppose I have the following dataframe: ID CompanyName JobDescription 1 Green Grass LLC "In the centre of Green Grass area..." 2 Johnny Inc. "Johnny is currently looking for data analist that..." 3 …
teller.py3
1
vote
1 answer

String matching per row of two columns in a dataframe

Say I have a pandas dataframe that looks like this: ID String1 String2 1 The big black wolf The small wolf 2 Close the door on way out door the Close 3 where's the money where…
Mikee
1
vote
1 answer

Need to roughly group similar query executions in Oracle that have slightly different constraints

I am working on assessing the impact of the planned retirement of some current databases. Individual communication with the users who have recently accessed the impacted data is not feasible because of the volume. I'm thinking that if I can do some form of fuzzy logic…
1
vote
1 answer

ElasticSearch - Unable To Search Using Fuzzy Match Query For Underscore in value (ES Fuzzy not matching underscore value)

Suppose I have three documents in my Elasticsearch index. For example: 1: { "name": "test_2602" } 2: { "name": "test-2602" } 3: { "name": "test 2602" } Now when I search using a fuzzy match query as given below { "query": { "bool":…
Nishant
1
vote
1 answer

match_most_similar in Python string_grouper returning original strings

I have a list of strings that are messy, and I want to find, for each one, its best match from a list of cleanly-formatted strings, which also contains metadata about each. The strings in the messy list are repeated randomly through the list…
Tom Clark
1
vote
0 answers

AFL not taking input from Stdin

I am trying to fuzz a binary that takes input from the user (stdin). When I run afl-fuzz followed by my binary, something like afl-fuzz a.out, it asks for the required parameters that specify the input and output directories. afl-fuzz -i…
Obaid Ur Rehman
1
vote
2 answers

Select Rows from Different Table where String from 1st table column is present in R

I am trying to match tables when a string is fully present in the other table's column. So far I have managed to join them only partially, and I then apply the Levenshtein distance to get close matches. This approach has limited use and accuracy.…
marine8115