Can Cypher do phonetic text search with only a part of the text, without using elastic search?

Question

Say I have a job as financial administrator (j:Job {name: 'financial administrator'}).

Many people use different titles for a 'financial administrator'. Therefore, I want the above-mentioned job to be a hit even if people type only 'financial' or 'administrator', and even if their input has typos (like 'fynancial').

CONTAINS only returns results when the match is exact, i.e. without typos.

Thanks a lot!

We are not yet there, but we will get there. – jose_bacoy, Mar 03 '21 at 18:44

score 0 · Answer 1 · answered Mar 06 '21 at 14:10

First, you could try fuzzy matching with a full-text index and see if it solves the issue. Set up the index:

CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {})

Then query the index with fuzzy matching (note the trailing ~):

CALL db.index.fulltext.queryNodes('jobs', 'fynancial~')
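To see why the ~ operator catches this typo: Lucene's fuzzy matching is based on edit distance, and by default a term matches when it is at most 2 edits away from an indexed term. The following is only a rough sketch of that idea in plain Java. The class name EditDistanceSketch is made up for illustration, and Lucene itself uses a Damerau-Levenshtein automaton rather than this table-based calculation.

```java
public class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance:
    // the minimum number of insertions, deletions and substitutions.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full table
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // One substitution (y -> i), so well within a 2-edit budget.
        System.out.println(levenshtein("fynancial", "financial")); // prints 1
    }
}
```

Assuming the index uses an analyzer that splits the name into separate words, 'financial administrator' is stored as the terms 'financial' and 'administrator', so the fuzzy term is compared against each word on its own. That is what makes partial input work.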

If you want to go further and use Lucene's phonetic searches, then you could write a little Java code to register a custom analyzer.

Include the lucene-analyzers-phonetic dependency like so:

    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-analyzers-phonetic</artifactId>
        <version>8.5.1</version>
    </dependency>

Then create a custom analyzer:

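// Package names below are those of Neo4j 4.x and Lucene 8.x.
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.phonetic.DoubleMetaphoneFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.neo4j.annotations.service.ServiceProvider;
import org.neo4j.graphdb.schema.AnalyzerProvider;

// Registers an analyzer named "phonetic" that full-text indexes can refer to.
@ServiceProvider
public class PhoneticAnalyzer extends AnalyzerProvider {

    public PhoneticAnalyzer() {
        super("phonetic");
    }

    @Override
    public Analyzer createAnalyzer() {
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                Tokenizer tokenizer = new StandardTokenizer();
                // Max code length 6; 'true' keeps the original token alongside its phonetic code.
                TokenStream stream = new DoubleMetaphoneFilter(tokenizer, 6, true);
                return new TokenStreamComponents(tokenizer, stream);
            }
        };
    }
}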

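I used the DoubleMetaphoneFilter, but you can experiment with others. Package the class as a JAR, put it into Neo4j's plugins directory along with the Lucene phonetic JAR, and restart the server. Then create a full-text index using this analyzer (if the 'jobs' index from the first step still exists, drop it first with CALL db.index.fulltext.drop('jobs'), or pick a different index name):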

CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {analyzer:'phonetic'})

Querying the index looks the same:

CALL db.index.fulltext.queryNodes('jobs', 'fynancial')

score 0 · Accepted Answer · answered Apr 17 '21 at 12:44

It took a while, but this is how I solved my problem.

MATCH (a)-[:IS]->(hs)
UNWIND a.naam AS namelist
CALL apoc.text.phonetic(namelist) YIELD value
WITH value AS search_str, SPLIT('INPUT FROM DATABASE', ' ') AS input, a
CALL apoc.text.phonetic(input) YIELD value
WITH value AS match_str, search_str, a
WHERE search_str CONTAINS match_str OR search_str = match_str
RETURN DISTINCT a.naam, labels(a)
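For anyone wondering why this tolerates typos: apoc.text.phonetic is documented as computing the US English Soundex encoding of the words in a text, and Soundex mostly ignores vowels. Below is a rough sketch of that idea in plain Java, not APOC's actual code. The class name SoundexSketch and the helper phonetic() are made up for illustration, and the assumption that the per-word codes are simply concatenated is mine.

```java
import java.util.Locale;

public class SoundexSketch {

    // Soundex digit for each letter A..Z; '0' means "no code" (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    // Simplified American Soundex: first letter plus up to three digits, zero-padded.
    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (s.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder().append(s.charAt(0));
        char last = CODES.charAt(s.charAt(0) - 'A');
        for (int i = 1; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            char code = CODES.charAt(c - 'A');
            if (code != '0' && code != last) {
                out.append(code); // new consonant sound
            }
            // H and W do not separate two letters with the same code; vowels do.
            if (c != 'H' && c != 'W') {
                last = code;
            }
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    // Assumption: encode every word and concatenate the codes.
    static String phonetic(String text) {
        StringBuilder sb = new StringBuilder();
        for (String w : text.split("[^A-Za-z]+")) {
            sb.append(soundex(w));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("financial")); // prints F552
        System.out.println(soundex("fynancial")); // prints F552
        // Mirrors the CONTAINS check in the query above.
        System.out.println(phonetic("financial administrator").contains(soundex("fynancial"))); // prints true
    }
}
```

Because 'i' and 'y' both get no code, the typo disappears in the encoding, and the CONTAINS check on the encoded strings is what allows matching on just one word of the job name.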
