First, you could try fuzzy matching with a full-text index and see whether that solves the issue.
An example would be:
Set up the index:
CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {})
Query the index with fuzzy matching (note the ~):
CALL db.index.fulltext.queryNodes('jobs', 'fynancial~')
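To see why the ~ query above finds the misspelled term: Lucene's fuzzy matching is based on edit distance, and by default a term matches when it is within two edits of an indexed term. The sketch below is only an illustration of that idea in plain Java. The class and method names (EditDistanceSketch, editDistance) are made up for this example, and Lucene itself uses a more efficient automaton-based approach that also counts a transposition as a single edit.

```java
public class EditDistanceSketch {

    // Plain Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    // Illustrative only; this is not Lucene's actual implementation.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // One substitution (y -> i), so the fuzzy query can match "financial".
        System.out.println(editDistance("fynancial", "financial")); // prints 1
    }
}
```

Terms that are further apart than the allowed number of edits will not match, which is where the phonetic approach below can help.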
If you want to go further and use Lucene's phonetic searches, then you could write a little Java code to register a custom analyzer.
Include the lucene-analyzers-phonetic dependency like so:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-phonetic</artifactId>
    <version>8.5.1</version>
</dependency>
Then create a custom analyzer:
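import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.phonetic.DoubleMetaphoneFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.neo4j.annotations.service.ServiceProvider;
import org.neo4j.graphdb.schema.AnalyzerProvider;

// The Neo4j package names above assume Neo4j 4.x; adjust them for your version.
@ServiceProvider
public class PhoneticAnalyzer extends AnalyzerProvider {
    public PhoneticAnalyzer() {
        super("phonetic"); // the name used in the index's analyzer setting
    }

    @Override
    public Analyzer createAnalyzer() {
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                Tokenizer tokenizer = new StandardTokenizer();
                // Max code length 6; inject=true keeps the original token alongside its phonetic codes.
                TokenStream stream = new DoubleMetaphoneFilter(tokenizer, 6, true);
                return new TokenStreamComponents(tokenizer, stream);
            }
        };
    }
}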
I used the DoubleMetaphoneFilter, but you can experiment with others.
Package it as a jar, put it into Neo4j's plugins directory along with the Lucene phonetic jar, and restart the server.
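Then create a full-text index using this analyzer (if the 'jobs' index from the first example still exists, drop it first with CALL db.index.fulltext.drop('jobs') or pick a different index name):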
CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {analyzer:'phonetic'})
Querying the index looks the same, except that the ~ is no longer needed:
CALL db.index.fulltext.queryNodes('jobs', 'fynancial')