How to tell if two natural language queries have the same meaning

Question

I am building a system that converts natural language questions into SQL queries. Right now I am implementing a step that rewrites the natural language question into a more structured form, so that it is easier to convert into a SQL statement.

The restructured language will follow these rules:

1. What they want to do, e.g. "Find", "List", "Give"
2. Attributes they want us to retrieve, e.g. table attributes from the SQL schema
3. Entities that they want us to match on

This restructured language works well and can easily be transformed into SQL, but the problem is that I generate a large number of combinations of all the noun chunks and entities, which means lots of candidate sentences. Future development will help reduce these, but that is for later.

So from this large number of sentences, I need to find the one that is most similar to the original query.

So my question is: what kind of similarity functions would you recommend? E.g. parse tree structure, semantic and syntactic similarity...

Thanks for the help. I am building this as open source, so any help is going to a good cause.

These queries are going to be run with read-only permissions, right? If so, you might want to clearly specify that to prevent database security people from showing up at your door with pitchforks and torches. — Ray, Apr 12 '19 at 17:12

score 0 · Answer 1 · answered Apr 12 '19 at 15:56

Have you tried spaCy's .similarity method? You can run all of the queries through spaCy's pipeline to get their Doc objects very quickly, and then do something along the lines of nlp_original_query.similarity(nlp_other_query). I have had a lot of success using this to compare the similarity of queries and keywords.
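As a rough sketch of that idea (not a definitive implementation), the snippet below ranks candidate sentences against the original query with spaCy's Doc.similarity. The function names rank_by_similarity and demo, the example sentences, and the choice of the en_core_web_md model are my own assumptions for illustration; any spaCy model that ships with word vectors should work in its place.

```python
def rank_by_similarity(nlp, original, candidates):
    """Return (candidate_text, score) pairs, most similar first.

    `nlp` is a loaded spaCy pipeline; `original` is the user's question and
    `candidates` is a list of restructured sentences (hypothetical inputs).
    """
    original_doc = nlp(original)
    # nlp.pipe processes the candidates as a stream, which is usually
    # faster than calling nlp() on each sentence separately.
    candidate_docs = nlp.pipe(candidates)
    scored = [(doc.text, original_doc.similarity(doc)) for doc in candidate_docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def demo():
    # Imported here so the ranking function above has no hard dependency.
    import spacy

    # Assumption: a model with word vectors is installed, e.g. via
    #   python -m spacy download en_core_web_md
    nlp = spacy.load("en_core_web_md")

    original = "Which students are enrolled in the database course?"
    candidates = [
        "Find student name where course title is database",
        "List course title where student name is database",
        "Give instructor name where department is database",
    ]
    for text, score in rank_by_similarity(nlp, original, candidates):
        print(f"{score:.3f}  {text}")
```

Calling demo() prints the candidates in descending order of similarity. Keep in mind that with the standard models this score is the cosine similarity of averaged word vectors, so it mostly ignores word order; two candidates that use the same words in different roles can score almost identically. The small (_sm) models do not include real word vectors, so a medium or large model is the safer choice here.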
