
I am trying to build a database Q/A chatbot (more specifically, a Natural Language Interface to Database), and I am having trouble extracting the entities/slots from the natural language query.

Take this example table:

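| Interns | Branch                 | Birthday   | Salary ($/h) |
|---------|------------------------|------------|--------------|
| A       | Mechanical Engineering | 2000-01-20 | 25           |
| B       | IT Engineering         | 1999-05-09 | 45           |
| A       | Electrical Engineering | 2000-01-20 | 35           |
| C       | Mechanical Engineering | 2002-09-13 | 35           |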

Example questions that a user may ask about this table:

  • What is the total salary for intern A? - Desired entities: {Interns: A}
  • Tell me the aggregate salary for A, B and C. - Desired entities: {Interns: [A, B, C]} (note that the column name isn't mentioned)
  • Which Interns are pursuing Mechanical Engineering Branch? - Desired entities: {Branch: Mechanical Engineering}

Question:

How can I identify these entities/slots?

  • This answer to a similar question suggests using rule-based recognizers, but I couldn't find out how to build them.
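For reference, one simple form of rule-based recognizer is a gazetteer lookup built from the table's own cell values. The sketch below is only an illustration under that assumption, not the method from the linked answer; `ROWS`, `build_gazetteer` and `extract_entities` are hypothetical names, and only the two text columns of the sample table are included.

```python
import re

# Sample rows copied from the table above (text columns only).
ROWS = [
    {"Interns": "A", "Branch": "Mechanical Engineering"},
    {"Interns": "B", "Branch": "IT Engineering"},
    {"Interns": "A", "Branch": "Electrical Engineering"},
    {"Interns": "C", "Branch": "Mechanical Engineering"},
]


def build_gazetteer(rows, columns):
    """Map each distinct cell value to the column it came from."""
    gazetteer = {}
    for row in rows:
        for column in columns:
            gazetteer[row[column]] = column
    return gazetteer


def extract_entities(question, gazetteer):
    """Return {column: [values]} for every table value found in the question."""
    found = {}
    for value, column in gazetteer.items():
        # Whole-word, case-sensitive match, so intern "A" is not
        # confused with the lowercase article "a".
        pattern = r"\b" + re.escape(value) + r"\b"
        if re.search(pattern, question):
            found.setdefault(column, []).append(value)
    return {column: sorted(values) for column, values in found.items()}


gazetteer = build_gazetteer(ROWS, ["Interns", "Branch"])
print(extract_entities("Tell me the aggregate salary for A, B and C.", gazetteer))
```

Because the gazetteer is rebuilt from the table, values added to the table later are picked up without retraining. The obvious weakness is ambiguity: a question that begins with the capitalised article "A" would be read as intern A, so very short values probably need extra context rules.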

Things I have tried:

  1. Creating a custom Named Entity Recognition model with spaCy to identify the Interns and Branch names. The model identified values that appeared in the training data but failed to identify new values.
  2. Rules based on part-of-speech tagging. This approach was partly successful but wasn't generic: it may not work if the same question is phrased differently.
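Both attempts above break on input they have not seen before. One possible way to soften that, sketched here purely as an assumption and not something I have tried, is to compare word windows of the question against the known column values with the standard-library `difflib`, so that misspelled values still match. `COLUMN_VALUES`, `fuzzy_entities` and the `cutoff` of 0.9 are hypothetical choices, and single-letter values such as the intern names are left out because fuzzy matching on one character is not meaningful.

```python
import difflib
import re

# Distinct Branch values copied from the sample table above.
COLUMN_VALUES = {
    "Branch": ["Mechanical Engineering", "IT Engineering", "Electrical Engineering"],
}


def fuzzy_entities(question, column_values, cutoff=0.9):
    """Match word windows of the question against known values, tolerating typos."""
    tokens = re.findall(r"\w+", question.lower())
    found = {}
    for column, values in column_values.items():
        for value in values:
            target = value.lower()
            width = len(target.split())
            # Slide a window with as many tokens as the value has words.
            for start in range(len(tokens) - width + 1):
                window = " ".join(tokens[start:start + width])
                score = difflib.SequenceMatcher(None, window, target).ratio()
                if score >= cutoff:
                    found.setdefault(column, []).append(value)
                    break
    return found


print(fuzzy_entities("Which interns are persuing Mechanicl Engineering?", COLUMN_VALUES))
```

The cutoff trades recall against false positives: values that share a long suffix, such as the three "... Engineering" branches, score fairly high against each other, so a lower cutoff would start confusing them.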
Jay Shukla
  • Have you tried a [Table question answering model](https://huggingface.co/google/tapas-base-finetuned-wtq) from Hugging Face? – SilentCloud Sep 27 '21 at 09:08
  • Yes. Unfortunately, it fails to answer the question because of the large table size (~10000 x ~40). – Jay Shukla Sep 27 '21 at 09:14
