
In information retrieval or question answering systems, we use TF-IDF or BM25 to compute the similarity score of a question-question pair, either as a baseline or as a coarse ranking stage before a deep learning model.

In community question answering, we already have question-answer pairs from which we can collect statistics. Without deep learning, could we invent an algorithm like BM25 to compute the relevance score of a question-answer pair?

What are some ways to do it?

Tiago Duque
DunkOnly

1 Answer


Without deep learning, could we invent an algorithm like BM25 to compute the relevance score of a question-answer pair?

Yes, there are many ways to do it. To make your question a little more directed, let's answer: "What are the possible ways to compute the relevance of a question-answer pair without using deep learning?"

Some examples and explanations:

  • TF-IDF (which you mentioned) is actually a term-weighting (feature extraction) technique. It tells you which words are present in each document and how important they are, so you can compare two texts by the heavily weighted words they share. BM25 builds on the same idea, adding term-frequency saturation and document-length normalization.

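  • Another technique is PageRank, the link-analysis algorithm that Google was originally built on. It ranks items by the links between them rather than by their wording, so you need some graph over your questions and answers to apply it. You can attempt to replicate it yourself, since it is not too complex.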

  • One other way is to use graphs. I did this in my Master's research, and you can read my dissertation here.
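To make the first two ideas concrete, here is a minimal Python sketch, not a reference implementation. It treats each candidate answer as a "document" and the question as the "query" for Okapi BM25, and it includes a plain power-iteration PageRank. The function names (`tokenize`, `bm25_scores`, `pagerank`), the whitespace tokenizer, the parameter values `k1=1.5`, `b=0.75`, `damping=0.85`, and the toy data are all my own assumptions for illustration. How you would build a link graph over questions and answers is left open here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_scores(question, answers, k1=1.5, b=0.75):
    """Score each candidate answer against the question with Okapi BM25.

    Assumes `answers` is a non-empty list with at least one non-empty answer.
    k1 and b are common default values, not tuned for any dataset.
    """
    docs = [tokenize(a) for a in answers]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many answers does each term occur?
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(question):
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [nodes it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by dangling nodes (no outgoing links) is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for node, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical toy data, only to show how the functions are called.
    question = "how do i reset my router password"
    answers = [
        "hold the reset button on the router to restore the default password",
        "the weather is nice today",
        "reset the router and set a new password in the admin page",
    ]
    print(bm25_scores(question, answers))
    print(pagerank({"a": ["c"], "b": ["c"], "c": ["a"]}))
```

Note that BM25 only rewards words shared between the question and the answer. A good answer often uses different vocabulary than the question, so in practice you would probably combine a lexical score like this with other statistics collected from your existing question-answer pairs.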

Aside from that, I'd advise you to check these papers for other examples of question answering (you can get to question-answer matching easily once you understand the concepts): https://www.sciencedirect.com/science/article/pii/S0020025511003860 and https://www.sciencedirect.com/science/article/pii/S1319157815000890?via%3Dihub.

Also, keep checking ACL State of the Art Question Answering Techniques for the most up-to-date results and techniques.

Tiago Duque