How to use BERT for semantic similarity tasks with long texts?

Asked Apr 05 '23 at 17:42

Active Apr 05 '23 at 17:47

Viewed 66 times

I'm trying to analyze the semantic relationship between a sentence and an article text using BERT, in order to determine whether the article contradicts the sentence, reaffirms it, or has no significant relationship with it.

However, because of BERT's maximum input length, I couldn't find any straightforward solution for handling the article text. So far, I have found 3 approaches:

1. Truncate the article text (only use the beginning of it, for example).
2. Split the article text into several smaller texts, apply BERT to each of them, and then combine all the vectors (how?).
3. Use a different model with a distinct attention mechanism, like Longformer.
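
For approach 2, one common way to combine the vectors is to slide an overlapping window over the article tokens, embed each window separately, and average the window vectors. Below is a minimal sketch of that idea, not a definitive implementation. It assumes the usual 512-token limit (510 content tokens once [CLS] and [SEP] are added). The names `split_into_windows`, `mean_pool`, `embed_long_text` and `toy_embed` are hypothetical, and `toy_embed` is only a stand-in for a real BERT forward pass so that the sketch runs without downloading a model.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def split_into_windows(tokens: Sequence[str], window_size: int = 510,
                       stride: int = 255) -> List[List[str]]:
    """Split tokens into overlapping windows of at most window_size tokens."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + window_size]))
        if start + window_size >= len(tokens):
            break  # this window already reaches the end of the text
        start += stride
    return windows


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty list of vectors")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_long_text(tokens: Sequence[str],
                    embed_window: Callable[[List[str]], Vector],
                    window_size: int = 510, stride: int = 255) -> Vector:
    """Embed every window with embed_window, then average the results."""
    windows = split_into_windows(tokens, window_size, stride)
    return mean_pool([embed_window(w) for w in windows])


def toy_embed(window: List[str], dim: int = 16) -> Vector:
    """Hypothetical stand-in for BERT: a tiny bag-of-tokens count vector."""
    counts = [0.0] * dim
    for token in window:
        counts[sum(map(ord, token)) % dim] += 1.0
    return counts
```

In practice `embed_window` would wrap a real encoder (for example, the [CLS] vector or mean-pooled hidden states of a BERT model), and the sentence would be embedded with the same function before calling `cosine_similarity`. Note that an averaged embedding mainly captures topical relatedness; to separate contradiction from agreement, it may work better to score each window against the sentence with a classifier and aggregate the per-window scores.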

I already tried 1 and I'm looking to improve the results with 2 or 3, but I can't find good articles or posts explaining these approaches and/or giving examples.

Could someone point me in the right direction and/or provide a simple example of them? I tried GPT with several prompts, but none of its suggestions worked.

edited Apr 05 '23 at 17:47

asked Apr 05 '23 at 17:42

Neves

Option 4: apply a topic abstract/summarization model. It offers the advantage of letting you validate whether you have a sensible abstract for the long article. – J_H Apr 05 '23 at 17:46
