BERT fine-tuned for semantic similarity

Question

I would like to fine-tune BERT to calculate semantic similarity between sentences. I have searched a lot of websites, but I have found almost nothing about this as a downstream task.

I only found the STS benchmark. I wonder whether I can use the STS benchmark dataset to fine-tune a BERT model and then apply it to my task. Is that reasonable?

As far as I know, there are many methods to calculate similarity, including cosine similarity, Pearson correlation, Manhattan distance, etc. How should I choose one for semantic similarity?

check this package https://pypi.org/project/similar-sentences/ — Shankar Ganesh Jayaraman, Apr 16 '20 at 12:18

score 2 · Answer 1 · answered Dec 04 '19 at 14:24

In addition, if you're after a binary verdict (yes/no for 'semantically similar'), BERT was actually benchmarked on this task, using MRPC (the Microsoft Research Paraphrase Corpus). The Google GitHub repo https://github.com/google-research/bert includes some example calls for this; see --task_name=MRPC in the section "Sentence (and sentence-pair) classification tasks".

Igor


HuggingFace has an example and a BERT fine-tuned on MRPC here - https://github.com/huggingface/transformers/tree/master/examples – Adnan S Dec 05 '19 at 01:21
Thanks for your advice. I know that, but my task isn't binary. I have 100,00 questions and descriptions of 300 different items, and I would like to match an item to every question. – Chad Dec 05 '19 at 05:57

dennlinger · Accepted Answer · 2019-12-04T10:01:18.940

As a general remark up front, I want to stress that this kind of question might not be considered on-topic on Stack Overflow; see How to ask. There are, however, related sites that might be better for these kinds of questions (no code, theoretical PoV), namely AI Stack Exchange or Cross Validated.

If you look at a rather popular paper in the field by Mueller and Thyagarajan, which is concerned with learning sentence similarity with LSTMs, they use a closely related dataset (the SICK dataset), which is also hosted by the SemEval competition and ran alongside the STS benchmark in 2014.

Either one of those should be a reasonable set to fine-tune on, but STS has run over multiple years, so the amount of available training data might be larger.

As a great primer on the topic, I can also highly recommend the Medium article by Adrien Sieg (see here), which comes with an accompanying GitHub reference.

For semantic similarity, I would estimate that you are better off fine-tuning (or training) a neural network, as most of the classical similarity measures you mentioned focus more on token similarity (and thus on syntactic similarity, although not even that necessarily). Semantic meaning, on the other hand, can sometimes differ wildly because of a single word (maybe a negation, or two words swapping positions in the sentence), which is difficult to interpret or evaluate with static methods.
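To make the point about token-based measures concrete, here is a minimal sketch in plain Python. It assumes a simple bag-of-words representation (word counts, no embeddings), which is only one of several ways cosine similarity or Manhattan distance could be applied. The helper names and the example sentences are made up for illustration and do not come from any of the linked resources.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its word tokens (illustrative tokenizer)."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def manhattan_distance(a, b):
    """Sum of absolute count differences over the union of tokens."""
    return sum(abs(a[token] - b[token]) for token in a.keys() | b.keys())


# Hypothetical example sentences, chosen to show the failure modes.
original = bag_of_words("the movie was good")
negated = bag_of_words("the movie was not good")
paraphrase = bag_of_words("the film was great")

# A negation changes the meaning but barely changes the token counts.
print("negated   :", cosine_similarity(original, negated))
# A paraphrase keeps the meaning but shares few tokens.
print("paraphrase:", cosine_similarity(original, paraphrase))

# Swapping two words leaves the bag of words unchanged.
dog_bites = bag_of_words("the dog bit the man")
man_bites = bag_of_words("the man bit the dog")
print("swapped   :", manhattan_distance(dog_bites, man_bites))
```

Under these assumptions, the negated sentence scores higher (about 0.89) than the paraphrase (0.5), and the two sentences with swapped words are at distance 0, even though their meanings differ. A model fine-tuned on sentence pairs can, in principle, learn to tell these cases apart.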

Thanks for your help. I am not familiar with this platform. I will check the "How to ask" rules again. — Chad, Dec 04 '19 at 09:58
My bad, formatting got caught up with me here. I'll extend the formatting ASAP — dennlinger, Dec 04 '19 at 10:00
check this dataset paper, a tuned Bert for semantic similarity arxiv.org/abs/2004.10349 — Fuji, Apr 29 '20 at 17:59
