
I have a set of synonyms for example like this:

big large
large huge
small little
apple banana

This means big is a synonym for large, large is a synonym for huge, small for little, and apple for banana, and vice versa (large is a synonym for big, etc.). In addition, "big" is a synonym for "huge" and "huge" is a synonym for "big" because of the indirect relationship via "large".

This should be something like a thesaurus? But I'm not sure how the data structure should look.

choki1708

2 Answers


"Many different aspects of language have a natural representation as graphs. Graphs can also be used to describe how words relate to one another semantically. Within each word class, words are grouped into sets of synonyms, so-called synsets." - according to this article.

For example, the synset for the word 'banana' is (elongated crescent-shaped yellow fruit with soft sweet flesh) according to WordNet. Synsets are linked to one another by semantic relationships, so you can find a semantically similar synset for the word 'apple' (fruit with red or yellow or green skin and sweet to tart crisp whitish flesh).

You can use this Ruby gem to build a graph using the WordNet database.

steimo

One simple option would be an array of arrays like:

[
  ['big', 'large', 'huge'],
  ['small', 'little']
]
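If pairs arrive one at a time, a later pair can link two groups that already exist, so those groups have to be merged. Here is a minimal sketch of one way to do that; `build_synonym_groups` and `synonyms_in_groups?` are hypothetical helper names, and the sample pairs are made up for illustration:

```ruby
require 'set'

# Hypothetical helper: builds synonym groups from word pairs, merging
# existing groups whenever a new pair connects them.
def build_synonym_groups(pairs)
  groups = []
  pairs.each do |a, b|
    # Split the groups into those that contain either word and the rest.
    touching, rest = groups.partition { |g| g.include?(a) || g.include?(b) }
    # Fold the touching groups and the new pair into a single group.
    merged = touching.reduce(Set[a, b]) { |acc, g| acc | g }
    groups = rest << merged
  end
  groups.map(&:to_a)
end

# Hypothetical helper: two words are synonyms if they share a group.
def synonyms_in_groups?(groups, a, b)
  groups.any? { |g| g.include?(a) && g.include?(b) }
end

# The last pair links two groups that were created separately.
pairs = [%w[big large], %w[small little], %w[huge enormous], %w[large huge]]
groups = build_synonym_groups(pairs)

puts synonyms_in_groups?(groups, 'big', 'enormous')  # => true
puts synonyms_in_groups?(groups, 'big', 'small')     # => false
```

This rescans every group for each pair, which is fine for small inputs; a union-find (disjoint-set) structure would likely scale better for large ones.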

Alternatively, if e.g. huge is not a synonym of big in your model, then you might want a hash like:

{
  big: ['large'],
  large: ['big', 'huge'],
  huge: ['large'],
  small: ['little', 'tiny'],
  little: ['small'],
  ...
}

It really depends on what you plan to do with it.

olleicua
  • Yes, I tried the first solution, which is the correct one. But doing a union between the groups is giving me some problems. I end up with something like: `[ ['big', 'large', 'huge'], ['small', 'little'], ['something', 'huge'], ['something_else', 'large'], ]` – choki1708 Feb 14 '22 at 15:58
  • The problem is that the link between some synonyms might only be established later, based on a 3rd or 4th synonym pair, so I would have to union existing arrays – choki1708 Feb 14 '22 at 16:05
  • I'm not sure what you mean by "doing a union" in this context – olleicua Feb 15 '22 at 16:35
  • Yup, that is exactly what I did in the end. I got exactly that. – choki1708 Feb 24 '22 at 07:58
  • I needed to create a thesaurus and figure out if 2 words are synonyms. I created graph like this and then just used DFS to go over it. – choki1708 Feb 24 '22 at 08:00
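
The graph-plus-DFS approach described in the last comment could look roughly like the sketch below. It is not the asker's actual code: `build_synonym_graph` and `synonyms?` are hypothetical names, and the graph is assumed to be undirected, with each word mapped to the set of its direct synonyms.

```ruby
require 'set'

# Builds an undirected graph: each word maps to the set of its direct synonyms.
def build_synonym_graph(pairs)
  graph = Hash.new { |hash, word| hash[word] = Set.new }
  pairs.each do |a, b|
    graph[a] << b
    graph[b] << a
  end
  graph
end

# Iterative depth-first search: true if `to` is reachable from `from`,
# i.e. the two words are direct or indirect synonyms.
def synonyms?(graph, from, to)
  # Unknown words have no synonyms; key? avoids creating empty entries.
  return false unless graph.key?(from) && graph.key?(to)

  visited = Set.new
  stack = [from]
  until stack.empty?
    word = stack.pop
    return true if word == to
    next unless visited.add?(word) # add? returns nil if already visited

    graph[word].each { |neighbor| stack.push(neighbor) unless visited.include?(neighbor) }
  end
  false
end

pairs = [%w[big large], %w[large huge], %w[small little], %w[apple banana]]
graph = build_synonym_graph(pairs)

puts synonyms?(graph, 'big', 'huge')   # => true (indirectly, via "large")
puts synonyms?(graph, 'big', 'small')  # => false
```

With this layout, adding a pair later is just two set insertions, so no existing groups need to be merged; the cost moves to lookup time, where each query walks the connected component.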