We want to create a multi-language phrasebook / dictionary for a specific domain.
Right now I'm thinking about the best data structure / data model for it.
Since it should be more of a phrasebook than a dictionary, we want to keep the data model simple at first. It will only be used for fast translation: the user selects two languages, types a word, and gets the translation. The article and description fields are for display only, not for search.
There are some specific cases I'm thinking about:
- One term can be expressed with several (1..n) words in any language
- Any term can also be translated into several (1..m) words in another language
- In some languages the word's article could be important to know
- For some words a description could be important (e.g. for words from dialects etc.)
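The cases above could be written down as a small typed model, sketched here in Python. The class names `Entry` and `WordGroup` are hypothetical and only serve as illustration; the field names follow the JSON draft in this question, and treating `article`, `description` and `plural` as optional is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One word that expresses a term in one language."""
    word: str
    article: Optional[str] = None      # only filled where the language needs it
    description: Optional[str] = None  # e.g. a usage note for dialect words
    plural: Optional[str] = None


@dataclass
class WordGroup:
    """A term with its 1..n words per language (hypothetical container)."""
    wordgroup_id: int
    # language code -> all words for this term in that language
    entries: Dict[str, List[Entry]] = field(default_factory=dict)


group = WordGroup(
    wordgroup_id=1,
    entries={
        "en": [Entry("car", plural="cars"), Entry("vehicle", plural="vehicles")],
        "de": [Entry("Auto", article="das", plural="Autos")],
    },
)
```

Keeping the optional fields on the entry rather than on the group means each language can fill in only what it needs.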
I'm not sure about one point: am I reinventing the wheel by creating a data model myself? I couldn't find any existing solutions.
I've just drafted a JSON data model, but I'm not sure whether it is good enough:
    [
      {
        wordgroup-id: 1,
        en: [
          {word: 'car', plural: 'cars'},
          {word: 'auto', plural: 'autos'},
          {word: 'vehicle', plural: 'vehicles'},
        ],
        de: [
          {word: 'Auto', article: 'das', description: 'Some explanation eg. when to use this word', plural: 'Autos'},
          {word: 'Fahrzeug', article: 'das', plural: 'Fahrzeuge'}
        ],
        ru: [...],
        ...
      },
      {
        wordgroup-id: 2,
        ...
      },
      ...
    ]
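To make the intended use concrete (pick two languages, type a word, get the translation), here is a minimal sketch of a lookup over this structure. The function names `build_index` and `translate` are hypothetical, the sample data only repeats words from the draft, and matching case-insensitively is an assumption, not a requirement stated above.

```python
from collections import defaultdict

# Sample data in the shape of the draft model (only the fields needed here).
WORD_GROUPS = [
    {
        "wordgroup-id": 1,
        "en": [{"word": "car"}, {"word": "auto"}, {"word": "vehicle"}],
        "de": [
            {"word": "Auto", "article": "das"},
            {"word": "Fahrzeug", "article": "das"},
        ],
    },
]


def build_index(groups):
    """Map (language, lowercased word) -> list of word groups containing it."""
    index = defaultdict(list)
    for group in groups:
        for lang, entries in group.items():
            if lang == "wordgroup-id":
                continue  # the id is metadata, not a language
            for entry in entries:
                index[(lang, entry["word"].lower())].append(group)
    return index


def translate(index, source_lang, target_lang, word):
    """Return all target-language entries from every group matching the word."""
    results = []
    for group in index.get((source_lang, word.lower()), []):
        results.extend(group.get(target_lang, []))
    return results


index = build_index(WORD_GROUPS)
print([e["word"] for e in translate(index, "en", "de", "car")])
# prints ['Auto', 'Fahrzeug']
```

The index is built once, so each lookup is a single dictionary access, and only `word` is indexed, which matches the idea that article and description are for display only.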
I also thought about the "corner" cases @triplee wrote about. My idea is to handle them with some kind of redundancy. Only the word group id and the word within a language should be unique.
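One possible reading of this rule is that a word may appear in several word groups (the redundancy), but each word group id occurs once and a word is not repeated within one language of the same group. Under that assumption, a check could look like the sketch below; `find_violations` is a hypothetical helper and the sample data is made up to trigger both kinds of problems.

```python
def find_violations(groups):
    """Return a list of human-readable uniqueness problems."""
    problems = []
    seen_ids = set()
    for group in groups:
        gid = group["wordgroup-id"]
        if gid in seen_ids:
            problems.append(f"duplicate wordgroup-id {gid}")
        seen_ids.add(gid)
        for lang, entries in group.items():
            if lang == "wordgroup-id":
                continue  # the id is metadata, not a language
            seen_words = set()
            for entry in entries:
                if entry["word"] in seen_words:
                    problems.append(
                        f"group {gid}: '{entry['word']}' repeated in '{lang}'"
                    )
                seen_words.add(entry["word"])
    return problems


# Illustrative data: 'Fahrzeug' in two groups is allowed (redundancy),
# but repeating it inside one group or reusing an id is reported.
sample = [
    {"wordgroup-id": 1, "de": [{"word": "Auto"}, {"word": "Fahrzeug"}]},
    {"wordgroup-id": 2, "de": [{"word": "Fahrzeug"}, {"word": "Fahrzeug"}]},
    {"wordgroup-id": 2, "en": [{"word": "car"}]},
]
problems = find_violations(sample)
```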
I would be very thankful for any feedback on this first draft of the data model.