I need to build an inverted index for a text corpus that contains multiple languages. I have already tokenized the corpus into words according to certain rules. After reading through the Weaviate documentation, though, I'm not sure Weaviate can properly support this requirement.
I plan to use the string data type and build the inverted index over words in different languages by inserting spaces between the segmented words. Will this work? I'm also going to spend some time trying it out to see what happens.
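
To make the idea concrete, here is a minimal sketch in plain Python of what I have in mind. It does not use Weaviate at all: `join_tokens` and `build_inverted_index` are hypothetical helpers, the sample documents are made up, and the sketch assumes the stored string is split on spaces only, which is exactly the behavior I am unsure about.

```python
from collections import defaultdict


def join_tokens(tokens):
    """Join pre-segmented words with single spaces (hypothetical helper)."""
    return " ".join(tokens)


def build_inverted_index(docs):
    """Toy inverted index: token -> set of document ids.

    Assumes the text is split on spaces only, so the pre-computed
    word boundaries are preserved regardless of language.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split(" "):
            if token:  # skip empty strings from repeated spaces
                index[token].add(doc_id)
    return index


# Made-up sample corpus, already segmented by my own rules.
segmented = {
    "doc1": ["我", "喜欢", "搜索", "引擎"],
    "doc2": ["search", "engine", "搜索"],
}

# This joined text is what I would store in the string property.
docs = {doc_id: join_tokens(tokens) for doc_id, tokens in segmented.items()}
index = build_inverted_index(docs)

print(sorted(index["搜索"]))
```

If Weaviate splits the string property the same way, a lookup for a single segmented word should match every document containing it, whatever the language.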