How can I improve the slow speed of full-text indexing?

Question

Our version is 3.4.0. Our main issue is that building the full-text index is very slow. We have a cluster of three storage nodes, with three listeners running on these nodes. After a rebuild, index construction is extremely slow: it may only index around 5 million records in 20 hours (roughly 70 records per second). In our scenario, more than 50 properties require full-text indexing, and our dataset holds billions of property values. Could you please advise on how to speed up rebuilding the full-text index? Is it possible to specify which index should be built first during the rebuild? Are there any methods to accelerate the construction of the full-text index?

We have three physical machines. If we create more Docker containers and run multiple listeners inside those containers, will that speed up the rebuild?

If we cannot improve the speed of attribute synchronization, we are considering querying the data and generating the Elasticsearch statements ourselves, then storing the documents directly. Could you please explain how your document IDs are generated? The ID must be tied to the data being updated, so how should we generate it? For example, "_id": "FIPCJILONJHKKKGBMFCKOBNNDKBFCMPLJHDMODHGMFHHAHOOPHFPHFJIKKLMEGIF".

Furthermore, when querying, the full-text search as given only returns the first page of results from Elasticsearch. The example does not provide scroll pagination, so our only option is to increase the index size in Elasticsearch. Is there a way to perform paginated searches, similar to Elasticsearch's scroll feature?
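
To make the pagination request concrete, here is a rough sketch of the behaviour we are after, written directly against Elasticsearch's scroll endpoints (`_search?scroll=...` and `_search/scroll`). The base URL, index name, and the helper names `http_post` and `scroll_hits` are our own placeholders, not part of any existing client, and this bypasses the graph database's query layer entirely.

```python
import json
import urllib.request
from typing import Callable, Iterator


def http_post(url: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def scroll_hits(
    base_url: str,
    index: str,
    query: dict,
    page_size: int = 1000,
    keep_alive: str = "1m",
    post: Callable[[str, dict], dict] = http_post,
) -> Iterator[dict]:
    """Yield every hit for `query`, one scroll page at a time."""
    # The first request opens the scroll context and returns the first page.
    page = post(
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": query},
    )
    # An empty page means the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each follow-up request passes the scroll ID from the previous page.
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )
```

Something like `for hit in scroll_hits("http://localhost:9200", "my_index", {"match_all": {}}): ...` would then walk the whole result set instead of only the first page. The `post` parameter is only there so the transport can be swapped out, for example in tests.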

Can I perform indexing on a specific table separately? What is the encoding rule for _id in Elasticsearch?

