I am building a full-text search engine and have a performance problem when displaying results with a description. Retrieving the results for a query works, but performance drops when I fetch each document's text and highlight the part that contains the keyword. The documents are PDF, TXT, DOC, DOCX, HTML, and so on. My search engine works like this:
- I have a DB table where I store the document text
- I have a DB table where I index the text with its frequency
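
A minimal sketch of this two-table layout, assuming SQLite and hypothetical table and column names (`documents`, `term_index`); the actual database and schema are not shown here, so treat every name below as illustrative only:

```python
import re
import sqlite3
from collections import Counter


def create_schema(conn):
    # Hypothetical schema: one table for raw text, one for term frequencies.
    conn.executescript("""
        CREATE TABLE documents (
            id   INTEGER PRIMARY KEY,
            body TEXT NOT NULL
        );
        CREATE TABLE term_index (
            term      TEXT NOT NULL,
            doc_id    INTEGER NOT NULL REFERENCES documents(id),
            frequency INTEGER NOT NULL,
            PRIMARY KEY (term, doc_id)
        );
    """)


def index_document(conn, doc_id, body):
    # Store the text, then one row per distinct lowercased word with its count.
    conn.execute("INSERT INTO documents (id, body) VALUES (?, ?)", (doc_id, body))
    counts = Counter(re.findall(r"\w+", body.lower()))
    conn.executemany(
        "INSERT INTO term_index (term, doc_id, frequency) VALUES (?, ?, ?)",
        [(term, doc_id, n) for term, n in counts.items()],
    )


def search(conn, term):
    # Return (doc_id, frequency) pairs, most frequent first.
    return conn.execute(
        "SELECT doc_id, frequency FROM term_index "
        "WHERE term = ? ORDER BY frequency DESC, doc_id",
        (term.lower(),),
    ).fetchall()
```

With this layout, a search touches only `term_index`; the document text is needed only when a description has to be built.
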
Is this design reasonable at all? To build a description I have to search the index, fetch the document, parse its text, split it into sentences, and keep only the sentences that contain the keyword. The search times without descriptions are:
**Крушевското Востание 1903** 0,00518989562988
**Даме Груев** 0,00394678115845
**Даме Груев и Гоце Делчев** 0,0916090011597
**Државен празник Илинден** 0,0072648525238
**Даме** 0,00195503234863
**Александар Македонски** 0,0423209667206
**Бранко Црвенковски и Никола Груевски** 0,0233609676361
**СДСМ и ВМРО-ДПМНЕ** 0,0295231342316
**Македонија** 0,0435738563538
**Никола Груевски и Македонија** 0,0451180934906
The search keywords are in my native language, and the collection contains 3679 documents. When I add a description built from the matching sentences, displaying the results becomes 10-20 times slower (around 2-3 seconds). The search is written in Python.
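
The description step described above (split the text into sentences, keep those with the keyword, highlight the match) can be sketched roughly as follows. This is not the actual code: the function name `make_description`, the naive punctuation-based sentence splitter, and the `<b>` highlight tags are all assumptions made for illustration.

```python
import re

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def make_description(text, keyword, max_sentences=2):
    """Return up to max_sentences sentences containing keyword, highlighted."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    matches = []
    for sentence in SENTENCE_END.split(text):
        if pattern.search(sentence):
            # Wrap every occurrence of the keyword in <b>...</b>.
            highlighted = pattern.sub(
                lambda m: "<b>" + m.group(0) + "</b>", sentence.strip()
            )
            matches.append(highlighted)
            if len(matches) == max_sentences:
                break  # stop early instead of scanning the whole document
    return " ... ".join(matches)
```

Even in this small form, the full text of every result has to be loaded and scanned at query time, which is where the extra cost per result comes from.
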
Any suggestions?