I have a bunch of HTML pages. The idea is to let users enter keywords to search through these pages; only the pages that match the criteria will be stored for later reference. I know that Elasticsearch can index HTML, PDF, and more, but I already use PostgreSQL as my database and my system is small, so I don't want to add Elasticsearch as an extra dependency for this project.

I have a few questions:

  • Because an HTML page won't be stored unless it matches the user's keywords, is there a better approach that avoids having to index the page in the search engine first just to be able to search it, and then remove it afterward if it doesn't match the criteria?

  • If so, is it possible to index whole HTML content, like in Elasticsearch?
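
To make the "match first, store only on a match" workflow concrete, here is a minimal sketch that checks a page against the keywords in memory, before anything is written anywhere. It is only an illustration under stated assumptions: the helper names (`html_to_text`, `matches_keywords`, `select_pages_to_store`) are hypothetical, matching is plain case-insensitive whole-word matching rather than real full-text search, and the `MATCH_SQL` string is an assumed way to run a comparable check in PostgreSQL on a literal value, without a table or index. It is not executed here.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Hypothetical helper: strip tags and return the page text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def matches_keywords(html, keywords):
    """True if every keyword occurs as a whole word in the page text."""
    words = set(re.findall(r"\w+", html_to_text(html).lower()))
    return all(keyword.lower() in words for keyword in keywords)


def select_pages_to_store(pages, keywords):
    """Return the names of the pages that match; only these would be stored."""
    return [name for name, html in pages.items() if matches_keywords(html, keywords)]


# Assumption: a comparable check in PostgreSQL, evaluated on parameters
# (page text, keywords) rather than on a stored, indexed column.
MATCH_SQL = "SELECT to_tsvector('english', %s) @@ plainto_tsquery('english', %s)"
```

For example, `select_pages_to_store({"a.html": "<h1>PostgreSQL search</h1>", "b.html": "<p>Nothing here</p>"}, ["search"])` would keep only `a.html`; the non-matching page is never stored or indexed.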

Thanks a lot for your help.

channa ly

0 Answers