
I'm using JSoup to parse HTML from pages I scrape. Over long runs of the program, I kept seeing RAM usage climb. When a user of my program reported an OutOfMemoryError, I investigated with VisualVM. It turns out the JSoup objects/nodes instantiated while parsing the scraped HTML pages never get garbage collected, so the more I parse, the more accumulates on the heap.

For some reason, the JSoup nodes, keys, values, etc. are not garbage collected at all. I'm obviously looking for a solution: how can I force these elements to be garbage collected, or otherwise release the parsed documents?
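For context, this is roughly the pattern I expected to be safe: JSoup `Document`s are ordinary heap objects, so they should become collectible as soon as nothing references them. A minimal sketch (class and method names are my own, not from JSoup) that copies only plain `String`s out of the parse tree, so no JSoup node escapes the method:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ScrapeExample {

    // Returns plain strings only; the Document and its nodes go out of
    // scope when this method returns, making them eligible for GC.
    static List<String> extractLinkTexts(String html) {
        Document doc = Jsoup.parse(html);           // parse tree lives only here
        List<String> texts = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            texts.add(link.text());                 // copy the String, not the Element
        }
        return texts;                               // no JSoup objects escape
    }

    public static void main(String[] args) {
        String html = "<html><body><a href='/a'>First</a><a href='/b'>Second</a></body></html>";
        System.out.println(extractLinkTexts(html)); // prints [First, Second]
    }
}
```

If a `Document`, `Element`, or anything reachable from them (even a single `Node`) is stored in a field, cache, or collection that outlives the parse, the entire parse tree stays reachable and cannot be collected.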

Thanks in advance.

I ran a test doing as much parsing as possible and, as expected, RAM usage kept climbing non-stop. I cannot find anything in the API about releasing parsed documents or nodes, or anything else that would get rid of them.
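One way to narrow this down is a weak-reference check: as far as I know JSoup keeps no global registry of parsed documents, so a `Document` with no remaining strong references should be reclaimed after a GC. This is a sketch for verifying that in your own environment (the retry loop is needed because `System.gc()` is only a request); if the weak reference never clears, something in the surrounding code, such as a cache, listener, or static field, is still holding a node:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.lang.ref.WeakReference;

public class GcCheck {

    // Parses a page, drops the only strong reference to the Document,
    // then asks the JVM to collect. Returns true once the weak
    // reference has been cleared, i.e. the parse tree was reclaimed.
    static boolean documentIsCollected() throws InterruptedException {
        Document doc = Jsoup.parse("<html><body><p>hello</p></body></html>");
        WeakReference<Document> ref = new WeakReference<>(doc);
        doc = null;                 // nothing else points at the parse tree now
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();            // a request, not a guarantee
            Thread.sleep(10);
        }
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(documentIsCollected()
                ? "Document was garbage collected"
                : "Document is still strongly reachable somewhere");
    }
}
```

On stock HotSpot JVMs this typically reports the document as collected within one or two GC requests, which would point the leak at retained references in the caller rather than at JSoup itself.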

