
I've created a web scraper with Python (3.6) and Selenium, using the Firefox web driver. I've set up a cron job to run this scraper every few minutes, and it all seems to be working, except that over time (a few days), the memory on my Ubuntu VPS (8 GB RAM, Ubuntu 18.04.4) fills up and it crashes.

When I check htop, I can see lots (as in, hundreds) of Firefox processes like "/usr/lib/firefox -marionette" and "/usr/lib/firefox -contentproc", each taking up about 3 or 4 MB of memory.

I've put

browser.stop_client()
browser.close()
browser.quit()

in every function that uses the web driver, but I suspect the script sometimes leaves web drivers open when it hits an error instead of closing them properly, and these Firefox processes just accumulate until my system crashes.

I'm working on finding the root cause of this, but in the meantime, is there a quick way I can kill/clean up all these processes?

e.g. a cronjob that kills all matching processes (older than 10 minutes)?

Thanks.

aforbes

1 Answer


> I suspect the script is sometimes leaving web drivers open when it hits an error, and not closing them properly

This is most likely the issue. You can fix it by wrapping the driver code in a try/except/finally block, so the browser is shut down even when an error occurs.

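from selenium import webdriver

browser = webdriver.Firefox()
try:
    pass  # Your code
except Exception as e:
    print(f"Scraper error: {e}")  # Log or print error
finally:
    # Runs whether or not the try block raised
    browser.close()
    browser.quit()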
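If the driver is used in many functions, the same try/finally pattern can be wrapped once in a context manager. The following is only a sketch: `managed_browser` and its `factory` parameter are hypothetical names, not part of Selenium, and it relies only on the driver object having a `quit()` method.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a driver via factory() and always quit it on exit.

    `managed_browser` and `factory` are illustrative names, not a Selenium API.
    """
    browser = factory()
    try:
        yield browser
    finally:
        # Runs on normal exit and on any exception raised inside the with-block
        browser.quit()


# Assumed usage with Selenium installed:
#
#     from selenium import webdriver
#     with managed_browser(webdriver.Firefox) as browser:
#         browser.get("https://example.com")  # placeholder URL
```

Exceptions still propagate out of the `with` block, so the calling code can log them as before; the only difference is that the browser is quit first.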

And if you still face the same issue, you can force kill the driver as per this answer, or this answer for Ubuntu.

import os
# Windows only: taskkill is not available on Ubuntu
os.system("taskkill /im geckodriver.exe /f")
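On Ubuntu, a rough stopgap for the "kill everything older than 10 minutes" idea could look like the sketch below. It assumes a procps `ps` that supports the `etimes` (elapsed seconds) column, and that matching the substrings "firefox" and "geckodriver" in the command line is specific enough on the VPS; both function names are hypothetical. Note that it would also terminate any other long-running Firefox on the machine.

```python
import os
import signal
import subprocess

PATTERNS = ("firefox", "geckodriver")  # assumed substrings of the leaked processes
MAX_AGE_SECONDS = 10 * 60              # "older than 10 minutes"


def find_stale_pids(ps_output, patterns=PATTERNS, max_age=MAX_AGE_SECONDS):
    """Parse 'pid etimes args' lines and return PIDs that match and are too old."""
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid_text, age_text, command = parts
        if not (pid_text.isdigit() and age_text.isdigit()):
            continue  # skips a header line, if one is present
        if int(age_text) > max_age and any(p in command for p in patterns):
            stale.append(int(pid_text))
    return stale


def kill_stale_processes():
    """Send SIGTERM to every matching process older than MAX_AGE_SECONDS."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    for pid in find_stale_pids(result.stdout):
        if pid == os.getpid():
            continue  # never signal this script itself
        try:
            os.kill(pid, signal.SIGTERM)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or owned by another user
```

A small script that calls `kill_stale_processes()` could then be scheduled from cron alongside the scraper. This only treats the symptom; the try/finally fix above is still what stops the processes from leaking.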
Naveen
  • Thanks for your effort, but the OP's question is specific to Ubuntu and, AFAIK, `taskkill` does not work on Ubuntu. Check my comment on the original question. – supputuri May 06 '20 at 04:23
  • Thanks for that, I'll try to incorporate this try/except/finally structure, and check those links out. – aforbes May 07 '20 at 05:12
  • Good answer, but there's no need to call `browser.close()`. `browser.quit()` already closes all browser windows. – Zuku Aug 14 '21 at 22:44