0

I'm interested in keeping an eye on which advertising networks are running on a variety of websites. The Ghostery browser plugin does a great job of showing me which ad networks are used on any website. For example, on StackOverflow, Ghostery says we're being monitored by DoubleClick, Google Analytics, Quantcast, and ScoreCard.

On a weekly basis, I'd like to use Selenium to automatically browse a few hundred websites and save the Ghostery data associated with each of them. Using the Python bindings for Selenium, I wrote out some rough pseudocode:

from selenium import webdriver

# driver.get() needs a full URL, including the scheme
urls = ['https://www.stackoverflow.com', 'https://www.amazon.com', ...]
driver = webdriver.Firefox()
for url in urls:
    driver.get(url)
    # now, how do I access Ghostery's analysis of this URL?

I suppose the broader question is "from Selenium, how do I connect to other browser plugins?"


For fun, I posted an example of what Ghostery's UI looks like (which I'd like to access programmatically):

[Image: screenshot of Ghostery's UI]

solvingPuzzles

3 Answers

1

Selenium is used to access and interact with a browser's DOM. Selenium is not able to access a browser's controls; it is a completely inappropriate tool for what you want to accomplish.

SiKing
  • Ah, I hadn't thought of that--thanks for your help. Perhaps another browser automation system could parse this data? (Or, I could just write my own code to look through the DOM for trackers.) – solvingPuzzles Jul 05 '14 at 00:47
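
The do-it-yourself route mentioned in the comment above could look roughly like the sketch below. It does not talk to Ghostery at all; it only matches the hosts of embedded resources against a small lookup table. The names `TRACKER_DOMAINS`, `collect_resource_urls`, and `find_trackers` are hypothetical, and the four-entry table is an illustrative assumption, not Ghostery's database.

```python
from urllib.parse import urlparse

# Hypothetical, hand-maintained table (an assumption for illustration);
# a real tracker database would be far larger.
TRACKER_DOMAINS = {
    "doubleclick.net": "DoubleClick",
    "google-analytics.com": "Google Analytics",
    "quantserve.com": "Quantcast",
    "scorecardresearch.com": "ScoreCard",
}

# JavaScript run in the page: gather the resolved src of embedded resources.
_COLLECT_JS = (
    "return Array.from("
    "document.querySelectorAll('script[src], iframe[src], img[src]')"
    ").map(function (e) { return e.src; });"
)


def collect_resource_urls(driver):
    """Return the src URLs of scripts, iframes, and images in the current page."""
    return driver.execute_script(_COLLECT_JS)


def find_trackers(urls, table=TRACKER_DOMAINS):
    """Return the sorted names of trackers whose domain matches a URL's host."""
    found = set()
    for url in urls:
        host = urlparse(url).hostname or ""
        for domain, name in table.items():
            # Match the domain itself or any subdomain of it.
            if host == domain or host.endswith("." + domain):
                found.add(name)
    return sorted(found)
```

Inside the loop from the question, this would be called as `find_trackers(collect_resource_urls(driver))` right after `driver.get(url)`. It only sees elements present in the DOM at the moment of the scan, so trackers that load later or send beacons without leaving an element behind would be missed.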
1

In general, it's not possible for Selenium to access extensions directly. If you want to do that, you will have to build a bridge.

For Ghostery specifically, what you are looking for exists as an open source project here: https://github.com/ghostery/areweprivateyet

fixanoid
0

There appears to be a limited Ghostery API described at https://purplebox.ghostery.com/post/1016023438#more-1016023438

msw