I have a method that scrapes a web page and saves the data into a file (see the sample code below). I need to test that the resulting data is well-formed.
The problem is that the data comes from a series of calls, and later calls use the results of earlier ones. Worse, many of the calls are made on the same objects (a WebDriver, a WebDriverWait and the expected_conditions module) with different arguments.
I see that unittest.mock.Mock can mock the result of a simple call, or of a series of simple calls, but I can't see how to handle something as entangled as this. The only way I can see is to manually reimplement every call the method makes and copy the arguments the method passes into those implementations, so that they know what to return for each call, and then to do that again for every other test case. This sounds like an absolute nightmare to write and maintain: several times more code than the tests themselves, and a near 1:1 duplication of the code under test. So I refuse to proceed until someone either shows me a better way or proves that there is none and that everyone really does it like this (which I don't believe), rewriting all the tests each time, say, a label on the page changes (an implementation detail that normally shouldn't affect test code at all).
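To make the "simple call" and "series of simple calls" cases concrete, here is a minimal sketch of what I understand Mock to handle out of the box. The return values and the placeholder condition strings are made up for illustration, and it is written for Python 3's unittest.mock rather than the Python 2 of the sample code. Note that side_effect hands out results purely by call order, not by argument, which is exactly why it doesn't seem to scale to my case.

```python
from unittest.mock import Mock

# Simple call: a fixed result for a chained lookup, whatever the arguments.
driver = Mock()
driver.find_element_by_xpath.return_value.get_attribute.return_value = (
    "http://example.com/more"  # made-up value for illustration
)
link = driver.find_element_by_xpath("//a").get_attribute("href")

# Series of simple calls on the same object: side_effect returns the
# listed results in call order, regardless of what is passed in.
wait = Mock()
heading = Mock(text="Example Domain")  # stand-in for the <h1> element
wait.until.side_effect = [Mock(), heading]

wait.until("page-loaded condition")            # first call: page load wait
org_name = wait.until("heading condition").text  # second call: the heading
```

Reordering the calls in the method, or adding one in between, silently shifts which result each call gets, so every test has to mirror the method's call sequence.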
Sample code (adapted for http://example.com):
import codecs
import os

import selenium.webdriver
from selenium.webdriver.common.by import By as by
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def dump_accreditation_data(d, w, i, path):
    f = codecs.open(os.path.join(path, "%d.txt" % i), "w", encoding="utf-8")
    u = u'http://example.com/%s/accreditation' % i
    d.get(u)
    # page load
    w.until(EC.visibility_of_element_located((by.XPATH, "//p")))  # the real code has a more complex expression here with national characters
    w.until_not(EC.visibility_of_element_located((by.CSS_SELECTOR, '.waiter')))
    print >> f, u
    # organization name
    e = w.until(EC.visibility_of_element_located((
        by.CSS_SELECTOR, 'h1'
    )))
    org_name = e.text
    print >> f, org_name
    del e
    # etc
    e = d.find_element_by_xpath(u'//a[text()="More information..."]')
    print >> f, e.get_attribute('href')

#How it's supposed to be used:
d = selenium.webdriver.Firefox()
w = WebDriverWait(d, 10)
dump_accreditation_data(d, w, 123, "<output_path>")