My code is as follows:
https://github.com/T145/tphroxy/blob/master/mirror.py
https://github.com/T145/tphroxy/blob/master/transform_content.py
When I go to certain websites, I get errors along these lines:
Traceback (most recent call last):
  File " ... /mirror.py", line 108, in fetch_and_store
    response = urlfetch.fetch(mirrored_url)
  File " ... /google/appengine/api/urlfetch.py", line 293, in fetch
    return rpc.get_result()
  File " ... /google/appengine/api/apiproxy_stub_map.py", line 613, in get_result
    return self.__get_result_hook(self)
  File " ... /python27_lib/versions/1/google/appengine/api/urlfetch.py", line 449, in _get_fetch_result
    raise DNSLookupFailedError('DNS lookup failed for URL: ' + url)
DNSLookupFailedError: DNS lookup failed for URL: http://public/images/v6/btn_arrow_down_padded_white.png
My guess is that certain asset URL patterns aren't being matched and routed through the proxy properly, i.e. transform_content is missing a pattern. Any help solving this problem is greatly appreciated! I'm open to using alternative libraries if needed.
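To illustrate what I think should happen: the failing URL has "public" where the hostname should be, which looks like a relative asset path being treated as if its first segment were a host. Below is a minimal sketch of the behaviour I'd expect, using the standard library's urljoin instead of a regex. It is not the code from transform_content.py; the helper name to_proxy_path, the example page URL, and the "/host/path" output format are all assumptions made for illustration.

```python
try:
    from urllib.parse import urljoin, urlsplit  # Python 3
except ImportError:
    from urlparse import urljoin, urlsplit  # Python 2.7 (App Engine)


def to_proxy_path(page_url, asset_ref):
    """Resolve asset_ref against page_url, then build a proxy-relative path.

    Hypothetical helper: the "/<host>/<path>" format is an assumption
    about how the proxy addresses mirrored content.
    """
    # urljoin handles absolute, root-relative, document-relative and
    # protocol-relative ("//host/...") references.
    absolute = urljoin(page_url, asset_ref)
    parts = urlsplit(absolute)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return "/" + parts.netloc + path


page = "http://www.example.com/shop/index.html"  # made-up page URL

# Root-relative reference: resolved against the page's host.
print(to_proxy_path(page, "/public/images/v6/btn_arrow_down_padded_white.png"))

# Document-relative reference: resolved against the page's directory.
print(to_proxy_path(page, "public/images/v6/btn_arrow_down_padded_white.png"))

# Protocol-relative reference: keeps its own host.
print(to_proxy_path(page, "//cdn.example.com/a.png"))
```

If something like this is viable, the regexes would only need to locate the URL references in the content, and the resolution itself could be left to urljoin.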
EDIT
I've added a test suite for transform_content, and from its results I'm certain the primary problems are with my regular expressions. Run it with py transform_content_test.py if you're on Windows to see the results.