Using subprocess to run HTTrack from python in Windows

Question

I'm writing a web scraping Python script, and one of the things I'd like it to do is take a snapshot of certain pages (all of the HTML, style sheets, and images needed to view that particular page properly offline). HTTrack seems like a good way to do that, and I thought I would be able to call it from within the Python script using

subprocess.call(["httrack", "http://www.example.com", "-O", "\tmp\example"])

But attempting to do this results in "FileNotFoundError: [WinError 2] The system cannot find the file specified". I've also tried giving it the full file path,

subprocess.call(["C:\Program Files\WinHTTrack\httrack.exe", "http://www.example.com", "-O", "\tmp\Example"])

but I get the error "SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape"

I think the problem is that I don't understand subprocess correctly, since I can get HTTrack working from the Windows command prompt. Can anyone help me understand the correct way to use subprocess?

The `"\t"` in `"\tmp\example"` doesn't jump out at you at all? As to `\U`, it seems you're using Python 3 and aren't showing us the line with a string containing `"\U"` in position 2-3, such as `"C:\Users"`. Anyway, just use [r]aw strings to avoid this problem -- except if a path ends on a backslash, in which case use a regular string and escape each backslash with another backslash, such as `"C:\\"`. — Eryk Sun, Jan 14 '16 at 06:49
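To illustrate the comment above, here is a small sketch of how Python 3 treats these literals. The variable names are made up for the demonstration, and the `compile` call is only a way to trigger the question's SyntaxError without breaking the script itself.

```python
# In a regular string, "\t" is parsed as one tab character.
plain = "\tmp"
# In a raw string, the backslash and the "t" stay as two characters.
raw = r"\tmp"
print(len(plain), len(raw))

# "\U" in a regular string starts an 8-digit Unicode escape, so a literal
# such as "C:\Users" fails to compile in Python 3.
try:
    compile('p = "C:\\Users"', "<demo>", "exec")
    unicode_error = False
except SyntaxError:
    unicode_error = True
print(unicode_error)

# A raw string keeps the path intact.
users = r"C:\Users"

# A path ending in a backslash needs a regular string with doubled backslashes.
root = "C:\\"
print(len(root))
```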

score 1 · Accepted Answer · edited May 23 '17 at 11:59


Resolved thanks to eryksun's comment. It wasn't a problem with the subprocess syntax at all, but rather that I wasn't being careful about escaping all of my backslashes. Putting `r` in front of those strings to make them raw strings fixed my code.
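For reference, the corrected call might look roughly like the sketch below. It assumes the install path and output directory from the question; `build_command` is a hypothetical helper added only for illustration, and the call is skipped when the executable is not found at that path.

```python
import os
import subprocess

# Raw strings keep every backslash literal, so "\t" in the output path
# is no longer read as a tab.
HTTRACK = r"C:\Program Files\WinHTTrack\httrack.exe"  # path taken from the question
OUTPUT_DIR = r"\tmp\example"                          # output directory from the question


def build_command(url, output_dir=OUTPUT_DIR, exe=HTTRACK):
    """Hypothetical helper: return the argument list passed to subprocess."""
    return [exe, url, "-O", output_dir]


cmd = build_command("http://www.example.com")

# Only attempt the call when the executable exists on this machine.
if os.path.isfile(HTTRACK):
    subprocess.call(cmd)
```

Passing the arguments as a list means subprocess handles the space in `Program Files` without extra quoting.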

answered Jan 14 '16 at 14:39 by Empiromancer

use raw string literals, to avoid escaping backslashes: `r'c:\U'` – jfs Jan 16 '16 at 07:29
@J.F.Sebastian Yup, that's what I did :) – Empiromancer Jan 19 '16 at 17:26
