I need to download a file, abc.zip, from a website: http://clientdownload.xyz.com/Documents/abc.zip

I wrote the following Python script to do this:

    url_to_check = 'http://clientdownload.xyz.com/Documents/abc.zip'
    username = "user"
    password = "pwd"
    p = urllib2.HTTPPasswordMgrWithDefaultRealm()
    p.add_password(None, url_to_check, username, password)
    handler = urllib2.HTTPBasicAuthHandler(p)
    opener = urllib2.build_opener(handler)
    urllib2.install_opener(opener)
    zip_file = urllib2.urlopen(url_to_check).read()       
    file_name = 'somefile.zip'
    meta = zip_file.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading: %s Bytes: %s" % (file_name, file_size)

    with open(file_name, 'wb') as dwn_file:
        dwn_file.write(zip_file.read())

However, I get the following error when I run the script:

      File "updateCheck.py", line 68, in check_update
        zip_file = urllib2.urlopen(url_to_check).read()
      File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 406, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 444, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 401: Unauthorized

I have supplied the correct username and password, but the server still returns an Unauthorized error.

When I download the file using wget with the --http-user and --ask-password options, it works.

The same script also downloads files from other servers without any problem.

I ran this script to get more info:

import urllib2, re, time, sys

theurl='http://clientdownload.xxx.com/Documents/Forms/AllItems.aspx'
req = urllib2.Request(theurl)

try:
    handle = urllib2.urlopen(req)
except IOError, e:
    if hasattr(e, 'code'):
        if e.code != 401:
            print 'We got another error'
            print e.code
        else:
            print e.headers
            print e.headers['www-authenticate']

I got the following information:

Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/7.5
SPRequestGuid: 939bad00-40b7-49b9-bbbc-99d0267a1004
X-SharePointHealthScore: 0
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
MicrosoftSharePointTeamServices: 14.0.0.6029
Date: Wed, 12 Feb 2014 13:14:19 GMT
Connection: close
Content-Length: 16

NTLM
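
The header dump above is the key diagnostic: the server only offers NTLM, so a handler that answers Basic challenges never gets a chance to send the credentials. A minimal sketch of that check follows, written for Python 3. The function names `offered_schemes` and `basic_auth_will_work` are hypothetical, and the parsing assumes one challenge per header value, which matches the response shown here but is not a full WWW-Authenticate parser.

```python
def offered_schemes(www_authenticate_values):
    """Return the lower-cased auth scheme names from WWW-Authenticate values.

    Assumption: each header value carries a single challenge, so the scheme
    is simply the first whitespace-separated token.
    """
    schemes = []
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


def basic_auth_will_work(www_authenticate_values):
    """True only if the server lists Basic among the schemes it offers."""
    return "basic" in offered_schemes(www_authenticate_values)


# The single header value returned by the SharePoint server above.
print(offered_schemes(["NTLM"]))
print(basic_auth_will_work(["NTLM"]))
```

With only `NTLM` offered, the second call reports that Basic authentication is not an option for this server.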

Vogel612
user3301805
  • If I understand correctly, you're using a basic auth handler with a NTLM authentication. Try with something [like this](http://code.google.com/p/python-ntlm/). – Laur Ivan Feb 12 '14 at 13:41
  • Yes, I already tried the NTLM auth handler, but my Python installation doesn't have the NTLM package, so I got the following error: ImportError: cannot import name HTTPNtlmAuthHandler – user3301805 Feb 12 '14 at 13:57
  • Well, you could install the package or use a [virtual environment](http://www.virtualenv.org/en/latest/). virtualenv is part of python best practices (afaik) and allows you to install custom stuff without messing your original python installation. – Laur Ivan Feb 12 '14 at 14:01
  • Did you ever find a solution to this? I am seeing similar errors: "Could not find a version that satisfies the requirement HTTPNtlmAuthHandler (from versions: ) No matching distribution found for HTTPNtlmAuthHandler" on Python 2.7.11 :: Anaconda 4.0.0 (64-bit) – YouHaveaBigEgo Jul 15 '16 at 02:08

1 Answer

Consider using requests, which makes working with HTTP easier. With requests-ntlm (a plugin for requests) installed, NTLM authentication is handled transparently:

import requests
from requests_ntlm import HttpNtlmAuth

r = requests.get("http://ntlm_protected_site.com", auth=HttpNtlmAuth('domain\\username', 'password'))

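r holds the response, including the status code (r.status_code) and the headers. For your case, r.headers.get('Content-Length') returns the size as a string, so wrap it in int() if you need a number, and r.content holds the downloaded bytes.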
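
To actually save the zip file, the response can be streamed to disk rather than read into memory at once. The following is a minimal sketch for Python 3, assuming the `requests` and `requests-ntlm` packages are installed. The helper names `save_response_body` and `download_with_ntlm` and the 64 KiB chunk size are illustrative choices, not part of either library.

```python
def save_response_body(response, file_name, chunk_size=64 * 1024):
    """Write a streamed requests-style response to file_name in chunks.

    Returns the number of bytes written. Raises if the server answered
    with an error status such as 401.
    """
    response.raise_for_status()
    written = 0
    with open(file_name, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
                written += len(chunk)
    return written


def download_with_ntlm(url, user, password, file_name):
    """Fetch url using NTLM credentials and save the body to file_name."""
    # Imported here so save_response_body stays usable without these packages.
    import requests
    from requests_ntlm import HttpNtlmAuth

    # user is expected in the 'domain\\username' form shown above.
    auth = HttpNtlmAuth(user, password)
    with requests.get(url, auth=auth, stream=True) as response:
        return save_response_body(response, file_name)
```

A call would look like `download_with_ntlm('http://clientdownload.xyz.com/Documents/abc.zip', 'domain\\user', 'pwd', 'somefile.zip')`, using the URL and placeholder credentials from the question.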

WoJ