I need to add a scheme to URLs that don't already have one. I want to use the following code:

from urllib.parse import urlparse, urlunparse

url = 'example.com'
parsed = urlparse(url)
parsed = parsed._replace(scheme='https')
new_url = urlunparse(parsed)
print(new_url)

Instead of this:

https://example.com

the script is returning this:

https:///example.com

which throws an error if I try to fetch the URL like so:

requests.get('https:///example.com')

Why is this happening and what can I do about it?

I am using:

Windows 10
Python 3.6.1
Anaconda 4.4.0
Brian

1 Answer

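The initial string is parsed as a path component, not a host, because there is no scheme (or leading //) to indicate that it is a host. With netloc empty, setting the scheme and calling urlunparse on the Python 3.6 from the question writes out an empty host followed by the path, which is where https:///example.com comes from: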

urlparse('example.com')
# ParseResult(scheme='', netloc='', path='example.com', params='', query='', fragment='')

Perhaps add a scheme to make this explicit:

urlparse('http://example.com')
# ParseResult(scheme='http', netloc='example.com', path='', params='', query='', fragment='')
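
If the URLs come in with or without a scheme, one option is to check the parse result first and only prepend a scheme when it is missing. The following is a minimal sketch, not a definitive implementation: ensure_scheme and its default_scheme parameter are hypothetical names made up for illustration, and it assumes every input is meant to be a web address (something like mailto:someone@example.com would also get a scheme prepended).

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme="https"):
    """Return url unchanged if it has both a scheme and a host,
    otherwise prepend default_scheme (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme and parsed.netloc:
        # Already looks like scheme://host/..., leave it alone.
        return url
    # Drop any leading slashes (e.g. '//example.com') before prepending,
    # so the result never ends up with extra slashes after the scheme.
    return f"{default_scheme}://{url.lstrip('/')}"


print(ensure_scheme("example.com"))         # https://example.com
print(ensure_scheme("http://example.com"))  # http://example.com
```

Prepending to the string before any parsing avoids relying on how urlunparse treats an empty netloc. The result can then be passed to urlparse or requests.get as usual.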
jspcal