15

What is the standard practice in Python when a command-line application takes a single argument that is either

a URL to a web page

or

a path to an HTML file somewhere on disk

(only one of the two)?

Is the following code sufficient?

if "http://" in sys.argv[1]:
  print "URL"
else:
  print "path to file"
Constantinius
xralf

3 Answers

22
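import sys
import urlparse  # Python 2 module; in Python 3 this lives in urllib.parse

def is_url(url):
    # A non-empty scheme (e.g. "http") marks the argument as a URL
    return urlparse.urlparse(url).scheme != ""

is_url(sys.argv[1])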
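A minimal Python 3 sketch of the same check, assuming `urllib.parse` in place of the Python 2 `urlparse` module. The helper name `looks_like_url` and the set of accepted schemes are assumptions for illustration, not part of the answer. Restricting the check to known schemes avoids treating a Windows path such as `C:\page.html`, whose drive letter parses as the scheme `c`, as a URL.

```python
from urllib.parse import urlparse

# Assumed set of schemes the application is willing to treat as URLs.
URL_SCHEMES = {"http", "https", "ftp"}

def looks_like_url(arg):
    """Return True if arg has a recognised URL scheme and a network location."""
    parts = urlparse(arg)
    return parts.scheme in URL_SCHEMES and bool(parts.netloc)
```

Anything for which `looks_like_url` returns False would then be handled as a file path.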
jassinm
3

Depends on what the program must do. If it just prints whether it got a URL, sys.argv[1].startswith('http://') might do. If you must actually use the URL for something useful, do

import sys
from urllib2 import urlopen

try:
    f = urlopen(sys.argv[1])
except ValueError:  # not a URL (no scheme), so treat it as a file path
    f = open(sys.argv[1])
Fred Foo
  • `open()` can throw an exception as well. – rplnt Oct 21 '11 at 13:33
  • Don't forget `except IndexError:` as the user might not specify an argument, which will throw an index error. Or am I wrong? – Griffin Oct 21 '11 at 13:55
  • @Griffin: I've considered that a separate problem for the purpose of this answer. – Fred Foo Oct 21 '11 at 15:04
  • @rplnt: yes, and the OP might or might not want to check for `IOError`. I'm just showing how `urlopen` and `open` may be combined, not how to tackle the larger problem. This snippet is enough for writing a generic `open_url_or_file` function that simply re-raises what it gets from `open`. – Fred Foo Oct 21 '11 at 15:05
  • @larsmans That may be, but from the looks of it the OP doesn't know how to use exception handlers. I don't see any reason not to include it since it won't work if an argument isn't specified. – Griffin Oct 21 '11 at 15:06
  • @FredFoo's implementation is the most correct exception handling. Only handle the exceptions you know how to handle, otherwise let the caller handle exceptions. In this case, if there's a file open or read or permissions error, etc. Let the caller know rather than catching and hiding the exception – xaviersjs Aug 27 '18 at 19:09
  • Note that if the argument is a URL that returns a 404 error, the code slows down. – Chris P May 02 '20 at 09:05
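The generic `open_url_or_file` function mentioned in the comments could be sketched as follows. This is a Python 3 version, assuming `urllib.request.urlopen` in place of `urllib2.urlopen`. Opening the file in binary mode is an assumption made here so that both branches return an object that yields bytes.

```python
from urllib.request import urlopen

def open_url_or_file(arg):
    """Open arg as a URL if it has a scheme, otherwise as a local file."""
    try:
        return urlopen(arg)
    except ValueError:  # urlopen rejects strings that have no URL scheme
        # Errors raised by open() (missing file, permissions) propagate to the caller.
        return open(arg, "rb")
```

Only the "not a URL" case is handled. HTTP failures such as a 404 surface as `urllib.error.HTTPError`, which is not a `ValueError`, so they reach the caller together with any error from `open`.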
1

Larsmans' answer might work, but it doesn't check whether the user actually specified an argument.

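import sys
import urllib2  # urllib.urlopen opens schemeless paths as local files and never raises ValueError

try:
    arg = sys.argv[1]
except IndexError:
    print "Usage: " + sys.argv[0] + " file/URL"
    sys.exit(1)

try:
    f = urllib2.urlopen(arg)
except ValueError:  # no URL scheme, so treat the argument as a file path
    f = open(arg)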
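As an alternative to checking `sys.argv` by hand, the standard library's `argparse` can enforce the required argument and print the usage message itself. This is a minimal Python 3 sketch; the function name `parse_args`, the attribute name `source`, and the description text are assumptions for illustration.

```python
import argparse

def parse_args(argv=None):
    """Parse the single required positional argument (a file path or a URL)."""
    parser = argparse.ArgumentParser(
        description="Read a page from a URL or a local HTML file.")
    parser.add_argument("source", metavar="file/URL",
                        help="URL of a web page or path to an HTML file")
    # With argv=None, argparse reads the arguments from sys.argv[1:].
    return parser.parse_args(argv)
```

If the argument is missing, `argparse` prints a usage line to stderr and exits with status 2, so no explicit `IndexError` handling is needed.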
Griffin