6

How can I easily extract the hostname from a git URL like ssh://git@gitlab.org.net:3333/org/repo.git?

u = urlparse(s)

gives me

ParseResult(scheme='ssh', netloc='git@gitlab.org.net:3333', path='/org/repo.git', params='', query='', fragment='')

which means that netloc is the closest to what I want, and that still leaves a disappointing amount of work to me.

Should I do

u.netloc.split('@')[1].split(':')[0]

or is there a library that handles it better?

d33tah

2 Answers

9

The returned ParseResult has a hostname attribute:

>>> urlparse('ssh://git@gitlab.org.net:3333/org/repo.git').hostname
'gitlab.org.net'
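
As a small sketch building on the same ParseResult (Python 3 import shown; in Python 2 the function lives in the urlparse module), the sibling attributes username and port pull out the other parts of netloc, so no manual splitting should be needed. The variable name u is just the one from the question.

```python
from urllib.parse import urlparse

# The URL from the question.
u = urlparse('ssh://git@gitlab.org.net:3333/org/repo.git')

host = u.hostname   # host part of netloc, lowercased
user = u.username   # part before '@', or None if absent
port = u.port       # port as an int, or None if absent

print(host, user, port)
```

Note that hostname is always returned in lower case, which is usually what you want when comparing hosts.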
idjaw
Mureinik
  • For some reason, the docs for ParseResult do not contain information about ParseResult.hostname. However, https://docs.python.org/2.7/library/urlparse.html#module-urlparse does. – cowlinator Oct 20 '17 at 20:50
1

The standard library's urlparse fails to parse many valid git URLs, such as the scp-like form user@host:path, which has no scheme:

>>> from urllib.parse import urlparse
>>> urlparse('git@github.com:Org/Private-repo.git')
ParseResult(scheme='', netloc='', path='git@github.com:Org/Private-repo.git', params='', query='', fragment='')
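
If you would rather stay with the standard library, one possible workaround is to try urlparse first and fall back to a regular expression for the scp-like form. This is only a rough sketch: git_hostname and the pattern below are hypothetical names of my own, and the regex is an assumption that covers the common [user@]host:path shape, not every URL git accepts.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for scp-like git URLs: optional "user@", then "host:", then a path
# that does not start with "//" (so scheme-style URLs are not matched here).
_SCP_LIKE = re.compile(r'^(?:[^@/\s]+@)?(?P<host>[^:/\s]+):(?!//).+$')


def git_hostname(url):
    """Return the hostname of a git URL, or None if it cannot be determined."""
    parsed = urlparse(url)
    # Scheme-style URLs (ssh://, https://, git://) are handled by urlparse.
    if parsed.scheme and parsed.hostname:
        return parsed.hostname
    # Otherwise try the scp-like form, e.g. git@github.com:Org/Private-repo.git
    match = _SCP_LIKE.match(url)
    return match.group('host') if match else None


print(git_hostname('ssh://git@gitlab.org.net:3333/org/repo.git'))
print(git_hostname('git@github.com:Org/Private-repo.git'))
```

A plain local path such as /srv/git/repo.git yields None here, since it has no host at all.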

https://pypi.python.org/pypi/git-url-parse is a fairly good parser of git URLs with a similar interface to urlparse.

>>> import giturlparse
>>> url = giturlparse.parse('ssh://git@gitlab.com:3333/org/repo.git')
>>> url
Parsed(pathname='/org/repo.git', protocols=['ssh'], protocol='ssh', href='ssh://git@gitlab.com:3333/org/repo.git', resource='gitlab.com', user='git', port='3333', name='repo', owner='org')
>>> url.resource
'gitlab.com'

https://pypi.org/project/giturlparse/ is another one, which is more recently updated, and uses a similar API.

Note that both of those PyPI packages install to the directory giturlparse, so they conflict with each other, but because they have a similar API they are almost interchangeable.

John Vandenberg