Questions tagged [urlparse]

urlparse parses a URL into its components: addressing scheme, network location, path, parameters, query, and fragment.

urlparse is a module in Python 2.7; it was renamed to urllib.parse in Python 3.
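
As a rough illustration of the components named above, here is a minimal Python 3 sketch. The URL is a made-up example, not taken from any question on this page.

```python
from urllib.parse import urlparse

# Hypothetical URL, used only for illustration.
url = "https://user@example.com:8080/docs/index.html?lang=en#intro"
parts = urlparse(url)

print(parts.scheme)    # addressing scheme
print(parts.netloc)    # network location (user info, host, port)
print(parts.path)      # hierarchical path
print(parts.query)     # query string without the leading "?"
print(parts.fragment)  # fragment without the leading "#"
print(parts.hostname, parts.port)  # convenience attributes derived from netloc
```

On Python 2.7 the same function is imported with `from urlparse import urlparse`.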

Links:

urlparse

urllib.parse

196 questions
1
vote
1 answer

Obfuscate password in url

I want to obscure a password in a URL for logging purposes. I was hoping to use urlparse, by parsing, replacing password with dummy password, and unparsing, but this is giving me: >>> from urllib.parse import urlparse >>> parts =…
blueFast
  • 41,341
  • 63
  • 198
  • 344
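
One possible approach to the question above, sketched under assumptions: the result of `urlparse` is a named tuple, so `_replace` works on its six fields (such as `netloc`) rather than on derived attributes like `password`. The helper name `mask_password` and the placeholder string are hypothetical.

```python
from urllib.parse import urlparse, urlunparse

def mask_password(url, placeholder="*****"):
    """Return url with any password in the user-info part replaced (hypothetical helper)."""
    parts = urlparse(url)
    # Split "user:password@host:port" into user info and host/port.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user, sep, _password = userinfo.partition(":")
    if not sep:
        # No "user:password" section, so there is nothing to mask.
        return url
    masked = parts._replace(netloc=f"{user}:{placeholder}@{hostport}")
    return urlunparse(masked)

# Hypothetical credentials, for illustration only.
print(mask_password("https://alice:s3cret@example.com:8443/path?x=1"))
```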
1
vote
0 answers

python's urlparse.urljoin removes path when concatenating

The documentation of urlparse.urljoin mentions that: "Informally, this uses components of the base URL, in particular the addressing scheme, the network location and (part of) the path" Thus I expect that: urljoin('http://localhost:3000/foo',…
stelios
  • 2,679
  • 5
  • 31
  • 41
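
The "(part of) the path" wording quoted above can be seen in a small sketch. The second arguments here are assumptions chosen for illustration, since the excerpt is cut off before the asker's own arguments appear.

```python
from urllib.parse import urljoin

base_no_slash = "http://localhost:3000/foo"
base_slash = "http://localhost:3000/foo/"

# Without a trailing slash, the last path segment of the base is replaced.
print(urljoin(base_no_slash, "bar"))  # http://localhost:3000/bar

# With a trailing slash, the relative part is appended under the base path.
print(urljoin(base_slash, "bar"))     # http://localhost:3000/foo/bar

# A relative reference starting with "/" replaces the whole base path.
print(urljoin(base_slash, "/bar"))    # http://localhost:3000/bar
```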
1
vote
1 answer

How to extract a headline from a URL?

I have a dataset of headlines, such…
ℕʘʘḆḽḘ
  • 18,566
  • 34
  • 128
  • 235
1
vote
2 answers

Split list returned from urlparse in python

I am able to parse a URL using urlsplit and get the parameters using the query attribute. The URL is '/api/v1/test?par1=val1&par2=val2a%3D1%26val2b%3Dfoo%26val2c%3Dbar' After using urlsplit and query I get 'par1=val1&par2=val2a%3D1%26val2b%3Dfoo%26val2c%3Dbar' And…
user2661518
  • 2,677
  • 9
  • 42
  • 79
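
The excerpt above is truncated, so the goal is not fully known. Assuming the aim is to recover both the outer parameters and the percent-encoded parameters nested inside `par2`, a minimal sketch with `parse_qs` could look like this:

```python
from urllib.parse import urlsplit, parse_qs

url = "/api/v1/test?par1=val1&par2=val2a%3D1%26val2b%3Dfoo%26val2c%3Dbar"

query = urlsplit(url).query

# parse_qs splits on "&" and percent-decodes each value;
# every key maps to a list because a key may repeat.
outer = parse_qs(query)
print(outer)

# The decoded par2 value is itself a query string, so it can be parsed again.
inner = parse_qs(outer["par2"][0])
print(inner)
```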
1
vote
1 answer

Unicode representation of an object back into an object (in python)

FYI - this program uses Django, but I am NOT tagging it as such because it is not a Django problem. The Django code is here for context. ~~The Background~~ I uncovered a bug that I had in a program. In short, I am using urlparse.urlparse to get…
Adam Hopkins
  • 6,837
  • 6
  • 32
  • 52
1
vote
5 answers

Split the title part of the URL into a separate column - Python

Suppose I have a URL as follows: http://sitename.com/pathname?title=moviename&url=VIDEO_URL I want to parse this URL to get the title part and url part alone separately. I tried the following, from urlparse import urlparse q =…
haimen
  • 1,985
  • 7
  • 30
  • 53
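
For the URL shown in the excerpt above, one way to pull out the two values is `parse_qs` on the query component. This is a sketch in Python 3 (the excerpt uses the Python 2 import), and the variable names are illustrative.

```python
from urllib.parse import urlparse, parse_qs

url = "http://sitename.com/pathname?title=moviename&url=VIDEO_URL"

params = parse_qs(urlparse(url).query)

# Each value is a list; take the first item, or None if the key is absent.
title = params.get("title", [None])[0]
video_url = params.get("url", [None])[0]

print(title)
print(video_url)
```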
1
vote
4 answers

Get Host name from given URL

How do I get the host name from the examples below? I/P: https://stackoverflow.com/users/login | O/P: stackoverflow.com I/P: stackoverflow.com/users/login | O/P: stackoverflow.com I/P: /users/login | O/P: (return empty string) I checked parse_url function,…
San Ka Ran
  • 156
  • 2
  • 18
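
The excerpt above mentions a parse_url function, so the original question may target another language. As a Python sketch of the same three cases, assuming a scheme-less input that does not start with "/" should be treated as beginning with a host (`host_of` is a hypothetical helper name):

```python
from urllib.parse import urlsplit

def host_of(url):
    """Return the host name, or an empty string when none is present (hypothetical helper)."""
    # Assumption: input without "://" and without a leading "/" starts with a host.
    # Prefixing "//" makes urlsplit treat that leading part as the network location.
    if "://" not in url and not url.startswith("/"):
        url = "//" + url
    return urlsplit(url).hostname or ""

for candidate in (
    "https://stackoverflow.com/users/login",
    "stackoverflow.com/users/login",
    "/users/login",
):
    print(repr(host_of(candidate)))
```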
1
vote
3 answers

Extracting URL parameters into Pandas DataFrame

There is a list containing URL addresses with parameters: http://example.com/?param1=apple&param2=tomato&param3=carrot http://sample.com/?param1=banana&param3=potato&param4=berry http://example.org/?param2=apple&param3=tomato&param4=carrot Each URL…
chilliq
  • 1,212
  • 3
  • 13
  • 32
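
A standard-library sketch of the first step for the question above: turning each URL into a dictionary of its parameters. It assumes the parameter separators in the three listed URLs are `&param…`. The resulting list of dictionaries is a shape that the pandas `DataFrame` constructor accepts, with missing keys becoming empty cells.

```python
from urllib.parse import urlsplit, parse_qs

# URLs from the excerpt, with "&" assumed as the parameter separator.
urls = [
    "http://example.com/?param1=apple&param2=tomato&param3=carrot",
    "http://sample.com/?param1=banana&param3=potato&param4=berry",
    "http://example.org/?param2=apple&param3=tomato&param4=carrot",
]

rows = []
for u in urls:
    params = parse_qs(urlsplit(u).query)
    # Keep the first value of each parameter.
    rows.append({key: values[0] for key, values in params.items()})

# Union of all parameter names, used as column headers.
columns = sorted({key for row in rows for key in row})

# One list per URL; None marks a parameter the URL does not carry.
table = [[row.get(col) for col in columns] for row in rows]

print(columns)
for line in table:
    print(line)
```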
1
vote
0 answers

How to get URL from Crawled instead of Scraped from in Portia spider deployment?

I am deploying a Portia spider in scrapyd. While deploying, I am passing URLs for every link parsing. Example: the URL (say URL_1) crawled by the spider is http://www.example.com/query1 and the URL (say URL_2) I am passing is…
Prabhakar
  • 1,138
  • 2
  • 14
  • 30
1
vote
1 answer

Convert tuple to string after parsing html file

I need to save parsing results in a text file. import urllib from bs4 import BeautifulSoup import urlparse path = 'A html file saved on desktop' f = open(path,"r") if f.mode == 'r': contents = f.read() soup =…
Sam Lowry
  • 21
  • 3
1
vote
1 answer

Parsing a URL link for a tag from a list of URL links parsed from a saved HTML file, and saving it all in a CSV output

How can I make a smooth transition from Part 1 to Part 2 and save the results in Part 3? So far, I have not been able to parse a scraped URL link unless I inserted it into Part 2 myself. Besides, I could not save the output results as the last…
Sam Lowry
  • 21
  • 3
1
vote
1 answer

Regular expression for filtering a URL with query strings / parameters in Python

I have code that loops through a list of URLs to do some operations, but each entered URL must contain a query string. I want to check first that the URL is correct and does in fact contain query strings. I searched, and most of the regular…
1
vote
1 answer

find network location from URL elegantly

Code: import urlparse url1 = 'http://try.github.io//levels/1/challenges/1' netloc1 = urlparse.urlparse(url1)[1] #try.github.io url2 = 'https://github.com/explore' netloc2 = urlparse.urlparse(url2)[1] #github.com netloc2 is what I want; however, I hope…
liuzhijun
  • 4,329
  • 3
  • 23
  • 27
1
vote
3 answers

Fetch a particular part of the url in python

I am using Python and trying to fetch a particular part of the URL, as below: from urlparse import urlparse as ue url = "https://www.google.co.in" img_url = ue(url).hostname Result: www.google.co.in Case 1: Actually I will have a number of…
Shiva Krishna Bavandla
  • 25,548
  • 75
  • 193
  • 313
1
vote
2 answers

Odd behavior with urlparse

I was wondering if there are known workarounds to some odd behavior I'm seeing with Python's urlparse. Here are some results from a couple of lines in the Python interpreter: >>> import urlparse >>>…
greenhat
  • 1,061
  • 1
  • 12
  • 19