I'm not exactly sure what these modules are used for. I get that they split the respective url into its components, but why would that be useful, or what is an example of when to use urlparse?
-
This question seems too broad to me. – Panther May 07 '15 at 03:15
-
Why would it be useful to split a url into its components? – Aran Freel May 07 '15 at 03:16
-
It will depend on what you are trying to do. – Panther May 07 '15 at 03:17
-
Well then what is the most common use of urlparse? – Aran Freel May 07 '15 at 03:21
-
Eh. Closing as too broad. It's only slightly more focused than "What would you use string concatenation for?". – Ignacio Vazquez-Abrams May 07 '15 at 03:22
-
For getting query parameters, the hostname, etc. Yes @IgnacioVazquez-Abrams, it seems too broad. OP needs to read a tutorial and learn instead of posting the question here. – Panther May 07 '15 at 03:25
-
Why would you NOT want to split a URL into its components? – Lennart Regebro May 07 '15 at 06:43
1 Answer
Use urlparse only if you need the params component. I explain below what parameters are for. Otherwise, the documentation recommends urlsplit:
urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)
This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC 2396) is wanted.
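As a rough illustration of the difference described above, here is a minimal sketch. The URL and its `;type=a` parameter are made up for the example; only `urlparse` and `urlsplit` come from the documentation quoted above.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL: the last path segment carries a ";type=a" parameter.
url = "http://www.example.com/products/women;type=a?color=green#top"

parsed = urlparse(url)  # 6 fields: scheme, netloc, path, params, query, fragment
split = urlsplit(url)   # 5 fields: scheme, netloc, path, query, fragment

print(parsed.path)    # /products/women
print(parsed.params)  # type=a
print(split.path)     # /products/women;type=a  (params stay inside the path)
print(split.query)    # color=green
```

Note that urlparse only splits off parameters attached to the last path segment; a semicolon in an earlier segment stays in the path with either function.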
The hostname is always worth storing in a variable so you can reuse it later, for example to attach a path, parameters, or a query to it and build the URL of the page you want while scraping.
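For instance, pulling out the hostname and the query might look like the sketch below. The URL, port, and `size` parameter are assumptions chosen for illustration, not something from a real site.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL used only for illustration.
url = "http://www.example.com:8080/products/women?color=green&size=m"

parts = urlsplit(url)
host = parts.hostname          # "www.example.com" (lowercased, without the port)
port = parts.port              # 8080 as an int, or None if the URL has no port
query = parse_qs(parts.query)  # {"color": ["green"], "size": ["m"]}

# Keep scheme + host around so other pages on the same site can be requested later.
base = f"{parts.scheme}://{parts.netloc}"
print(host, port, query, base)
```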
Regarding parameters:
FYI, RFC 2396 says the following about parameters in URLs:
Extensive testing of current client applications demonstrated that the majority of deployed systems do not use the ";" character to indicate trailing parameter information, and that the presence of a semicolon in a path segment does not affect the relative parsing of that segment. Therefore, parameters have been removed as a separate component and may now appear in any path segment. Their influence has been removed from the algorithm for resolving a relative URI reference.
Splitting the URL is useful in scraping, e.g. if the URL is http://www.example.com/products/women?color=green. When you use urlparse, you get the path (/products/women) and the query (color=green) as separate components. Now you can change the last path segment to men, so the URL becomes http://www.example.com/products/men?color=green, and do the same for kids, girl, boy, and so on.
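One way to do that swap is sketched below. In the example URL, urlparse reports `women` as part of the path (the `params` field is empty), so the sketch replaces the last path segment and reassembles the URL with `urlunparse`. The helper name `with_last_segment` is hypothetical, introduced only for this example.

```python
from urllib.parse import urlparse, urlunparse

url = "http://www.example.com/products/women?color=green"
parsed = urlparse(url)

def with_last_segment(parsed, segment):
    """Return the URL with its last path segment replaced (hypothetical helper)."""
    base_path = parsed.path.rsplit("/", 1)[0]  # "/products"
    # ParseResult is a namedtuple, so _replace returns a modified copy.
    return urlunparse(parsed._replace(path=f"{base_path}/{segment}"))

urls = [with_last_segment(parsed, s) for s in ("men", "kids", "girl", "boy")]
for u in urls:
    print(u)
```

The query string (`color=green`) is carried over unchanged because only the `path` field is replaced.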

-
I read over the documentation, but why would getting the host name or the params of the url be useful? What are you going to use these for? – Aran Freel May 07 '15 at 03:36
-
After parsing the url how would you go about switching the parameters to create a new url? – Aran Freel May 07 '15 at 14:36
-
url = 'http://www.example.com/products/'; param = 'men'; url + param – Vaibhav Mule May 07 '15 at 16:44
-
@AranFreel I would suggest you try it yourself. If you still have difficulty after several attempts, ask a fresh question on Stack Overflow. – Vaibhav Mule May 07 '15 at 16:54
-
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/77267/discussion-between-vaibhav-mule-and-aran-freel). – Vaibhav Mule May 08 '15 at 03:21