The quote function in urllib.parse can be used to encode URL components, but its behavior differs from the standard JavaScript encoder, encodeURIComponent.

In Python:

>>> import urllib.parse
>>> urllib.parse.quote('(a+b)')
'%28a%2Bb%29'

In JavaScript:

>>> encodeURIComponent('(a+b)')
"(a%2Bb)"

Why is the Python function more "strict" when encoding the URL component?

If I understand it correctly, parentheses are not reserved characters in URLs, so I don't see why urllib.parse.quote escapes them.

yellowcap

1 Answer

As of RFC 3986, parentheses are reserved: "(" and ")" belong to the sub-delims set of reserved characters.

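By default, quote() percent-encodes every character except ASCII letters, digits, and _.-~ (the ~ only since Python 3.7), plus /, which is the default value of the safe parameter. However, quote() is tunable. If you want strict RFC 3986 behavior, where / is encoded too and ~ is left alone on every Python version, set safe to '~':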

urllib.parse.quote(string, safe='~')
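To make the effect of safe concrete, here is a small sketch comparing the default call with safe='~'. The sample string is made up for illustration and is not from the question.

```python
from urllib.parse import quote

# Hypothetical input chosen to contain '/', parentheses, '+' and '~'.
sample = "/path/(a+b)~c"

# Default: safe='/' so slashes survive, parentheses and '+' are encoded.
default_quoted = quote(sample)

# safe='~': slashes are now encoded as well, '~' is kept literally.
strict_quoted = quote(sample, safe="~")

print(default_quoted)
print(strict_quoted)  # %2Fpath%2F%28a%2Bb%29~c
```

On Python versions before 3.7 the default call would also turn ~ into %7E, which is why passing safe='~' explicitly gives the same result everywhere.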

If you only want to match the JavaScript behavior you showed (you didn't say which edition of the ECMAScript standard your platform conforms to), mark the parentheses as safe:

urllib.parse.quote(string, safe='()')
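If a closer match is wanted, one possible sketch is shown here. It assumes the set of characters that encodeURIComponent leaves unescaped is letters, digits, and - _ . ! ~ * ' ( ). The helper name encode_uri_component is made up for this example and is not part of urllib.

```python
from urllib.parse import quote

# Punctuation that encodeURIComponent leaves alone: - _ . ! ~ * ' ( )
# quote() never encodes letters, digits, '_', '.', or '-', so only the
# remaining characters need to be listed as safe. '/' is deliberately
# absent, so it is encoded, as in JavaScript.
JS_UNESCAPED_EXTRA = "!~*'()"


def encode_uri_component(text: str) -> str:
    """Hypothetical helper approximating JavaScript's encodeURIComponent."""
    return quote(text, safe=JS_UNESCAPED_EXTRA)


print(encode_uri_component("(a+b)"))  # (a%2Bb)
print(encode_uri_component("a/b c"))  # a%2Fb%20c
```

Both functions encode non-ASCII text as UTF-8 by default. Edge cases such as lone surrogates raise different errors in the two languages, so treat this as an approximation.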
cowbert