
I have some links that include Persian texts, such as:

http://sample.com/fields/طب%20نظامی

In the view function I want to access the Persian part, so I do this:

url = request.path_info
key = re.findall('/fields/(.+)', url)[0]

But I get the following error:

IndexError at /fields/
list index out of range

The problem is at index zero: the list is empty because the pattern finds nothing to match. Note that this is a Django project running on IIS; I have tested it successfully on other servers and on the local server, so I think the problem is related to IIS. I have also tried to slugify the URL, without success. I can encode urls successfully, but I think it is not the actual answer to this question.

Based on the comments: I checked request.path too, and it has the same problem. It contains:

/fields/

I set up a sample Django project on the local server. Here is my view:

from django.http import HttpResponse

def test(request):
    t = request.path
    return HttpResponse(t)

The results:

http://127.0.0.1:8000/تست/
/تست/

Without any problem.

Based on @sytech's comment, I created a middleware.py in my app directory:

from django.core.handlers.wsgi import WSGIHandler

class SimpleMiddleware(WSGIHandler):

    def __call__(self, environ, start_response):
        print(environ['UNENCODED_URL'])
        return super().__call__(environ, start_response)

and in settings.py:

MIDDLEWARE = [
    ...
    'apps.middleware.SimpleMiddleware',
]

But I am getting the following error:

__call__() missing 1 required positional argument: 'start_response'
keramat
  • Are you sure it is in `path_info`? Can you share the `path`/`url` that you used here? – Willem Van Onsem Sep 24 '21 at 14:51
  • The path_info contains /fields/. – keramat Sep 24 '21 at 15:46
  • It is not there and it is my problem. Maybe my question was not appropriate, with the phrase "it can not see it". – keramat Sep 24 '21 at 15:56
  • well likely you first visit the page with `fields/` hence the error, and only later will visit `fields/some-persian-text` – Willem Van Onsem Sep 24 '21 at 15:57
  • How is that possible? It is just a link that I generate and nothing more. Also, please note that the same approach works on the local server. – keramat Sep 24 '21 at 16:03
  • If you use `print(re.findall('/fields/(.+)', url))`, so without indexing, do you see that every now and then it contains data written in Persian? – Willem Van Onsem Sep 24 '21 at 16:04
  • Please note that, as I said in the first comment, neither path nor path_info contains the Persian text. That is the main problem. – keramat Sep 24 '21 at 16:09
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/237467/discussion-between-keramat-and-willem-van-onsem). – keramat Sep 24 '21 at 16:21
  • What version of IIS and what version of URLrewrite are you using? Are you using wfastcgi or something else to run the python app? I'm also assuming you're using Python3 and a modern version of Django, like 3.x or 4.x right? – sytech Aug 25 '22 at 00:58
  • IIS version is 10.0.20348.1. URL Rewrite version is 7.2.1993. Yes, I use wfastcgi. Python version is 3.8.5 and Django version is 3.2.3. – keramat Aug 25 '22 at 13:54

3 Answers


You can do this with Python's str.split() method:

url = "http://sample.com/fields/طب%20نظامی"
url_key = url.split(sep="/", maxsplit=4)  # split on the first four "/" only
url_key[-1]
# output: 'طب%20نظامی'

Here the URL is split on "/", which occurs 4 times in the string, so split() returns a list like this:

['http:', '', 'sample.com', 'fields', 'طب%20نظامی']

The last element, url_key[-1], is the part you want.
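As a variation on the split() approach above, here is a minimal sketch that lets urllib.parse find the path instead of counting slashes, and also decodes the percent-escapes. The helper name last_path_segment is hypothetical, and the sketch assumes the full URL string is available to your code; it does not address the case where IIS has already dropped the Persian part from request.path_info.

```python
from urllib.parse import unquote, urlsplit


def last_path_segment(url):
    """Hypothetical helper: return the decoded last segment of a URL's path."""
    # urlsplit separates the path from the scheme, host, and query string.
    path = urlsplit(url).path
    # Ignore a trailing slash, then keep whatever follows the last "/".
    segment = path.rstrip("/").rsplit("/", 1)[-1]
    # Turn percent-escapes such as %20 back into characters.
    return unquote(segment)


key = last_path_segment("http://sample.com/fields/طب%20نظامی")
# key is now 'طب نظامی' (the %20 has become a space)
```

Unlike a fixed maxsplit, this keeps working if the URL gains a query string or a deeper path.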

Ankit Tiwari

Assuming there is no other problem in your rewrite configuration, on IIS you may need, depending on that configuration, to read the path from the UNENCODED_URL server variable, which contains the raw URL exactly as it was requested (still percent-encoded).

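This can be demonstrated with a simple WSGI-level wrapper. Note that it subclasses WSGIHandler and stands in for the WSGI application itself; it is not a Django middleware class for the MIDDLEWARE setting: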

from django.core.handlers.wsgi import WSGIHandler

class MyHandler(WSGIHandler):
    def __call__(self, environ, start_response):
        print(environ['UNENCODED_URL'])
        return super().__call__(environ, start_response)

You would see the unencoded URL, and the Persian part of the path would appear percent-encoded as %D8%B7%D8%A8%2520%D9%86%D8%B8%D8%A7%D9%85%DB%8C. You can then decode it with urllib.parse.unquote:

import urllib.parse

urllib.parse.unquote('%D8%B7%D8%A8%2520%D9%86%D8%B8%D8%A7%D9%85%DB%8C')
# طب%20نظامی
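Note that the decoded value above still contains %20, because the space was encoded twice (%2520). If you know the value was double-encoded, one way to recover the plain text is to unquote repeatedly until nothing changes. This is only a sketch: fully_unquote and max_rounds are hypothetical names, and repeated decoding would be wrong for values that legitimately contain a literal percent sign.

```python
from urllib.parse import unquote


def fully_unquote(value, max_rounds=3):
    """Hypothetical helper: unquote repeatedly until the text stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            # Nothing left to decode.
            break
        value = decoded
    return value


text = fully_unquote('%D8%B7%D8%A8%2520%D9%86%D8%B8%D8%A7%D9%85%DB%8C')
# First round gives 'طب%20نظامی', second round gives 'طب نظامی'
```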

If you wanted, you could use a middleware to set this as an attribute on the request object or even override the request.path_info.
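A minimal sketch of such a middleware, written in Django's callable-class style (a class that takes get_response and is called with the request). The class name UnencodedUrlMiddleware and the attribute unencoded_path are hypothetical, and the sketch assumes that IIS passes UNENCODED_URL through to request.META and that the app is mounted at the site root. If the variable is missing or empty, the request is left untouched.

```python
from types import SimpleNamespace
from urllib.parse import unquote, urlsplit


class UnencodedUrlMiddleware:
    """Hypothetical middleware: rebuild the path from IIS's UNENCODED_URL."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        raw = request.META.get("UNENCODED_URL")
        if raw:
            # Drop the query string, then decode the percent-escapes once.
            path = unquote(urlsplit(raw).path)
            # Expose the value as an extra attribute on the request...
            request.unencoded_path = path
            # ...and, optionally, override path_info (assumes no script prefix).
            request.path_info = path
        return self.get_response(request)


# Stand-in request object, only to show the behaviour outside Django.
fake_request = SimpleNamespace(
    META={"UNENCODED_URL": "/fields/%D8%B7%D8%A8%20%D9%86%D8%B8%D8%A7%D9%85%DB%8C?x=1"},
    path_info="/fields/",
)
middleware = UnencodedUrlMiddleware(lambda request: request.path_info)
seen_path = middleware(fake_request)
```

A class shaped like this is called with a single request argument, which is why it can be listed in MIDDLEWARE, whereas a WSGIHandler subclass expects (environ, start_response).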

You must be using URL rewrite v7.1.1980 or higher for this to work.

You could also use the UNENCODED_URL directly in the rewrite rule, but that may result in headaches with routing.

Regarding "I can encode urls successfully, but I think it is not the actual answer to this question.":

Yes, that is another option, but it may result in other issues, such as the one described in "IIS10 URL Rewrite 2.1 double encoding issue".

sytech
  • I am using the default configuration of rewrite. – keramat Aug 28 '22 at 15:14
  • super().__init__(*args, **kwargs) TypeError: object.__init__() takes exactly one argument (the instance to initialize) – keramat Aug 28 '22 at 15:43
  • @keramat this is a `__call__` method, not `__init__`; I'm not sure how you got that error from the above. Also, I'm not sure what you mean by "default" configuration. What is the version of IIS and what is the version of the module? – sytech Aug 28 '22 at 16:34
  • IIS version is 10.0.20348.1. URL Rewrite version is 7.2.1993. Yes, I use wfastcgi. Python version is 3.8.5 and Django version is 3.2.3. – keramat Aug 28 '22 at 16:38
  • I have edited the question with some new information. – keramat Aug 29 '22 at 14:45
  • @keramat I don't think you're using the handler correctly. As an alternative, you should also just be able to see the variable in `request.META` like `request.META['UNENCODED_URL']` -- your IIS site configuration might need to allow this variable to be passed to the application (I don't remember if it's available by default). – sytech Aug 29 '22 at 21:28
  • Thanks for your reply. What is wrong with the handler usage? Also, please note that I already tried request.META['UNENCODED_URL'], which returned empty again. I will try to find the IIS config you mentioned. – keramat Aug 30 '22 at 05:38

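You can split the URL on "/" and take the last part:

string = "http://sample.com/fields/طب%20نظامی"
last_part = string.split("/")[-1]
print(last_part)
# output: طب%20نظامی

Then pass it to Django's slugify:

from django.utils.text import slugify

slugify(last_part)

or, to keep the Persian characters:

slugify(last_part, allow_unicode=True)

I hope this helps :)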