4

I'm trying to build a URL-alias app that lets users create aliases for existing URLs on their website.

I'm trying to do this via middleware, where the request.META['PATH_INFO'] is checked against database records of aliases:

try:
    src = request.META['PATH_INFO']
    alias = Alias.objects.get(src=src)
    view = get_view_for_this_path(request)

    return view(request)
except Alias.DoesNotExist:
    pass

return None

However, for this to work correctly it is vital that (at least) PATH_INFO is changed to the destination path.

Now there are some snippets which allow the developer to create testing request objects (http://djangosnippets.org/snippets/963/, http://djangosnippets.org/snippets/2231/), but these state that they are intended for testing purposes.

Of course, it could be that these snippets are fit for use in a live environment, but I don't know enough about Django's request processing to assess this.

skaffman
Izz ad-Din Ruhulessin
  • Redirects are better not only because they make it easier to maintain code, but also because they allow you to have a unique URL for each page. That's probably better for your search rankings. – Evgeny Nov 22 '10 at 18:10
  • Can this be circumvented by blocking robots from directly accessing aliased urls? – Izz ad-Din Ruhulessin Nov 24 '10 at 13:47

3 Answers

4

Instead of the approach you're taking, have you considered the Redirects app?

It won't invisibly alias the path /foo/ to return the view bar(), but it will redirect /foo/ to /bar/.
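For reference, enabling the contrib redirects app is mostly a settings change. The fragment below is a minimal sketch, assuming a Django version of that era (which uses MIDDLEWARE_CLASSES; newer releases name the setting MIDDLEWARE) and a SITE_ID of 1. Both are assumptions, and a real project would have more entries in each list. Note that the fallback middleware only consults the redirects table when a response would otherwise be a 404.

```python
# settings.py (fragment, assumed values; merge into the project's existing lists)
INSTALLED_APPS = [
    'django.contrib.sites',      # the redirects app depends on the sites framework
    'django.contrib.redirects',
]

MIDDLEWARE_CLASSES = [
    # Looks up a redirect only when the response is a 404.
    'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
]

SITE_ID = 1  # assumed; must match the Site the redirects are attached to
```

The redirects themselves are then created as database records (for example through the admin), which matches the goal of letting end users configure URLs without touching code.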

Steve Jalim
  • Thanks for the suggestion stevejalim. However, the invisible aliasing was just what I was aiming for. Concerning the request, I mistakenly assumed that the request object is immutable (because QueryDicts are). It is possible to add and modify attributes of the request, including request.path, request.path_info etc. I haven't tested my experimental code yet, but will post the results as an answer when finished. – Izz ad-Din Ruhulessin Nov 22 '10 at 23:06
  • Am glad you've found a solution you're happy with, Izz, but my gut is that you'll end up with more pain by changing the request object's path as you go - not least in debugging/maintenance. Is it possible to just set new paths for the existing Django URLs (possibly with new, custom urlconfs for the contrib apps, if needed?) – Steve Jalim Nov 23 '10 at 10:17
2

(posted as answer because comments do not seem to support linebreaks or other markup)

Thanks for the advice; I have the same feeling about modifying request attributes. There must be a reason that the Django manual states that they should be considered read-only.

I came up with this middleware:

def process_request(self, request):
    try:
        # The alias record.
        obj = A.objects.get(src=request.path_info.rstrip('/'))
        # Modified http://djangosnippets.org/snippets/2262/
        view, args, kwargs = resolve_to_func(obj.dst + '/')
        request.path = request.path.replace(request.path_info, obj.dst)
        request.path_info = obj.dst
        request.META['PATH_INFO'] = obj.dst
        request.META['ROUTED_FROM'] = obj.src
        request.is_routed = True

        return view(request, *args, **kwargs)
    except A.DoesNotExist:  # No alias for this path.
        request.is_routed = False
    except TypeError:  # View does not exist.
        pass

    return None

But, considering the objections against modifying the request's attributes, wouldn't it be a better solution to just skip that part, and only add the is_routed and ROUTED_TO (instead of ROUTED_FROM) parts?

Code that relies on the original path could then use that key from META.
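A framework-free sketch of that non-mutating variant, for illustration only. ALIASES, VIEWS and bar_view are hypothetical stand-ins for the alias table, Django's URL resolver and a real view, and the request is a plain object rather than an HttpRequest. The point is that path_info stays untouched and only is_routed and META['ROUTED_TO'] are added.

```python
from types import SimpleNamespace

# Hypothetical stand-ins: in the real app these would be the alias
# records in the database and Django's URL resolver.
ALIASES = {'/foo': '/bar'}


def bar_view(request):
    # A view that still sees the original (aliased) path.
    return 'bar rendered for %s' % request.path_info


VIEWS = {'/bar': bar_view}


def process_request(request):
    """Dispatch to the aliased view without rewriting the request path."""
    dst = ALIASES.get(request.path_info.rstrip('/'))
    view = VIEWS.get(dst) if dst is not None else None
    if view is None:
        # No alias, or the alias points at a path with no view.
        request.is_routed = False
        return None

    request.META['ROUTED_TO'] = dst  # destination recorded, path left as-is
    request.is_routed = True
    return view(request)


request = SimpleNamespace(path_info='/foo/', META={'PATH_INFO': '/foo/'})
response = process_request(request)
```

Views that need the destination path would read request.META['ROUTED_TO'] when request.is_routed is true, and fall back to request.path_info otherwise.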

Doing this using URLConfs is not possible, because this aliasing is aimed at enabling the end-user to configure his own URLs, with the assumption that the end-user has no access to the codebase or does not know how to write his own URLConf.

Though it would be possible to write a function that converts a user-readable-editable file (XML for example) to valid Django urls, it feels that using database records allows a more dynamic generation of aliases (other objects defining their own aliases).

Izz ad-Din Ruhulessin
1

Sorry to necro-post, but I just found this thread while searching for answers. My solution seems simpler. Maybe a) I'm depending on newer Django features or b) I'm missing a pitfall.

I encountered this because there is a bot named "Mediapartners-Google" which is asking for pages with url parameters still encoded as from a naive scrape (or double-encoded depending on how you look at it.) i.e. I have 404s in my log from it that look like:

1.2.3.4 - - [12/Nov/2012:21:23:11 -0800] "GET /article/my-slug-name%3Fpage%3D2 HTTP/1.1" 1209 404 "-" "Mediapartners-Google
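To make the problem concrete, here is a small standard-library sketch (Python 3, not Django-specific) of what that requested path contains once it is percent-decoded. The helper name split_encoded_path is hypothetical and only illustrates the split that has to happen.

```python
from urllib.parse import parse_qs, unquote


def split_encoded_path(raw_path):
    """Decode a path whose query string was percent-encoded into it,
    then split it into the real path and the query parameters."""
    decoded = unquote(raw_path)              # '%3F' -> '?', '%3D' -> '='
    path, _, query = decoded.partition('?')  # query is '' if there is no '?'
    return path, parse_qs(query)


path, params = split_encoded_path('/article/my-slug-name%3Fpage%3D2')
print(path)    # /article/my-slug-name
print(params)  # {'page': ['2']}
```

Because the '?' arrives encoded, the server treats the whole string as the path, so no URL pattern matches and the request ends in a 404.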

Normally I'd just ignore a broken bot, but this one I want to appease because it ought to better target our ads (it's Google AdSense's bot), resulting in better revenue, if it can see our content. Rumor is it doesn't follow redirects, so I wanted to find a solution similar to the original question. I do not want regular clients accessing pages by these broken URLs, so I detect the user-agent. Other applications probably won't do that.

I agree a redirect would normally be the right answer.

My (complete?) solution:

from django.http import QueryDict
from django.core.urlresolvers import Resolver404, resolve

class MediapartnersPatch(object):
    def process_request(self, request):
        # short-circuit asap; .get() avoids a KeyError when no User-Agent is sent
        if request.META.get('HTTP_USER_AGENT') != 'Mediapartners-Google':
            return None

        idx = request.path.find('?')
        if idx == -1:
            return None

        oldpath = request.path
        newpath = oldpath[0:idx]
        try:
            url = resolve(newpath)
        except Resolver404:  # no URL pattern matches the stripped path
            return None

        # hand the view the cleaned path and the recovered query parameters
        request.path = newpath
        request.GET = QueryDict(oldpath[idx+1:])
        response = url.func(request, *url.args, **url.kwargs)
        response['Link'] = '<%s>; rel="canonical"' % (oldpath,)
        return response
Julian