Checking if a string is a prefix of any URL pattern

Question

Inadvisable or not, my Django site assigns each user a page at the root, e.g., /rgov.

I use a character set whitelist, so creating index.html or something nefarious should be prevented. My URL configuration also routes user pages last, so it should not be possible to hijack /admin or anything else by registering the corresponding name.

However, I'd like to prevent users from registering admin, since their page will be broken.

(Similar question, which does not have an ideal solution, as the following part describes.)

Here is my attempt:

def is_reserved(username):
  r = urls.resolvers.get_resolver('mysite.systemurls')
  hit = False
  for path in ('/{}', '/{}/'):
    try:
      r.resolve(path.format(username))
      hit = True
      break
    except urls.exceptions.Resolver404:
      continue
  return hit

Here, the mysite.systemurls module defines every URL pattern except for the user pages.

This does prevent picking the username admin because there is a route defined for /admin/. But it does not prevent api, because while there is /api/foo/bar, there is no route for /api/.

Is there a way to test if there is a route that is a suffix of /api/ (for example)? Since URL patterns are regular expressions, maybe it's not so easy, but in a theoretical sense it should be possible.

score 0 · Accepted Answer · answered May 31 '18 at 10:40

Here's my inelegant solution, sorry for the eyesore. I implemented a check using the Django system checks framework. The check collects all of the URL patterns used by the app and then extracts the first path component from each one. It then makes sure that none of those first path components

If you have some re_path in your URL patterns that breaks the assumptions being made, this will not work.

import re

from django import urls
from django.core.checks import register, Error, Tags, Warning

from . import usernames


@register(Tags.urls, Tags.security)
def check_scary_available_usernames(app_configs=None, **kwargs):
  '''
  Checks that there are no URL patterns /x/y where /x itself is not a pattern.
  In this case, /x might be available for user registration, which would be bad.
  '''
  errors, prefixes = [], set()
  r = urls.resolvers.get_resolver('mysite.systemurls')
  descend_into_resolver(r, [], errors, prefixes)

  # Check to make sure none of these usernames is taken or available
  for prefix in prefixes:
    if not prefix:
      continue
    if not usernames.is_reserved(prefix):
      errors.append(Warning(
        'There is no restriction on registering the forbidden username {}, '
        'which would conflict with a URL in use by the system.'.format(prefix)
      ))
    if usernames.user_exists(prefix):
      errors.append(Warning(
        'A user has the forbidden username {}, which conflicts with a URL in '
        'use by the system.'.format(prefix)
      ))
  return errors


def descend_into_resolver(resolver, chain, errors, prefixes):
  for up in resolver.url_patterns:
    regex = up.pattern.regex.pattern
    if isinstance(up, urls.resolvers.URLResolver):
      descend_into_resolver(up, chain + [regex], errors, prefixes)
    elif isinstance(up, urls.resolvers.URLPattern):
      collect_pattern_prefix(chain + [regex], errors, prefixes)
    else:
      errors.append(Warning(
        'Resolver has unexpected URL pattern: {}'.format(repr(up))
      ))


def collect_pattern_prefix(patterns, errors, prefixes):
  # Remember, we are matching against a regular expression pattern! We are not
  # taking a robust approach; if it fails, this could report spurious warnings.
  uberpattern = r''
  for i, pattern in enumerate(patterns):
    if i != 0:
      pattern = pattern.lstrip('^')
    if i != len(patterns) - 1:
      pattern = pattern.rstrip('$')
    uberpattern += pattern
  uberpattern = uberpattern.replace('\\/', '/')

  m = re.match(r'^\^?([^/$]*)', uberpattern)
  if m is None:
    errors.append(Warning(
      'Could not determine first component of URL pattern '
      '{}'.format(uberpattern)
    ))
  else:
    prefixes.add(m.group(1))

Checking if a string is a prefix of any URL pattern

1 Answers1