5

I have a model containing ranges of IP addresses, similar to this:

class Country(db.Model):
  begin_ipnum = db.IntegerProperty()
  end_ipnum = db.IntegerProperty()

On a SQL database, I would be able to find rows which contained an IP in a certain range like this:

SELECT * FROM Country WHERE ipnum BETWEEN begin_ipnum AND end_ipnum

or this:

SELECT * FROM Country WHERE begin_ipnum < ipnum AND end_ipnum > ipnum

Sadly, GQL only allows inequality filters on one property, and doesn't support the BETWEEN syntax. How can I work around this and construct a query equivalent to these on App Engine?

Also, can a ListProperty be 'live' or does it have to be computed when the record is created?

question updated with a first stab at a solution:

So based on David's answer below and articles such as these:

http://appengine-cookbook.appspot.com/recipe/custom-model-properties-are-cute/

I'm trying to add a custom field to my model like so:

class IpRangeProperty(db.Property):
  def __init__(self, begin=None, end=None, **kwargs):
    if not isinstance(begin, db.IntegerProperty) or not isinstance(end, db.IntegerProperty):
        raise TypeError('Begin and End must be Integers.')
    self.begin = begin
    self.end = end
    super(IpRangeProperty, self).__init__(self.begin, self.end, **kwargs)

  def get_value_for_datastore(self, model_instance):
    begin = self.begin.get_value_for_datastore(model_instance)
    end = self.end.get_value_for_datastore(model_instance)
    if begin is not None and end is not None:
      return range(begin, end)

class Country(db.Model):
  begin_ipnum = db.IntegerProperty()
  end_ipnum = db.IntegerProperty()
  ip_range = IpRangeProperty(begin=begin_ipnum, end=end_ipnum)

The thinking is that after i add the custom property i can just import my dataset as is and then run queries on based on the ListProperty like so:

q = Country.gql('WHERE ip_range = :1', my_num_ipaddress)

When i try to insert new Country objects this fails though, complaning about not being able to create the name:

...
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/db/__init__.py", line 619, in _attr_name
return '_' + self.name
TypeError: cannot concatenate 'str' and 'IntegerProperty' objects

I tried defining an attr_name method for the new property or just setting self.name but that does not seem to help. Hopelessly stuck or heading in the right direction?

tijs
  • 797
  • 7
  • 24

2 Answers2

2

Short answer: Between queries aren't really supported at the moment. However, if you know a priori that your range is going to be relatively small, then you can fake it: just store a list on the entity with every number in the range. Then you can use a simple equality filter to get entities whose ranges contain a particular value. Obviously this won't work if your range is large. But here's how it would work:

class M(db.Model):
    r = db.ListProperty(int)

# create an instance of M which has a range from `begin` to `end` (inclusive)
M(r=range(begin, end+1)).put()

# query to find instances of M which contain a value `v`
q = M.gql('WHERE r = :1', v)

The better solution (eventually - for now the following only works on the development server due to a bug (see issue 798). In theory, you can work around the limitations you mentioned and perform a range query by taking advantage of how db.ListProperty is queried. The idea is to store both the start and end of your range in a list (in your case, integers representing IP addresses). Then to get entities whose ranges contain some value v (i.e., between the two values in your list), you simply perform a query with two inequality filters on the list - one to ensure that v is at least as big as the smallest element in the list, and one to ensure that v is at least as small as the biggest element in the list.

Here's a simple example of how to implement this technique:

class M(db.Model):
    r = db.ListProperty(int)

# create an instance of M which has a rnage from `begin` to `end` (inclusive)
M(r=[begin, end]).put()

# query to find instances of M which contain a value `v`
q = M.gql('WHERE r >= :1 AND r <= :1', v)
David Underhill
  • 15,896
  • 7
  • 53
  • 61
  • 1
    The bug is the way the dev_appserver performs, unfortunately: the correct behaviour is the current production behaviour. – Nick Johnson Jul 27 '10 at 08:46
  • This looks like a promising solution, nice and simple. From the docs it seems i could make a custom model property that reports the list value 'live' based on the values of begin_ipnum and end_ipnum. But as far as i can tell i would not be able to query a custom property using GQL. So i'll try a computed ListProperty value for the range on object creation first. Will report back how i fare! – tijs Jul 27 '10 at 08:51
  • Is there an update on this question as this was answered 7.5 years ago? Is there a possible between query nowadays? – Malte Mar 27 '18 at 13:44
2

My solution doesn't follow the pattern you have requested, but I think it would work well on app engine. I'm using a list of strings of CIDR ranges to define the IP blocks instead of specific begin and end numbers.

from google.appengine.ext import db    
class Country(db.Model):
    subnets = db.StringListProperty()
    country_code = db.StringProperty()

c = Country()
c.subnets = ['1.2.3.0/24', '1.2.0.0/16', '1.3.4.0/24']
c.country_code = 'US'
c.put()

c = Country()
c.subnets = ['2.2.3.0/24', '2.2.0.0/16', '2.3.4.0/24']
c.country_code = 'CA'
c.put()

# Search for 1.2.4.5 starting with most specific block and then expanding until found    
result = Country.all().filter('subnets =', '1.2.4.5/32').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.4/31').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.4/30').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.0/29').fetch(1)
# ... repeat until found
# optimize by starting with the largest routing prefix actually found in your data (probably not 32)
cope360
  • 6,195
  • 2
  • 20
  • 33
  • Neat idea. This might often require many round-trips to the datastore though - which is okay if you don't need to scale, or if this kind of lookup will be infrequent. (Each round-trip will cost at least 60-80ms). – David Underhill Jul 26 '10 at 22:12
  • My solution trades storage for speed vs your solution. Assuming we account for all addresses, yours will have about 3B list items. To reduce round-trips in mine you could do an IN query for all possibilities at once. – cope360 Jul 26 '10 at 22:49
  • An IN query would definitely help cut down on the run time - though this would make for a very expensive query since the IN query would be split into [up to] 30 sub-queries. (And an in-memory longest prefix match on the returned results if you wanted such a match; same with my solution.) That said, though expensive it is possible to execute on large address ranges :). (+1) – David Underhill Jul 26 '10 at 23:21
  • This is pretty smart but it's easier for me to wrap my head around David's solution so i'll try that first. The full dataset, ip ranges per country) that i'm using as my starting point is only about 130k entries by the way so it's not too bad. – tijs Jul 27 '10 at 08:55
  • @tijs my reference to items is the number IP addresses in all the lists total. You have 130k ranges but each range could contains tens of thousands of addresses. David's solution is much simpler to implement. Good luck! – cope360 Jul 27 '10 at 13:46