
I need to create a whitelist / blocklist for IPv6.

What in-memory options are there for maintaining information on a per-host or per-network basis in IPv6?

I currently use a HashTable<UInt32> for IPv4, but I never really mastered subnet tracking, CIDR, etc. IPv6 has many different ways of expressing IP ranges, which complicates my effort to group IPs together and account for them.

That being said, what is the most efficient way (in search speed or in-memory compactness) to maintain such a blocklist / whitelist?

TL;DR Question

  • How do I find if a UInt128 is in a list/btree/hashtable? Which data structure is appropriate for this?

  • How do I find IPs that are "near" each other? This is normally expressed with CIDR, but it can also be expressed as a value comparison of a BigInt.
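
To make the two bullets concrete, here is a minimal sketch, assuming the address is kept as 16 bytes in network (big-endian) order. The names `Addr128`, `Addr128Hash`, `AddrSet`, `cidr_range` and `in_range` are hypothetical. With big-endian bytes, the lexicographic comparison of `std::array` gives the same result as comparing the two 128-bit integers, so a CIDR block becomes a simple first/last range check.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical 128-bit key: the 16 address bytes in network (big-endian) order.
using Addr128 = std::array<std::uint8_t, 16>;

// FNV-1a over the 16 bytes; any reasonable byte hash would work here.
struct Addr128Hash {
    std::size_t operator()(const Addr128 &a) const noexcept {
        std::uint64_t h = 14695981039346656037ull;
        for (std::uint8_t b : a) {
            h ^= b;
            h *= 1099511628211ull;
        }
        return static_cast<std::size_t>(h);
    }
};

// Bullet 1: exact membership of a single address, average O(1) per lookup.
using AddrSet = std::unordered_set<Addr128, Addr128Hash>;

// Bullet 2: a CIDR block expressed as its first and last address.
struct Range {
    Addr128 first;
    Addr128 last;
};

// Compute the range covered by network/bits (bits above 128 are clamped).
Range cidr_range(const Addr128 &network, unsigned bits) {
    bits = std::min(bits, 128u);
    Range r{network, network};
    for (unsigned i = 0; i < 16; ++i) {
        // Number of prefix bits that fall inside byte i (0..8).
        unsigned keep = bits > i * 8 ? std::min(bits - i * 8, 8u) : 0u;
        std::uint8_t mask = static_cast<std::uint8_t>(0xFFu << (8 - keep));
        r.first[i] = static_cast<std::uint8_t>(network[i] & mask);
        r.last[i] = static_cast<std::uint8_t>(network[i] | static_cast<std::uint8_t>(~mask));
    }
    return r;
}

// Lexicographic compare on big-endian bytes equals numeric compare.
bool in_range(const Addr128 &a, const Range &r) {
    return r.first <= a && a <= r.last;
}
```

The hash set only answers "is this exact address listed"; the range check answers "is this address inside this block". Neither structure alone finds the covering block among many without scanning or sorting the ranges.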

One approach that just crossed my mind is based on how a cryptographic accumulator works. Maybe there is a way to apply the "membership" abilities of an accumulator to determining whether a number is a member of a set.

makerofthings7
  • I fear this might be too broad, as it's a huge (and probably not very well-charted) design space and the best option depends a lot on which kinds of ranges you need to support, what kind of rules you expect, how flexible you want the filter to be and how much effort you want to invest (some firewalls compile filter rules *to machine code*). – Aug 11 '14 at 20:40
  • @delnan I'm thinking I'm only dealing with two situations here: 1- Is a UInt128 in a list, or not. 2- Doing a binary comparison of contiguous `1` bits indicating a similar netmask. Perhaps this would be a Tree of some type. – makerofthings7 Aug 11 '14 at 20:44
  • This might be of interest: [Using Multiple Hash Functions to Improve IP Lookups](http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=6729BE6751B4FB873DD8589B6E221C87?doi=10.1.1.28.8601&rep=rep1&type=pdf). It seems that doing this right, for subnets at least, is quite complicated. And many use cases need subnets, especially with IPv6, where one gets at least a /64 network. – cypres Aug 14 '14 at 11:58

1 Answer


For whitelisting, I store each IPv6 network as a std::pair<in6_addr, uint8_t>: the network address and its CIDR prefix length in bits.

When checking against the whitelist, I simply iterate over the associated networks and do the CIDR matching with:

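#include <arpa/inet.h>   // htonl
#include <netinet/in.h>  // in6_addr
#include <stdint.h>
#include <string.h>      // memcmp

// Returns true if the first `bits` bits of address and network are equal.
bool cidr6_match(const in6_addr &address, const in6_addr &network, uint8_t bits) {
#ifdef __linux__  // predefined by the compiler on Linux (the original tested LINUX)
  const uint32_t *a = address.s6_addr32;
  const uint32_t *n = network.s6_addr32;
#else             // BSD-style headers, e.g. macOS
  const uint32_t *a = address.__u6_addr.__u6_addr32;
  const uint32_t *n = network.__u6_addr.__u6_addr32;
#endif
  if (bits > 128) {
    return false;  // invalid prefix length; would index past the address
  }
  int bits_whole, bits_incomplete;
  bits_whole = bits >> 5;         // number of whole u32 words in the prefix
  bits_incomplete = bits & 0x1F;  // number of prefix bits in the last, partial u32
  if (bits_whole) {
    // Compare the whole words byte-wise (bits_whole * 4 bytes).
    if (memcmp(a, n, bits_whole << 2)) {
      return false;
    }
  }
  if (bits_incomplete) {
    // Words are in network byte order, so build the mask in host order and convert it.
    uint32_t mask = htonl((0xFFFFFFFFu) << (32 - bits_incomplete));
    if ((a[bits_whole] ^ n[bits_whole]) & mask) {
      return false;
    }
  }
  return true;
}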

Adapted from addr_match in the Linux kernel's xfrm code.

More on matching with CIDR: IP cidr match function
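
The iteration over the whitelist described above could look roughly like the following sketch. It is self-contained, so instead of cidr6_match it uses a byte-wise variant (`prefix_match`) built on the portable `s6_addr` member. The names `Cidr6`, `prefix_match`, `make_cidr6` and `is_whitelisted` are hypothetical and not part of the code above.

```cpp
#include <arpa/inet.h>   // inet_pton
#include <netinet/in.h>  // in6_addr
#include <sys/socket.h>  // AF_INET6
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Whitelist entry: network address plus CIDR prefix length in bits.
using Cidr6 = std::pair<in6_addr, std::uint8_t>;

// Byte-wise prefix comparison (an illustrative variant, not cidr6_match itself).
bool prefix_match(const in6_addr &address, const in6_addr &network, std::uint8_t bits) {
    if (bits > 128) return false;
    const unsigned whole = bits / 8;  // number of whole bytes in the prefix
    const unsigned rest = bits % 8;   // prefix bits in the following byte
    if (std::memcmp(address.s6_addr, network.s6_addr, whole) != 0) return false;
    if (rest == 0) return true;
    const std::uint8_t mask = static_cast<std::uint8_t>(0xFFu << (8 - rest));
    return ((address.s6_addr[whole] ^ network.s6_addr[whole]) & mask) == 0;
}

// Hypothetical helper: parse a textual network address into a whitelist entry.
bool make_cidr6(const char *text, std::uint8_t bits, Cidr6 &out) {
    if (bits > 128 || inet_pton(AF_INET6, text, &out.first) != 1) return false;
    out.second = bits;
    return true;
}

// Linear scan: true if any whitelisted network contains the address.
bool is_whitelisted(const std::vector<Cidr6> &whitelist, const in6_addr &address) {
    for (const Cidr6 &entry : whitelist) {
        if (prefix_match(address, entry.first, entry.second)) return true;
    }
    return false;
}
```

Each lookup costs one comparison per stored network, which is fine for short per-account lists but grows linearly with the list size.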

I'm excited to hear what others can come up with in terms of more exotic data structures. For my own use case I only have a handful of IPs to validate against for each account, so a simple list suffices for me.

cypres
  • @makerofthings7 radix trees seem like a very good fit. It comes down to how they are implemented. I just learned about them too ;) – cypres Aug 15 '14 at 07:57
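
As a follow-up to the radix tree suggestion, here is a minimal sketch of the idea, assuming addresses are held as 16 bytes in network order. It is an uncompressed binary trie, which is simpler than a true radix (Patricia) tree because it does not collapse single-child paths. The names `Addr128` and `PrefixTrie` are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <memory>

// Hypothetical 128-bit key: the 16 address bytes in network (big-endian) order.
using Addr128 = std::array<std::uint8_t, 16>;

class PrefixTrie {
    struct Node {
        std::unique_ptr<Node> child[2];
        bool terminal = false;  // true if a stored prefix ends at this node
    };
    Node root_;

    // Bit i of the address, counting from the most significant bit.
    static unsigned bit(const Addr128 &a, unsigned i) {
        return (a[i / 8] >> (7 - i % 8)) & 1u;
    }

public:
    // Store network/bits; prefix lengths above 128 are treated as 128.
    void insert(const Addr128 &network, unsigned bits) {
        if (bits > 128) bits = 128;
        Node *n = &root_;
        for (unsigned i = 0; i < bits; ++i) {
            std::unique_ptr<Node> &next = n->child[bit(network, i)];
            if (!next) next.reset(new Node());
            n = next.get();
        }
        n->terminal = true;
    }

    // True if any stored prefix covers the address; walks at most 128 nodes.
    bool contains(const Addr128 &address) const {
        const Node *n = &root_;
        for (unsigned i = 0; i < 128; ++i) {
            if (n->terminal) return true;
            n = n->child[bit(address, i)].get();
            if (!n) return false;
        }
        return n->terminal;
    }
};
```

Lookup time here depends on the prefix length rather than on the number of stored networks, at the cost of one heap node per prefix bit. A path-compressed radix tree would reduce that memory overhead.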