4

The following code appears to be the hottest spot in my program.

JAVA_OPTS=-Xprof output:

     Compiled + native   Method
  5.7%   173  +     0    scala.collection.IndexedSeqOptimized$class.slice
  5.1%   156  +     0    scala.collection.IndexedSeqOptimized$class.foreach
  2.9%    87  +     0    java.util.regex.Pattern$BmpCharProperty.match
  2.5%    76  +     0    scala.collection.IndexedSeqOptimized$class.sameElements
  2.4%    73  +     0    trafacct.SubNet.contains

Slice, sameElements and even foreach calls seem to be most used from here too. Can someone give an advice or two on how to optimize contains() method? Maybe some techniques allowing Bytes analysis without converting them to integers? Or solid whole-sequence approach without slice?

Function SubNet.contains() matches an IP address against subnet.

object SubNet {
    def toInts(bytes: Seq[Byte]): Seq[Int] = bytes.map(_.toInt & 0xFF)
}

case class SubNet(ip:InetAddress,  maskLength:Int) extends HostCategory {
    import SubNet.toInts
    private val bytes: Int = maskLength / 8
    private val subnet = toInts(ip.getAddress)
    private val bits = bytes * 8 - maskLength
    def contains(host: Host) = {
        if (host.ip == null && ip == null) {
            true
        } else if (this.ip == null) {
            false
        } else {
            val address = toInts(host.ip.getAddress)
            if (address.length != subnet.length) {
                false
            } else {
                if (address.slice(0, bytes) != subnet.slice(0, bytes)) {
                    false
                } else {
                    ((address(bytes) >> (8-bits) ^ subnet(bytes) >> (8-bits)) & 0xFF) == 0
                }
            }
        }
    }
}

I understand, that this optimization won't give me much better throughput, I just feel that I'm doing something wrong spending so much time inside this simple function.

This code should be IPv6 (16 bytes) compatible, and I don't like the idea of handling IPv4 case separately.

Basilevs
  • 22,440
  • 15
  • 57
  • 102
  • hmm, here is a implementation that you can use for testing your code, see their testcases http://docs.guava-libraries.googlecode.com/git-history/release09/javadoc/index.html – oluies Sep 05 '11 at 20:51

2 Answers2

3

You're not doing anything wrong per se; you're just using collections that are meant for ease of use not performance when handling primitives.

If you want to speed this up, you'll get the largest boost by switching to using arrays and while loops. It's not entirely clear to me that the code you wrote even works for IPv6 except for IPv4 addresses stored in IPv6 format, since you could have a subnet with more than 256 items. Also, by testing lengths you're assuming no mixed IPv6/IPv4 representations of the same address.

I'd forget the whole "toInts" thing and just store byte arrays; then do something like (warning, untested)

def contains(host: Host): Boolean = {
//...
  if (address.length != subnet.length) false
  else {
    var i = 0
    while (i<address.length-1) {
      if (address(i) != subnet(i)) return false
      i += 1
    }
    (address(i)&0xFF) >> (8-bits) ^ (subnet(i)&0xFF) >> (8-bits) == 0
  }
}

It's really not any more complicated than your original solution, and should run ~10x faster.

Rex Kerr
  • 166,841
  • 26
  • 322
  • 407
  • subnet(i)&&0xFF for Bytes is bothering me. Bytes are signed and shift of negative numbers acts strange : -1>>2 == -1. Won't this affect comparison? – Basilevs Sep 06 '11 at 01:01
  • I don't really understand your phrase about 255 items in subnet. Why can't there be more? – Basilevs Sep 06 '11 at 01:09
  • Indeed, elimination of toInt mapping and sequence slicing gives drastic improvement. Thank you. – Basilevs Sep 06 '11 at 01:20
  • As for last byte of subnet, I have to leave it converted: ((address(bytes).toInt >> (8-bits) ^ subnet(bytes).toInt >> (8-bits)) & 0xFF) == 0 – Basilevs Sep 06 '11 at 01:23
  • @Basilevs - Bytes are signed but all math is Int math. So `b & 0xFF` is an Int from 0 to 255. The `&&` was a typo. You want to do the bitmask before the shift. – Rex Kerr Sep 06 '11 at 09:54
  • this code does not calculate correctly, please see my error below. – Franz Bettag Dec 13 '12 at 01:43
1

With this code, it does not validate correctly.

For example:

scala> val ip = java.net.InetAddress.getByName("::ffff:1.2.176.0")
ip: java.net.InetAddress = /1.2.176.0

scala> val prefix = new InetPrefix(ip, 20)
prefix: InetPrefix = InetPrefix@6febf6f9

scala> prefix.contains(java.net.InetAddress.getByName("::ffff:1.2.176.20"))
res11: Boolean = true

scala> prefix.contains(java.net.InetAddress.getByName("::ffff:1.2.191.20"))
res12: Boolean = false

But if you calculate that network: (1.2.176.0/20)

$ sipcalc 1.2.176.0/20

-[ipv4 : 1.2.176.0/20] - 0

[CIDR]
Host address        - 1.2.176.0
Host address (decimal)  - 16953344
Host address (hex)  - 102B000
Network address     - 1.2.176.0
Network mask        - 255.255.240.0
Network mask (bits) - 20
Network mask (hex)  - FFFFF000
Broadcast address   - 1.2.191.255
Cisco wildcard      - 0.0.15.255
Addresses in network    - 4096
Network range       - 1.2.176.0 - 1.2.191.255
Usable range        - 1.2.176.1 - 1.2.191.254

-

I rewrote both (IPv4 and IPv6) in Scala put it for everyone on GitHub. It now also validates within ranges (so /20 etc will be regarded, which the old one did not do.)

You can find the code (i separated it into IPv4 and IPv6) at https://github.com/wasted/scala-util/blob/master/src/main/scala/io/wasted/util/InetPrefix.scala

I also created a blogpost about this.

Franz Bettag
  • 437
  • 2
  • 8
  • 1
    I never claimed it did validate correctly. I was just showing the OP how to get their (probably broken) code to run faster. If you can fix it, that would make for a good answer--otherwise your "answer" is a bit too much like a comment! – Rex Kerr Dec 13 '12 at 15:48
  • I had to write 2 ways of checking it, IPv6 was your way while IPv4 relies on calculating the start/end of the range as Longs and then compare the value. Both are very straight forward. Please feel free to submit suggestions. :) – Franz Bettag Dec 14 '12 at 17:57