9

Let's say there is a table having only one field. Table is named address and has a field named ip which contains an IPV4 address as its value

Sample Data

192.168.120.201
192.168.120.202
192.168.120.203
192.168.120.204
192.168.120.205
192.168.121.3
192.168.121.50

I need to run a query on this table which will return data COUNT on The First Three Octets

Expected Output

network count

192.168.120 5

192.168.121 3

I tried using SUBSTR like

SELECT SUBSTR(ip,1,10) as network,COUNT(*) as c FROM address GROUP BY network HAVING(c>1)

But the problem is that this SUBSTR will only work as expected if all the first 3 Octets have 3 digits each, but this will break on any ip address which does not have 3 digits each in first three octets. For example this will not work for

192.168.0.0

192.2.3.50

192.23.4.60

Question

Is there any alternate to the above query which will work in all the cases above?

Eric C
  • 3,886
  • 3
  • 30
  • 26
Hanky Panky
  • 46,730
  • 8
  • 72
  • 95

3 Answers3

13

Don't do string operations. you'd be better off converting the IPs to ints and using some bitmask, e.g.

SELECT INET_NTOA(INET_ATON(ipfield) & 0xFFFFFF00)
Hanky Panky
  • 46,730
  • 8
  • 72
  • 95
Marc B
  • 356,200
  • 43
  • 426
  • 500
  • Great answer, but you still need to trim the trailing octet. – Álvaro González Apr 04 '13 at 16:51
  • that'd be a display thing. since the final octet's been zeroed-out, you can group on the result of the inet transform. e.g. they'd all be `x.x.x.0` – Marc B Apr 04 '13 at 16:54
  • Thanks Marc for this Elegant answer. Just like you, I expected this one to perform faster than the two answers given below. However on same data this query is taking `0.28` seconds whereas the `substring_index` one is taking `0.15` seconds. Is there any catch I am missing? – Hanky Panky Apr 04 '13 at 17:20
  • probably the double-string round tripping, since you're making mysql do string->int->masked->string, rather than just string->string. Personally, if I'm storing ipv4 addresses, i'll store them as an int, not a varchar. 4bytes v.s. 15. Also makes it easier to this sort of masking/ranging directly in the db, rather than having to do string hacking. – Marc B Apr 04 '13 at 17:28
  • 1
    Thanks! that makes sense completely. Since my focus is more on counts rather than displaying the octets, I think I can use this solution without `INET_NTOA`. This then performs faster than `substring_index`. `INT` representation will do as good as `VARCHAR`. Thanks again! – Hanky Panky Apr 04 '13 at 17:40
  • sure. if you're just grouping on the masked value, then grouping on the int value works just the same as on the stringified version. – Marc B Apr 04 '13 at 17:42
6

You could use substring_index to do this:

SELECT substring_index(network, '.', 3) AS Octet,
       COUNT(*)
  FROM address
  GROUP BY Octet

Here's a SQLFiddle example

Martin
  • 16,093
  • 1
  • 29
  • 48
  • Thanks Martin, Great answer. +1, All the three answers given are valid, will decide the accepted one based on performance after running some tests – Hanky Panky Apr 04 '13 at 17:23
  • I like this answer, because of it's simplicity. It can be extended easily to 2 octets! – Laoneo Mar 02 '15 at 10:07
5

I would suggest using SUBSTRING_INDEX for this:

SELECT SUBSTRING_INDEX(ip, '.', 3) as network, COUNT(*) as c
FROM address
GROUP BY network
HAVING(c>1)
LIMIT 500
Mike Brant
  • 70,514
  • 10
  • 99
  • 103