15

In Oracle, what is the appropriate data type or technique for representing network addresses, which addresses may be IPv4 or IPv6?

Background: I'm converting a table recording network activity, built using the PostgreSQL inet data type to hold both v4 and v6 addresses in the same table.

No row contains both v4 and v6 addresses, however. (That is, a record is either from a machine's v4 stack, or a machine's v6 stack.)

pilcrow
  • 56,591
  • 13
  • 94
  • 135
  • 1
    Do you make aggregations/search on that table by IP address? how many rows do you expect to have in 1 year? – jachguate Feb 09 '11 at 21:07
  • Aggregations and searching by IP, yes. Number of rows, possibly hundreds of millions. (Do you have small/medium/large recommendations?) – pilcrow Feb 09 '11 at 21:27
  • please, use the @jachguate notation in your comments if you want me to get notified about it. For what you say, I think the better approach is the first of @Alain answer. – jachguate Feb 10 '11 at 15:53

7 Answers7

16

In Oracle, what is the appropriate data type or technique for representing network addresses, which addresses may be IPv4 or IPv6

There are two approaches :

  1. storing only.
  2. storing the conventional representation

For storing only. An IPV4 address should be an integer (32bits are enough). For IP V6, 128 bits, INTEGER (which is similar to Number(38)) will do. Of course, that's storing. That approach takes the view that the representation is a matter for the application.

If one take the opposite strategy, of storing the conventional representation, one needs to make sure that IP V4 and IPV6 addresses have only one conventional (string) representation. It's well-known for ipV4. As for IPV6, there is also a standard format.

My preference goes to the first strategy. In the worst case, you can adopt an hybrid approach (non acid though) and store both the binary and the ascii representation side by side with "priority" to the binary value.

No row contains both v4 and v6 addresses, however.

The standard representation of a IPV4 address in IPV6 format is : ::ffff:192.0.2.128.

I don't know the context but I would however reserve 2 columns, one for IPV4 and the other for a distinct ipV6 address.

Update
Following a good comment by @sleepyMonad's, I'd like to point out that instead of the Number data type it is preferable to use the INTEGER data type, which will happily accommodate the highest possible value that can be expressed with a 128 bits integer 'ff...ff' (which would need 39 decimal digits). 38 is the highest power of ten ranging from 0 to 9 that can be encoded on 128 bits but one can still insert the maximum unsigned value for 2**128 - 1 (decimal 340282366920938463463374607431768211455). Here is a small test to illustrate this possibility.

create table test (
  id integer primary key,
  ipv6_address_bin INTEGER );

-- Let's enter 2**128 - 1 in the nueric field
insert into test (id, ipv6_address_bin) values ( 1, to_number ( 'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') ) ;

-- retrieve it to make sure it's not "truncated".
select to_char ( ipv6_address_bin, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' ) from test where id = 1 ;
-- yields 'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF'

select to_char ( ipv6_address_bin ) from test where id = 1 ;
-- yields 340282366920938463463374607431768211455

select LOG(2, ipv6_address_bin) from test where id = 1 ;
-- yields 128

select LOG(10, ipv6_address_bin) from test where id = 1 ;
-- yields > 38
Alain Pannetier
  • 9,315
  • 3
  • 41
  • 46
  • Definitely agree with separate columns for IPv4 and IPv6 values. – Gary Myers Feb 09 '11 at 22:01
  • Would you also constrain the v4 and v6 columns so that one and only one of them had to be NULL? (Something like `CHECK((src_v4 IS NULL and src_v6 IS NOT NULL) or (src_v4 IS NOT NULL and src_v6 IS NULL))`) ? – pilcrow Feb 10 '11 at 15:40
  • It all depends on what the table data is supposed to contain. Assuming the IP address is not the key (seems obvious from your question), and you are tracking users, then I would keep the door open because people could connect from various IP addresses. If instead, your table tracks connection events rather than users (presumably with a one to n relationship), then yes, because you connect either in IP v4 or in V6 (even though, as I said, IP v4 addresses have a V6 representation). – Alain Pannetier Feb 10 '11 at 15:48
  • 1
    If you have ipv4 OR ipv6 but never both, store ipv4 addresses using their ipv6 equivalent – ThiefMaster Feb 10 '11 at 15:49
  • 1
    @ThiefMaster. Yes, indeed, because the ::ffff:a.b.c.d range is reserved precisely for this case. – Alain Pannetier Feb 10 '11 at 15:52
  • 1
    @Alain Pannetier: as 2^128 has 39 decimal digits, it's not obvious how this fits in NUMBER(38). – sleepyMonad Jun 06 '11 at 07:33
  • @sleepyMonad. I've added sample sql code to illustrate the possibility to encode ffff:ffff:.......ffff ipv6 addresses in an INTEGER field (you're right in some way it is easier with INTEGER than NUMBER(38) but INTEGER itself is also 128 bits). NUMBER(38) will not work as smoothly as INTEGER. Correcting. – Alain Pannetier Jun 06 '11 at 11:38
  • @Alain Pannetier. This is completely wrong. In Oracle the Integer Datatype is a Synonym vor NUMBER(38). SQL> create table test (id integer primary key,ipv6_address_bin INTEGER ); Table created. SQL> desc test Name Null? Type ----------------------------------------- -------- ---------------------------- ID NOT NULL NUMBER(38) IPV6_ADDRESS_BIN NUMBER(38) SQL> drop table test; Table dropped. SQL> – miracle173 Oct 20 '11 at 12:29
  • @miracle173. I tried the code above (in the answer) with both INTEGER and NUMBER(38). Worked like a breeze with INTEGER but not with NUMBER(38). 11gR2 IIRC. Did you try it. Does it work for 11gR2? – Alain Pannetier Oct 20 '11 at 14:02
  • 3
    @Alain Pannetier. I tried your code and you are right, it worked. the Output of the describe-functions seems to be wrong. i queried the user_tab_columns and it gives me for the data_precision NULL for and INTEGER column and 38 for an NUMBER(38) column. I will investigate this further when I have more time. I think you should not rely on this undocumented feature. – miracle173 Oct 24 '11 at 22:42
  • i tested this with a sequence of numbers and could not find any ip's that were not representable – ShoeLace Aug 22 '13 at 05:01
  • Be aware if you use such values in any PL/SQL code. `i INTEGER := 2**128-1;` (i.e. `ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff`) raises an **ORA-06502: PL/SQL: numeric or value error: number precision too large**. `INTEGER` as column data type is different to `INTEGER` PL/SQL data type! – Wernfried Domscheit May 18 '18 at 06:03
8

Store it in RAW.

RAW is variable-length byte array, so....

  • just treat the IPv4 as an array of 4 bytes
  • and IPv6 as an array of 16 bytes

...and store either one of them in directly in RAW(16).


RAW can be indexed, be a PK, UNIQUE or FOREIGN KEY, so you can do anything you normally could with VARCHAR2 or INT/NUMBER/DECIMAL, but with less conversion and storage overhead.

To illustrate the storage overhead of INT over RAW, consider the following example:

CREATE TABLE IP_TABLE (
    ID INT PRIMARY KEY,
    IP_RAW RAW(16), 
    IP_INT INT
);

INSERT INTO IP_TABLE (ID, IP_RAW, IP_INT) VALUES (
    1,
    HEXTORAW('FFFFFFFF'),
    TO_NUMBER('FFFFFFFF', 'XXXXXXXX')
);

INSERT INTO IP_TABLE (ID, IP_RAW, IP_INT) VALUES (
    2,
    HEXTORAW('FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF'),
    TO_NUMBER('FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
);

SELECT VSIZE(IP_RAW), VSIZE(IP_INT), IP_TABLE.*  FROM IP_TABLE;

The result (under Oracle 10.2):

table IP_TABLE created.
1 rows inserted.
1 rows inserted.
VSIZE(IP_RAW)          VSIZE(IP_INT)          ID                     IP_RAW                           IP_INT                 
---------------------- ---------------------- ---------------------- -------------------------------- ---------------------- 
4                      6                      1                      FFFFFFFF                         4294967295             
16                     21                     2                      FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF 340282366920938463463374607431768211455 
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
4

@Alain Pannetier (because I can't comment yet): The ANSI INTEGER datatype maps to NUMBER(38) in Oracle according to http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements001.htm#i54335. Below the table you find the info that NUMBER only provides 126bit binary precision which is not enought for a 128bit IPv6 address. The maximum value might store fine but there will be addresses that are rouned to the next lower one.

The internal numeric format is ROUND((length(p)+s)/2))+1 (http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/datatype.htm#i16209).

Update: After fiddling around with the issue again I've now found a solution that allows high performance querying of networks that contain an IPv6 address: store the IPv6 addresses and subnet masks in RAW(16) columns and compare them using UTL_RAW.BIT_AND:

SELECT name, DECODE(UTL_RAW.BIT_AND('20010DB8000000000000000000000001', ipv6_mask), ipv6_net, 1, 0)
FROM ip_net
WHERE ipv6_net IS NOT NULL;
Alexander Hartmaier
  • 2,178
  • 12
  • 21
1

you could also use a custom oracle object.

SQL>set SERVEROUTPUT on
SQL>drop table test;

Table dropped.

SQL>drop type body inaddr;

Type body dropped.

SQL>drop type inaddr;

Type dropped.

SQL>create type inaddr as object
  2  ( /* TODO enter attribute and method declarations here */
  3  A number(5),
  4  B number(5),
  5  C number(5),
  6  D number(5),
  7  E number(5),
  8  F number(5),
  9  G number(5),
 10  H NUMBER(5),
 11  MAP MEMBER FUNCTION display RETURN VARCHAR2,
 12  MEMBER FUNCTION toString( SELF IN INADDR , CONTRACT BOOLEAN DEFAULT TRUE) RETURN VARCHAR2,
 13  CONSTRUCTOR FUNCTION INADDR(SELF IN OUT NOCOPY INADDR, INADDRASSTRING VARCHAR2)  RETURN SELF AS RESULT
 14  
 15  ) NOT FINAL;
 16  /

SP2-0816: Type created with compilation warnings

SQL>
SQL>
SQL>CREATE TYPE BODY INADDR AS
  2  
  3  MAP MEMBER FUNCTION display RETURN VARCHAR2
  4  IS BEGIN
  5  return tostring(FALSE);
  6  END;
  7  
  8  
  9  MEMBER FUNCTION TOSTRING( SELF IN  INADDR , CONTRACT BOOLEAN DEFAULT TRUE) RETURN VARCHAR2 IS
 10  IP4 VARCHAR2(6) := 'FM990';
 11  ip6 varchar2(6) := 'FM0XXX';
 12    BEGIN
 13  IF CONTRACT THEN
 14    ip6 := 'FMXXXX';
 15  end if;
 16  
 17  IF CONTRACT AND A =0 AND B=0 AND C = 0 AND D=0 AND E =0 AND F = 65535 THEN --ipv4
 18      RETURN  '::FFFF:'||TO_CHAR(TRUNC(G/256),'FM990.')||TO_CHAR(MOD(G,256),'FM990.')||TO_CHAR(TRUNC(H/256),'FM990.')||TO_CHAR(MOD(H,256),'FM990');
 19  ELSE
 20      RETURN
 21  TO_CHAR(A,ip6)||':'||
 22  TO_CHAR(B,IP6)||':'||
 23  TO_CHAR(C,ip6)||':'||
 24  TO_CHAR(D,ip6)||':'||
 25  TO_CHAR(E,ip6)||':'||
 26  TO_CHAR(F,ip6)||':'||
 27  TO_CHAR(G,ip6)||':'||
 28  TO_CHAR(H,ip6);
 29  end if;
 30    end;
 31  
 32      CONSTRUCTOR FUNCTION inaddr(SELF IN OUT NOCOPY inaddr, inaddrasstring VARCHAR2)
 33                                 RETURN SELF AS RESULT IS
 34      begin
 35          if instr(inaddrasstring,'.') > 0 then
 36            --ip4
 37  null;
 38              a := 0;
 39              B := 0;
 40              C := 0;
 41              D := 0;
 42              E := 0;
 43              F := TO_NUMBER('FFFF', 'XXXX');
 44              G := TO_NUMBER(TO_CHAR(TO_NUMBER(REGEXP_SUBSTR(INADDRASSTRING,'([0-9]{1,3}).',1,1,'i',1),'999'),'FM0X')
 45  ||TO_CHAR(TO_NUMBER(REGEXP_SUBSTR(INADDRASSTRING,'([0-9]{1,3}).',1,2,'i',1),'999'),'FM0X')
 46  ,'XXXX');
 47              h := TO_NUMBER(TO_CHAR(TO_NUMBER(REGEXP_SUBSTR(INADDRASSTRING,'([0-9]{1,3}).',1,3,'i',1),'999'),'FM0X')
 48  ||TO_CHAR(TO_NUMBER(REGEXP_SUBSTR(INADDRASSTRING,'([0-9]{1,3})',1,4,'i',1),'999'),'FM0X')
 49  ,'XXXX');
 50  
 51          ELSIF instr(inaddrasstring,':') > 0 then
 52              --ip6
 53              a := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,1,'i',1),'XXXX');
 54              b := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,2,'i',1),'XXXX');
 55              c := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,3,'i',1),'XXXX');
 56              d := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,4,'i',1),'XXXX');
 57              E := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,5,'i',1),'XXXX');
 58              f := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,6,'i',1),'XXXX');
 59              g := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,7,'i',1),'XXXX');
 60              H := TO_NUMBER(REGEXP_SUBSTR(inaddrasstring,'([0-9a-fA-F]{1,4})',1,8,'i',1),'XXXX');
 61          end if;
 62  
 63          RETURN;
 64      END;
 65  end;
 66  /

Type body created.

SQL>
SQL>create table test
  2  (id integer primary key,
  3  address inaddr);

Table created.

SQL>
SQL>select * from test;

no rows selected

SQL>
SQL>
SQL>insert into test values (1, INADDR('fe80:0000:0000:0000:0202:b3ff:fe1e:8329') );

1 row created.

SQL>INSERT INTO TEST VALUES (2, INADDR('192.0.2.128') );

1 row created.

SQL>insert into test values (3, INADDR('20.0.20.1') );

1 row created.

SQL>insert into test values (4, INADDR('fe80:0001:0002:0003:0202:b3ff:fe1e:8329') );

1 row created.

SQL>insert into test values (5, INADDR('fe80:0003:0002:0003:0202:b3ff:fe1e:8329') );

1 row created.

SQL>INSERT INTO TEST VALUES (6, INADDR('fe80:0003:0001:0003:0202:b3ff:fe1e:8329') );

1 row created.

SQL>INSERT INTO TEST VALUES (7, INADDR('fe80:0003:0001:0003:0202:b3ff:fe1e:8328') );

1 row created.

SQL>INSERT INTO TEST VALUES (8, INADDR('dead:beef:f00d:cafe:dea1:aced:b00b:1234') );

1 row created.

SQL>
SQL>COLUMN INET_ADDRESS_SHORT FORMAT A40
SQL>column inet_address_full format a40
SQL>
SQL>select t.address.toString() inet_address_short, t.address.display( ) inet_address_full
  2  from test T
  3  order by t.address ;

INET_ADDRESS_SHORT                       INET_ADDRESS_FULL
---------------------------------------- ----------------------------------------
::FFFF:20.0.20.1                         0000:0000:0000:0000:0000:FFFF:1400:1401
::FFFF:192.0.2.128                       0000:0000:0000:0000:0000:FFFF:C000:0280
DEAD:BEEF:F00D:CAFE:DEA1:ACED:B00B:1234  DEAD:BEEF:F00D:CAFE:DEA1:ACED:B00B:1234
FE80:0:0:0:202:B3FF:FE1E:8329            FE80:0000:0000:0000:0202:B3FF:FE1E:8329
FE80:1:2:3:202:B3FF:FE1E:8329            FE80:0001:0002:0003:0202:B3FF:FE1E:8329
FE80:3:1:3:202:B3FF:FE1E:8328            FE80:0003:0001:0003:0202:B3FF:FE1E:8328
FE80:3:1:3:202:B3FF:FE1E:8329            FE80:0003:0001:0003:0202:B3FF:FE1E:8329
FE80:3:2:3:202:B3FF:FE1E:8329            FE80:0003:0002:0003:0202:B3FF:FE1E:8329

8 rows selected.

SQL>spool off

i just put this together in the last hour (and taught myself objects at the same time) so im sure it can be improved upon. if i make updates i'll repost them here

ShoeLace
  • 3,476
  • 2
  • 30
  • 44
  • According [RFC 5952](https://tools.ietf.org/html/rfc5952#section-4.2.1) address `FE80:0:0:0:202:B3FF:FE1E:8329` shall be represented as `fe80::202:b3ff:fe1e:8329` – Wernfried Domscheit May 18 '18 at 06:14
  • fair point.. an additional regexp_replace in the toString method would fix that.. `regexp_replace('FE80:0:0:0:202:B3FF:FE1E:8329',':[0:]*:','::')` – ShoeLace May 31 '18 at 05:26
  • 1
    it would howveer get `2001:db8:0:0:1:0:0:1` "wrong" as only set of 0's can be reduced (and it should be the longest set) – ShoeLace May 31 '18 at 05:31
  • I think you cannot short series of `0:0` to `::` with Regex only - at least not in Oracle because it does not support backreference, see this solution: https://stackoverflow.com/questions/7043983/ipv6-address-into-compressed-form-in-java/7044170#7044170 – Wernfried Domscheit May 31 '18 at 06:02
  • I wrote this generic package: https://stackoverflow.com/questions/43221720/oracle-pl-sql-how-to-calculate-range-ip-for-ipv6-cidr/43226773?noredirect=1#43226773 Function `Canonical_IPv6` make a proper output. – Wernfried Domscheit May 31 '18 at 06:09
1

I would prefer store IP addresses just in string, in format, returned by SYS_CONTEXT ('USERENV', 'IP_ADDRESS')

In refference of SYS_CONTEXT in 11g are described only default return value length as 256 bytes and does not described return value size for exacly 'IP_ADDRESS' context.

In document Oracle Database and IPv6 Statement of Direction described:

Oracle Database 11g Release 2 supports the standard IPv6 address notations specified by RFC2732. A 128bit IP address is generally represented as 8 groups of 4 hex digits, with the “:” symbol as the group separator. The leading zeros in each group are removed. For example, 1080:0:0:0:8:800:200C:417A would be a valid IPv6 address. One or more consecutive zero fields can optionally be compressed with the “::” separator. For example, 1080::8:800:200C:417A.

From this notes I prefer to make column IP_ADDRESS varchar2(39) to allow store 8 group by 4 digits and 7 separators between this groups.

Nashev
  • 490
  • 4
  • 10
  • Using VARCHAR2 can create some trouble. `1080::8:800:200C:417A`, `1080::8:800:32.12.65.122`, `1080::8:800:200c:417a`, `1080:0:0:0:8:800:200C:417A`, `1080:0000:0000:0000:0008:0800:200C:417A` are all legal representations of the same IPv6 IP-Address (see [RFC 5952](https://tools.ietf.org/html/rfc5952#section-1) and [RFC 4291](https://tools.ietf.org/html/rfc4291#section-2.2)) . Unless you enforce a certain format you have problem if you have to use it in `WHERE` clause. – Wernfried Domscheit May 16 '18 at 08:08
  • Depending on format VARCHAR2(39) is not sufficient, in worst case you may have `1080:1234:1234:1234:0008:5800:132.212.165.122` which is 45 characters long. – Wernfried Domscheit May 16 '18 at 08:10
  • For addresses from an Oracle, it's enough if they consistently follow their promise from the quote above. For addresses from other sources You are right, thanks. – Nashev May 16 '18 at 08:47
0

The possibilities are:

  • Store as string, i.e. VARCHAR2 (example 1080::8:800:200c:417a)
  • Store as numeric value
    • NUMBER data type
    • INTEGER data type
  • Store as RAW value
    • One RAW value, i.e. RAW(4) or RAW(16) for IPv4 or IPv6 respectively
    • 4 x RAW(1) or 8 x RAW(2) for IPv4 or IPv6 respectively

I would recommend to use RAW values because

  • If you use strings then you have to consider different formats of IPv6.

    1080::8:800:200C:417A
    1080::8:800:200c:417a
    1080::8:800:32.12.65.122
    1080:0:0:0:8:800:200C:417A
    1080:0:0:0:0008:0800:200C:417A
    1080:0000:0000:0000:0008:0800:200C:417A
    

    are all legal representations of the same IPv6 IP-Address. Your application needs to enforce a common format for proper usage, e.g. use in WHERE condition.

  • NUMBER/INTEGER values are senseless without conversion to human-readable format. You cannot use INTEGER data type in PL/SQL

    i INTEGER := 2**128-1; -- i.e. ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
    
    -> ORA-06502: PL/SQL: numeric or value error: number precision too large. 
    
  • In case you have to work with subnetting you cannot use function BITAND - it also supports numbers only up to 2^127

  • You can use UTL_RAW functions UTL_RAW.BIT_AND, UTL_RAW.BIT_COMPLEMENT, UTL_RAW.BIT_OR for subnet operations.

  • In case you have to deal with really big amount of data (I am talking about billions of rows) it might be beneficial to split the IP-Address into several RAW values, i.e. 4 x RAW(1) or 8 x RAW(2). Such columns would be predestinated for Bitmap-Indexes and you would save a lot of disc space and gain performance.

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110
  • _[RFC 5952, A Recommendation for IPv6 Address Text Representation](https://tools.ietf.org/html/rfc5952)_ explains that you should accept all valid IPv6 text representation, but you should only present the single IPv6 text representation outlined in the RFC. – Ron Maupin May 31 '18 at 07:41
0

The Oracle documentation does state INTEGER is an alias to NUMBER(38), but that is probably a typo, because the paragraph above it states:

NUMBER(p,s) where: p is the precision... Oracle guarantees the portability of numbers with precision of up to 20 base-100 digits, which is equivalent to 39 or 40 decimal digits depending on the position of the decimal point.

So NUMBER can store 39 to 40 digits, and INTEGER is likely an alias to NUMBER(max precision) instead of NUMBER(38). There is why the example provided works (and it works if you change INTEGER to NUMBER).

Peter O.
  • 32,158
  • 14
  • 82
  • 96