0

I want to extract only first two octets of IP address in hive. Can anybody please tell me equivalent Regex supported in Hive?

For example,extract '192.96.0.0' from ip_address '192.96.45.33'.

rupali
  • 165
  • 1
  • 3
  • 11
  • What does your data field look like? Not sure why you would need to use RLIKE, when you can just use like '192.96%'. – Michael Dec 30 '14 at 22:15

1 Answers1

1
192\\.96\\.\\d{3}\\.\\d{3}

Guess this should work as Hive uses Java format.

or

192\\.96\\.(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.)(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
vks
  • 67,027
  • 10
  • 91
  • 124
  • Can anybody suggest how to extract first two octets from ip address in hive. For IP '192.72.45.56' , i want to extract 192.72.0.0. How to achieve this? – rupali Jan 20 '15 at 12:51
  • @rupali you can extract `(?<!\\.)\\d{3}\\.\\d{3}\\.` this and append `0.0` later or you can use `replace` and replace `\\b\\d{3}\\.\\d{3}\\b(?!\\.)` with `0.0`. – vks Jan 20 '15 at 12:57
  • select ip_address, regexp_replace(ip_address,'\\b\\d{3}\\.\\d{3}\\b(?!\\.)','0.0') as first_two_octet from report limit 10; I tried this query it is returning ip_address instead of first two octets. Sorry i dont understand regex much – rupali Jan 20 '15 at 13:08