0

I'm trying to process some IP data that we store on our clickhouse database. We have some users who have IPv6 addresses logged, and some people have multiple IP addresses logged, so what I am trying to achieve is to get only the IPv4 addresses, and if there are multiple IP addresses listed, then I choose the first one logged.

Here is the query that I made to filter them:

SELECT IF(ip LIKE '%,%', arrayElement(splitByChar(',', assumeNotNull(ip)), 1), ip) AS ip
   FROM usage_analytics.users
   WHERE ip NOT LIKE '%:%' 

the results are not consistent. Sometimes it works fine, and gets all IPv4 addresses. However sometimes it returns null rows always at around 70 rows into the results. This happens around 4/5 times when you run this query.

What's going on? Is this a clickhouse issue, or a logic issue, or something else I'm not considering?

ewcvis
  • 139
  • 2
  • 11
  • 1
    try to handle NULL values explicitly - *SELECT if(isNull(ips), NULL, IF(ips LIKE '%,%', splitByChar(',', assumeNotNull(ips))[1], ips)) AS ip FROM ..*. – vladimir Aug 30 '21 at 22:13
  • Hmm its still showing nulls, it worked the first time I ran the query, but subsequent times it didn't – ewcvis Aug 30 '21 at 22:39

0 Answers0