7

I'm trying to show a default value, "Others", when the query does not return any result for one of the selected columns. I'll show you the example.

This query returns an empty value for os(agent) SO (in the first row):

select country, os(agent) SO, count(*) from clicks_data
where country is not null and os(agent) is not null
group   by country, os(agent);

Output:

ZA           4
ZA  Android  4
ZA  Mac      8
ZA  Windows  5

Instead, I would like to get this result:

ZA  Others  4
ZA  Android 4
ZA  Mac     8
ZA  Windows 5

My next attempt was this query, but it's not really working, either:

select country, regexp_replace(os(agent),'','Others') SO, count(*) from clicks_data 
where country is not null and os(agent) is not null 
group by country, os(agent);

This is the result:

ZA  Others  4
ZA  OthersAOthersnOthersdOthersrOthersoOthersiOthersdOthers 4
ZA  OthersMOthersaOtherscOthers 8
ZA  OthersWOthersiOthersnOthersdOthersoOtherswOtherssOthers 5
William Price
  • 4,033
  • 1
  • 35
  • 54

5 Answers5

15

Use LENGTH() to check the length of the column value. It returns > 0, if there is some value else return 0 for empty or NULL value.

Also frame the column value in CASE WHEN ... END block

The final query may look like:

SELECT country, CASE WHEN LENGTH(os(agent)) > 0 THEN os(agent) ELSE 'Others' END AS SO, COUNT(*) 
FROM clicks_data 
WHERE country IS NOT NULL AND os(agent) IS NOT NULL 
GROUP BY country, os(agent);

Hope this help you!!!

Farooque
  • 3,616
  • 2
  • 29
  • 41
14

COALESCE will be the best suitable and optimum solution for your case

Syntax: COALESCE(VALUE,DEFAULT_VALUE): Function returns default value when values is null else VALUE;

Query

SELECT country, COALESCE(os(agent),'Others') AS SO, COUNT(*) 
FROM clicks_data 
WHERE country IS NOT NULL AND os(agent) IS NOT NULL 
GROUP BY country, os(agent);

Hope this would be the efficient solution for your problem.

harschware
  • 13,006
  • 17
  • 55
  • 87
Jignesh
  • 141
  • 1
  • 2
1

='' maybe the easiest way to go. e.g.

CASE WHEN col='' THEN xxx ELSE yyy END 
     AS col_new;
Sean.H
  • 640
  • 1
  • 6
  • 18
0

Another possible solution. Incase you would like to simply replace all the NULL values by an empty string when you export the data, you could do as such by feeding the sed command the output of your sql

$ hive -e 'set hive.cli.print.header=true; select * from db_name.table_name;' | sed 's/[\t]/,/g; s/^NULL,/,/g; s/,NULL,/,,/g; s/,NULL$/,/g;' > test.csv

Credit

Shankar ARUL
  • 12,642
  • 11
  • 68
  • 69
0

For anyone who has similar issues, I'd like to summarize it here.

Well, this is a relatively old question. The provided SQL filters out NULL, so all you need to handle is the empty string "". But this conflicts with the title, which specifically says both NULL and empty strings should be considered. So I'll stick with the title.

COALESCE and NVL works only for NULL, but it doesn’t work for the empty string "".

Both LENGTH and !="" (along with CASE WHEN) are feasible as they are compatible with NULL and "". Note, when one of the parameters of the = operation is NULL, it evaluates to NULL.

One more thing to note is we should make expressions in the GROUP BY clause conform to those of the SELECT clause. By this I mean, when you SELECT an expression like CASE WHEN..., you should GROUP BY the same expression CASE WHEN.

This leads to code repetition, which can be improved by position alias (possible since version 0.11.0). So the final query may be sth like this:

--Only needed for version 0.11 through 2.1.x. 
SET hive.groupby.orderby.position.alias = true;  

SELECT
    country, 
    CASE 
        WHEN os(agent)!="" THEN os(agent)  --This also implies that it's not NULL.
        ELSE 'Others' 
    END AS SO, 
    COUNT(*) 
FROM clicks_data 
WHERE country IS NOT NULL 
GROUP BY 
    1, 2
;
damientseng
  • 533
  • 2
  • 19