I'm trying to build a Kylin Cube that will get me the Average of a SmallInt nullable column Col1. I expect any null value to not be counted towards this average. When I work on my table in Hive this is exactly the result that I'm getting when applying the AVG function. However when I build the cube in Kylin on top of this table all the null values in my measurements columns are becoming zeros and then counted towards the calculation of the AVG. Is there a way to force Kylin to treat nulls as nulls?
Asked
Active
Viewed 382 times
1 Answers
1
You have encountered a defect of Apache Kylin. I created a JIRA according to your question. https://issues.apache.org/jira/browse/KYLIN-2049
As a workaround, you can add a new dimension ie VALID, let it be 1 when Col1 is not null and be 0 when Col1 is null. Then query "AVG(Col1) where VALID=1" shall give the expected result.

Li Yang
- 284
- 1
- 6
-
Thanks Li. The workaround is a bit tricky because if I need to expose more than one average then each one is becoming a subquery, making the syntax awkward and the system performance worst but if it is a bug in Kylin then there's not a lot we can do at the moment. – Itay Sep 26 '16 at 08:25