
Hello, I am running Apache Kylin using the apachekylin/apache-kylin-standalone:3.0.0-alpha2 Docker image.

I started out by creating two Hive tables, one to record store sales and one consisting of store metadata:

CREATE TABLE IF NOT EXISTS STORESALES (
    id INT,
    food FLOAT,
    drugs FLOAT,
    cosmetic FLOAT,
    baby FLOAT,
    reportdate DATE);

CREATE TABLE IF NOT EXISTS STOREMETA (
    id INT,
    address STRING,
    brand STRING,
    owner STRING);

I then created a model in which I declared STORESALES as my fact table and STOREMETA as a lookup table, with a left join on STORESALES.ID = STOREMETA.ID. I then declared

  • STOREMETA.ID
  • STOREMETA.ADDRESS
  • STOREMETA.OWNER
  • STOREMETA.BRAND

as dimensions. I explicitly deleted STORESALES.ID. I also specified the measures

  • STORESALES.DRUGS
  • STORESALES.BABY
  • STORESALES.COSMETIC
  • STORESALES.FOOD

Here's what the model looks like

I also specified STORESALES.REPORTDATE as my partition.

So then I went on to set up my cube. Again I added STOREMETA[ID, BRAND, OWNER, ADDRESS] as dimensions, but for some reason STORESALES.ID shows up as a choice for a dimension as well. I added the measures MAX_FOOD, MAX_DRUGS, MAX_COSMETICS, and MAX_BABY. The issue is that once I get to Advanced Settings, the only option available for grouping is STORESALES.ID. If I manually enter anything else, it disappears from the list. I went back to edit the model and noticed that STORESALES.ID is now in the list of measures as well.

STORESALES.ID is back
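
For reference, here is a minimal sketch of the kind of query this cube is meant to answer: the MAX of each sales column, grouped by a STOREMETA dimension across the left join. It uses Python's built-in sqlite3 as a stand-in for Hive/Kylin, which is an assumption made purely so the example runs anywhere, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for Hive/Kylin here (assumption, illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STORESALES (id INT, food FLOAT, drugs FLOAT, cosmetic FLOAT,
                         baby FLOAT, reportdate DATE);
CREATE TABLE STOREMETA (id INT, address STRING, brand STRING, owner STRING);
""")

# Hypothetical sample rows, not real data.
conn.executemany("INSERT INTO STOREMETA VALUES (?, ?, ?, ?)", [
    (1, "1 Main St", "BrandA", "Alice"),
    (2, "2 Oak Ave", "BrandA", "Bob"),
    (3, "3 Elm Rd", "BrandB", "Carol"),
])
conn.executemany("INSERT INTO STORESALES VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 10.0, 5.0, 2.0, 1.0, "2019-11-01"),
    (2, 30.0, 1.0, 4.0, 3.0, "2019-11-01"),
    (3, 20.0, 8.0, 6.0, 2.0, "2019-11-01"),
    (1, 15.0, 7.0, 1.0, 9.0, "2019-11-02"),
])

# Fact table LEFT JOIN lookup table, aggregated by a lookup-table dimension.
QUERY = """
SELECT m.brand, MAX(s.food), MAX(s.drugs), MAX(s.cosmetic), MAX(s.baby)
FROM STORESALES s
LEFT JOIN STOREMETA m ON s.id = m.id
GROUP BY m.brand
ORDER BY m.brand
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The join condition and the MAX measures mirror the model and cube described above; only the engine and the data are placeholders.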

I am not sure if this is what is breaking things for me, or if my general lack of experience here is hindering my progress. Please assist.

pu239ppy

1 Answer

It turns out the issue is that, by default, all dimensions in a cube are considered to be derived. That is, "derived" is the option selected when I select a dimension. It looks like a grouping cannot be created with derived dimensions.
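
A rough sketch of why that would be, assuming derived dimensions are not stored in the cube itself but are resolved from the lookup table through the join key (here STORESALES.ID). Everything below is a hypothetical illustration in plain Python, not a Kylin API, and the values are made up.

```python
from collections import defaultdict

# Hypothetical lookup-table snapshot: join key STORESALES.ID -> STOREMETA columns.
SNAPSHOT = {
    1: {"BRAND": "BrandA", "OWNER": "Alice"},
    2: {"BRAND": "BrandA", "OWNER": "Bob"},
    3: {"BRAND": "BrandB", "OWNER": "Carol"},
}

# What the cube stores in this sketch: MAX(food) keyed only by the join key.
# The derived columns (BRAND, OWNER) are absent, so the join key is the only
# column that can take part in a grouping.
CUBOID_BY_ID = {1: 15.0, 2: 30.0, 3: 20.0}


def max_food_by(derived_column):
    """Resolve a derived column via the snapshot and re-aggregate at query time."""
    result = defaultdict(lambda: float("-inf"))
    for store_id, max_food in CUBOID_BY_ID.items():
        key = SNAPSHOT[store_id][derived_column]
        result[key] = max(result[key], max_food)
    return dict(result)


print(max_food_by("BRAND"))
```

Under that assumption, switching the STOREMETA columns from "derived" to "normal" should make them part of the cube, which is what makes them available for grouping.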

pu239ppy
  • Hopefully this helps someone. I am not an experienced data warehousing person, and I did not find anything in the manual that explained it. – pu239ppy Nov 14 '19 at 22:41
  • Hi, there is a statement on "normal" and "derived" in the tutorial (https://kylin.apache.org/docs/tutorial/create_cube.html), but I agree with you that it is not clear enough, especially for new users. Do you think it would help if we added a fuller explanation of what it really means? – ShaoFeng Shi Dec 01 '19 at 02:10
  • @ShaoFengShi I think it would be very helpful to link from the step-by-step tutorial to the pertinent sections of the "Technical Concepts" doc. However, the most complete description of a derived type seems to be in http://kylin.apache.org/docs/howto/howto_optimize_cubes.html, so maybe it is just a matter of restructuring the docs. – pu239ppy Dec 10 '19 at 13:53