Questions tagged [snowflake-schema]

**Snowflake schema** is special case of the database Star Schema, where one or many dimension tables are normalized. This is often done for improving the performance in some cases of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate to the facts table represent the smallest granularity of data-analysis. The other tables in the same dimension represent aggregators of dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance takes the highest significance in a data warehouse and snowflaking can hamper the performance. It very useful for cases, when some attributes like demographic attributes are browsed more then the whole customer table itself. Another case is for precalculated summaries derived from a fact table. This helps not aggregating over and over again since the operation is expensive.

Advantages

Savings in the storage space. Normalized structures are having no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Difficult browsing through the contents for users. Additional joins - worse query performance in some cases.

Example

Having a fact table called "Sales" and dimension table called "Product". Assume there are 500000 product dimension rows. These products can fall under 10 brands. For running query for products, not indexed on brands, the query will have to search through 500000 rows to find all brands.

On the other hand, if the "Product" table partially normalized, separated on another table "Brand" the initial search query would need to go through just 10 rows on.

External links

893 questions
2
votes
1 answer

How can I have a custom order in Snowflake?

In Snowflake, how do I define a custom sorting order. To try to explain what I want consider this table: ID Language Text 0 ENU a 0 JPN b 0 DAN c 1 ENU d 1 JPN e 1 DAN f 2 …
2
votes
1 answer

Snowflake: CTE with clause Insert statement error

I am using CTE with clause in Snowflake and trying to insert records into the target table ent.P_ACCOUNTS_EXTRACT_SHR. But a compilation error throwing while executing the query. Error: syntax error line 27 at position 4 unexpected 'INSERT'. I am…
2
votes
3 answers

How to get definition of all the VIEWS in a Snowflake database?

The ask is to get the definition of all the views from Production environment and refresh the lower environment. I assume, GET_DDL in a loop will suffice the need but not sure how to implement it. Please advise.
2
votes
4 answers

Snowflake check if table exist

Is there any in-built function or procedure in Snowflake which can return boolean value to indicate whether a table exists or not? Like cal IF_table_exist('table_name') or select iftableexist('table_name'); If not then I am planning to write a…
danD
  • 666
  • 1
  • 7
  • 29
2
votes
1 answer

The merge into command on Snowflake loads only null values when loading data from S3

I'm having issues when running the command MERGE INTO on Snowflake. The data is located in a bucket on S3. The files format are .snappy.parquet. The command runs well, it identifies the files in S3, but it loads only NULL values to the table. The…
ultraInstinct
  • 4,063
  • 10
  • 36
  • 53
2
votes
3 answers

Create a stage to be used across all schemas

I have created a stage in a schema called X, when I run the command "show stages" I am able to see that stage. Now when I go to the other schema "Y" and run the command "show stages", it is not showing that schema. I want to use the stage in "X".…
2
votes
1 answer

Regular expression in COPY INTO command in Snowflake

I have few CSV files in azure blob storage, and we are using COPY INTO command to load the files in snowflake table. The problem is: The file system is: container >> folder (Ex: account) >> Number of files like 2011-09.csv 2011-10.csv likewise and…
2
votes
2 answers

snowflake sql - lateral flatten function compare with sql server - Split_String()

I have task to compare the counts and see if it's > 0 snowflake sql below select 'abc' from tableA where (SELECT count(T1.value) as val from tableA, lateral flatten(input=>split(abc, ',')) T1 INNER JOIN (select count(T2.value) as val from…
2
votes
1 answer

Merge data logic for CDC snowflake

I am working on a logic for merging duplicate rows from one table(1000+ rows) into another empty table based on the product_id and extraction_date. The extraction_date is used to determine which data is the latest. In cases where extraction_date is…
2
votes
2 answers

How to setup email alerts in snowflake

Is there any way we can send email alerts from snowflake if warehouse reaches its limit or if any query is blocked. saw many videos where we can setup email framework using python and .NET but anyone tried this in snowflake ?
2
votes
1 answer

How to query columns following the same pattern in a Snowfake table or view?

I'd like to select some columns which have names following the same pattern. For instance: SELECT PRICE% FROM DB.SCHEME.VIEW From that I'm trying to select all fields from a view or table which names start with the string "PRICE", like "PRICE_US",…
rdrgtec
  • 592
  • 10
  • 26
2
votes
1 answer

Need guidance in forming a query in snowflake

SELECT id, login_id, count, case when count = 0 then 'Cat_A' WHEN count between 1 and 10 then 'Cat_B' WHEN count > 10 then 'Cat_C' WHEN count IS NULL THEN 'Cat D' END as Category FROM ( select…
Jenifer
  • 387
  • 2
  • 11
2
votes
3 answers

How to compute the mean with null treated as zero

In Snowflake how can I return the mean of a column, where null values are treated as zero. So, if we have 3 values: 1, 2, NULL, I would like to return 1, not 1.5 as returned by AVG function.
LeelaSella
  • 757
  • 3
  • 13
  • 24
2
votes
2 answers

invalid identifier while parsing json

I am compiling a dbt base model. Currently I get this error below. Line 6 looks the same as other lines above. Might a small syntax error that I could not spot. 15:40:22 Database Error in model base_datacenter_handling_unit…
x89
  • 2,798
  • 5
  • 46
  • 110
2
votes
1 answer

Calculating the balance from the previous amount based on DATE?

I have a table that has amounts transactions for each user. I am trying to query a table that also shows a 'balance' column. So in the picture above, I have sorted by DATE, now, I would like another column that calculates the balance. So in balance…
lalaland
  • 331
  • 3
  • 16