Questions tagged [snowflake-schema]

**Snowflake schema** is special case of the database Star Schema, where one or many dimension tables are normalized. This is often done for improving the performance in some cases of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate to the facts table represent the smallest granularity of data-analysis. The other tables in the same dimension represent aggregators of dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance takes the highest significance in a data warehouse and snowflaking can hamper the performance. It very useful for cases, when some attributes like demographic attributes are browsed more then the whole customer table itself. Another case is for precalculated summaries derived from a fact table. This helps not aggregating over and over again since the operation is expensive.

Advantages

Savings in the storage space. Normalized structures are having no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Difficult browsing through the contents for users. Additional joins - worse query performance in some cases.

Example

Having a fact table called "Sales" and dimension table called "Product". Assume there are 500000 product dimension rows. These products can fall under 10 brands. For running query for products, not indexed on brands, the query will have to search through 500000 rows to find all brands.

On the other hand, if the "Product" table partially normalized, separated on another table "Brand" the initial search query would need to go through just 10 rows on.

External links

893 questions
2
votes
1 answer

Passing Variable to Snowflake Procedure

I am writing stored procedure in snowflake where i have several lines of select statements with multiple joins. So i need to use variable for each database schema. I tried experimenting with below code as an experiment but could not make it working.…
2
votes
1 answer

Can a Star Schema Contain Outriggers?

My understanding from Star Schema modelling is that dimensions must directly link to a fact table. If this is true, then a star schema can't contain Outrigger dimension table. So, is this statement true that if a schema contains Outrigger dimension,…
2
votes
1 answer

Snowflake sharing to Data Consumer

In snowflake share Data Providers can share data with only the Data Consumer and does Snowflake charges additional fees to Data providers for each share they create? Also is it possible that after sharing the Data Consumer can extend that data…
Nicky
  • 125
  • 1
  • 9
2
votes
1 answer

Snowflake automatically rounding number during COPY INTO transformation

I am using an AWS S3 stage to load .csv data into my Snowflake database. The .csv columns are as follows: My COPY INTO command is this: copy into MY_TABLE(tot_completions, tot_hov, parent_id) from (select t.$1, to_decimal(REPLACE(t.$2, ',')), 1…
2
votes
1 answer

Show grants on all schemas in a Snowflake database

I am trying to list all grants of all schemas in a specific database. I found that only SHOW GRANTS ON SCHEMA "TEST_DB"."TEST_SCHEMA" do the trick for one schema, but unfortunately I cannot be done for all schemas. I believe I would need to do a…
Will
  • 2,057
  • 1
  • 22
  • 34
2
votes
1 answer

Snowflake: JSON data with dynamic columns

Does Snowflake support realtime JSON data ingestion into tables using Kafka Connector with varying columns without predefining the schema? Also, is there any reference to the documentation/examples related to it. Thanks in advance! Regards, Kishore
kishore k
  • 147
  • 1
  • 8
2
votes
1 answer

From operational data store to warehouse

I have 3 ODS, so in order to create DWH, do I need a DWH for each ODSs, or the DWH select the tables from any of the ODSs?
2
votes
1 answer

How do I copy a CSV file into Snowflake DB using Java

I am trying to write a tool to load CSV files into multiple database. While trying to search online how to use COPY command with Snowflake, I couldn't find information how to do it in Java. This is how I do with PostgreSQL public void…
2
votes
2 answers

Getting schemas from all the available database in snowflake

I am trying to get all the schema from a database which I passed in the connection url in snowflake jdbc driver. Observation : I am getting all the schema from all the databases even though I pass wrong database. wrong database/schema/warehouse does…
2
votes
1 answer

How to change default date_output_format in Snowflake?

I went through the Snowflake docs and found that the default date_output_format in snowflake is "YYYY-MM-DD". We can change it to per session as (alter session set date_output_format = 'DD-MON-YYYY'). But I want to change the default value…
2
votes
2 answers

Overlapping effective dates aggregation

I am trying to aggregate overlapping effective dates. Any gaps between dates should be considered as separate rows. I am using min and max and I am getting below output but would like to see expected output. My query WITH test_data AS ( SELECT…
2
votes
1 answer

Snowflake - Query to check if given schema is empty i.e. no object in schema

I am trying to create single query which can tell if given snowflake schema is empty so that it can be dropped. I have few SQLs created which queries the INFORMATION_SCHEMA views - tables, views, sequences to check if any object created. In this way…
2
votes
2 answers

Snowflake merge using same table

I have a table whose Primary key is 1 and the other column is roll number. I want to check if the id value is present in table, if yes update its roll number. If not insert the id and roll number in table. I am trying to do it using…
Avenger
  • 793
  • 11
  • 31
2
votes
0 answers

Snowflake 'Transient' table shows retention period of 90 days

I created a transient schema "Employee_Schema" in which I created a normal table "Employee" create table Employee(Id Int, Name String, Salary Double) data_retention_time_in_days=90; Now because the schema was transient, the table also became…
2
votes
1 answer

Insufficient privileges to drop schema

Trying to grant a role access to drop a schema from a database in Snowflake. What grants do I need to apply? Currently if we try to execute this statement for the user: DROP SCHEMA IF EXISTS 'schemaname_123' We get this error: SQL access control…