Questions tagged [snowflake-schema]

**Snowflake schema** is special case of the database Star Schema, where one or many dimension tables are normalized. This is often done for improving the performance in some cases of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate to the facts table represent the smallest granularity of data-analysis. The other tables in the same dimension represent aggregators of dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance takes the highest significance in a data warehouse and snowflaking can hamper the performance. It very useful for cases, when some attributes like demographic attributes are browsed more then the whole customer table itself. Another case is for precalculated summaries derived from a fact table. This helps not aggregating over and over again since the operation is expensive.

Advantages

Savings in the storage space. Normalized structures are having no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Difficult browsing through the contents for users. Additional joins - worse query performance in some cases.

Example

Having a fact table called "Sales" and dimension table called "Product". Assume there are 500000 product dimension rows. These products can fall under 10 brands. For running query for products, not indexed on brands, the query will have to search through 500000 rows to find all brands.

On the other hand, if the "Product" table partially normalized, separated on another table "Brand" the initial search query would need to go through just 10 rows on.

External links

893 questions
2
votes
1 answer

Snowflake role redesigning

I need to structure the roles in snowflake and need advise for it. Currently, there are two databases: basedb and computedb. basedb is the database where we load the data and the data is not transformed. more like a landing zone. computedb is where…
2
votes
3 answers

How to fill in missing dates

I have a table as follows: create table users as(sent_at date, user_email varchar(128), score int, quota int); insert into users select '2022-01-18','user@email.com',50,200 union select '2022-01-15','user@email.com',34,400; I'd like to get result…
user1552698
  • 577
  • 3
  • 9
  • 24
2
votes
1 answer

Loading data to temp table from staging table

I have loaded CSV file into staging area. Now I want to shift this file to temporary table. For that I created 3 different schemas, ie STG ( for staging area ), TMP ( for temporary area ) and TGT ( for target area or datawarehouse). This is staging…
2
votes
2 answers

CTAS to my local table using Snowflake Data Share

I am very new to Snowflake, and I have to answer a question whether I can load data into local database table using data share from different account. e.g: I have database db1 which has emp table locally and I have consumer access to share1 which…
2
votes
1 answer

Load JSON Data into Snow flake table

My Data is follows: [ { "InvestorID": "10014-49", "InvestorName": "Blackstone", "LastUpdated": "11/23/2021" }, { "InvestorID": "15713-74", "InvestorName": "Bay Grove Capital", …
Xi12
  • 939
  • 2
  • 14
  • 27
2
votes
1 answer

How to calculate difference in dates for column with smallest value

I have a table that contains 3 column dates: CREATED, CLOSED and EXPIRED. I need to calculate the difference in date between either CLOSED or EXPIRED and CREATED. I need to select the SMALLER value between CLOSED and EXPIRED and then calculate the…
lalaland
  • 331
  • 3
  • 16
2
votes
1 answer

Snowflake Error parsing JSON: unfinished string, pos

I am trying to query a varchar column called "JsonCode" from the "Weather" table in snowflake. The "JsonCode" column looks like this: { "Date": "2019-11-07T12:28:18", "CurrentTemp": "47°F", "WeatherIconStatus": "clear-day", "LowTemp":…
2
votes
3 answers

Query Review - Snowflake

I have a query in snowflake that works as expected but I feel like there's gotta be a better way of doing this, so I'm checking if anyone has a better and more efficient solution to this. I want to count how many users have SA4 and SA5. Then check…
2
votes
2 answers

Snowflake SQL Query taking too much time to run when trying to apply multiple joins

I am trying to work with a sql query on snowflake where I am trying to join multiple tables but my query is taking forever to run, I am not sure whether its my query or may be I have taken some wrong approach. I have these below tables in snowflake…
2
votes
1 answer

How to change datetime to INT in snowflake?

I have a table that contains a datetime column in datetime format, see picture for example. I need to convert this timestamp to YYYYMMDD in int, something like this: I'm trying this query, but it does not work. select Cast(Cast(Cast (DATETIME AS…
lalaland
  • 331
  • 3
  • 16
2
votes
2 answers

Can we trigger snowflake task manually or enable to run at a specific day and time

Can we trigger snowflake task manually or enable to run at a specific day and time? if the above is not feasible. Any other alternative ideas?
ADom
  • 151
  • 1
  • 9
2
votes
1 answer

Can we parametrize snowflake tasks?

I need to do one-time historical data load, followed by incremental load every 10 minutes. is there a way to parametrize snowflake task to 1st run the historical load and then change the parameter to execute incremental loads? if not, can you…
ADom
  • 151
  • 1
  • 9
2
votes
1 answer

Enable change tracking on all tables in snowflake cloud data warehouse

How to enable change tracking = true on all tables in a schema in snowflake data warehouse.
Sri
  • 75
  • 6
2
votes
0 answers

In snowflake, is there way re-share like a symbolic link?

I currently have data sources that are being consumed by a lot of data consumers. I am now given a new set of data sources to replace existing ones. But I want this new sources to be named like previous data sources but pointed to new data sources.…
davidm
  • 91
  • 1
  • 5
2
votes
2 answers

What is the TTL for a Result set when Asynchronous Query is executed in Snowflake?

Referring to this Documentation: https://docs.snowflake.com/en/user-guide/jdbc-using.html#performing-an-asynchronous-query We are Executing Long-Running SELECT Queries in Snowflake Asynchronously and Poll the execution with query ID every 5 sec to…