Questions tagged [snowflake-schema]

**Snowflake schema** is special case of the database Star Schema, where one or many dimension tables are normalized. This is often done for improving the performance in some cases of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate to the facts table represent the smallest granularity of data-analysis. The other tables in the same dimension represent aggregators of dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance takes the highest significance in a data warehouse and snowflaking can hamper the performance. It very useful for cases, when some attributes like demographic attributes are browsed more then the whole customer table itself. Another case is for precalculated summaries derived from a fact table. This helps not aggregating over and over again since the operation is expensive.

Advantages

Savings in the storage space. Normalized structures are having no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Difficult browsing through the contents for users. Additional joins - worse query performance in some cases.

Example

Having a fact table called "Sales" and dimension table called "Product". Assume there are 500000 product dimension rows. These products can fall under 10 brands. For running query for products, not indexed on brands, the query will have to search through 500000 rows to find all brands.

On the other hand, if the "Product" table partially normalized, separated on another table "Brand" the initial search query would need to go through just 10 rows on.

External links

893 questions
0
votes
1 answer

Snowflake Timestamp column not loaded using Copy Command

I am using Copy command to load a file in table. It has a timestamp column. In my File format I have defined Timestamp as other and gave value as MM/DD/YYYY HH:MI:SS AM to match with data. When I execute it loads all records which have timestamp…
0
votes
1 answer

How to retrieve all child nodes from JSON file

I have below JSON file, which is in the external stage, I'm trying to write a copy query into the table with the below query. But it's fetching a single record from the node "values" whereas I need to insert all child elements for the values node. I…
0
votes
1 answer

Correlated Sub Queries in Snowflake

Can you please let me know how to convert the below sql server code to Snowflake, since Snowflake doesn't support correlated sub queries. SELECT DISTINCT CURRENT_DATE as creationdate, LastName, (SELECT Date AS Exp1 FROM PSU AS P WHERE (HQL =…
Akhira
  • 203
  • 2
  • 5
  • 16
0
votes
1 answer

Can you extract a dimension table from a fact table?

Here's the situation, in the source database we have more than 600K active rows for a dimension but in reality the business only uses 100 of them. Unfortunately the list of values that they might use is not known and we can't manually filter on…
Ariox66
  • 620
  • 2
  • 9
  • 29
0
votes
1 answer

Substring in Snowflake, string is the parameter value

SET Date = '050120'; select SUBSTR("+Parameters!Date.Value+", 5, 2) I am passing the Date parameter value. Why is it not picking the date value? It is showing the output as 'ra' which is shown in screen shot. I am expecting it to show 20.
Akhira
  • 203
  • 2
  • 5
  • 16
0
votes
1 answer

how do i connect local hive to snowflake?

I have gone through the steps in snowflake documentation to configure a connector between hive and snowflake. https://docs.snowflake.com/en/user-guide/tables-external-hive.html#step-1-install-the-connector But i cant see the hive tables or even a…
0
votes
2 answers

SNOWFLAKE querying the array of elements

I am using SNOW_FLAKE and trying to query the data stored in the form of array of elements under column name nested_colmn as example: nested_colmn [ { "firstKey": "val1", "secondKey": 2555, "thirdKey": false, "fourthkey":…
0
votes
2 answers

Snowflake creation of Notification integration on azure storage queue error

I was trying to create notification integration for azure storage , created storage queue. snowflake Subnet included and snowflake service principle has access to storage , Everything working fine with storage integration. now i am trying to setup…
Bala Murali
  • 105
  • 2
  • 13
0
votes
3 answers

How does not equal in snowflake exactly work?

I have a snowflake query that has a field called status. The field either contains null or 'deleted' when I do the following to get only deleted it works: select * from tbl_1 where status = 'deleted' when I try excluding all deleted it excludes…
codeBarer
  • 2,238
  • 7
  • 44
  • 75
0
votes
1 answer

I need to find the number of users that were invoiced for an amount greater than 0 in the previous month and were not invoiced in the current month

I need to find the number of users that were invoiced for an amount greater than 0 in the previous month and were not invoiced in the current month. This calcualtion is to be done for 12 months in a single query. Output should be as below. Month …
Lona Lobo
  • 11
  • 2
0
votes
1 answer

Number column to Var char column without returning decimal value

The following expression is returning decimal value TO_VARCHAR(COLUMN_NAME + 10) let say the value of column is 80 , it is ruturning 90.00000 and the COLUMN_NAME in source is number what to add or change to get the return value as varchar as 90…
0
votes
2 answers

Snowflake query slowness

Did anyone observed slowness when querying data from snowflake(select statement) with OS Windows 2016 + pycharm environment. Getting result quickly with window 10 OS with same environment. When checked network performance using wireshark, round…
0
votes
1 answer

SHOW COLUMNS syntax Issue

I am trying to find Show Columns like and referring to this : sql show columns But something wrong with Syntax: SHOW COLUMNS LIKE '%ACCOUNT_ID%' IN { ACCOUNT | [ DATABASE ] "CLIENT_DB" | [ SCHEMA ] "ACCOUNTS" }
0
votes
2 answers

Date parsing issue in the where clause in Snowflake query

I am running a Snowflake query that is intended to return records that among other things, have a date earlier than the current date. select * from table_a where id < 100 and date < ??? The problem is, the data this query is running on has issues…
0
votes
0 answers

When to Use a bridge table in a star scheme

I am struggling to know when to use bridge tables while designing star schemas. Is it safe to assume that if the attribute I am looking for is in a many to many relationship entities, then I will have to use a bridge table? For example, if we want…