Questions tagged [snowflake-schema]

**Snowflake schema** is special case of the database Star Schema, where one or many dimension tables are normalized. This is often done for improving the performance in some cases of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate to the facts table represent the smallest granularity of data-analysis. The other tables in the same dimension represent aggregators of dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance takes the highest significance in a data warehouse and snowflaking can hamper the performance. It very useful for cases, when some attributes like demographic attributes are browsed more then the whole customer table itself. Another case is for precalculated summaries derived from a fact table. This helps not aggregating over and over again since the operation is expensive.

Advantages

Savings in the storage space. Normalized structures are having no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Difficult browsing through the contents for users. Additional joins - worse query performance in some cases.

Example

Having a fact table called "Sales" and dimension table called "Product". Assume there are 500000 product dimension rows. These products can fall under 10 brands. For running query for products, not indexed on brands, the query will have to search through 500000 rows to find all brands.

On the other hand, if the "Product" table partially normalized, separated on another table "Brand" the initial search query would need to go through just 10 rows on.

External links

893 questions
-2
votes
2 answers

In Snowflake which of the following objects can be cloned?

In Snowflake Which of the following objects can be cloned? A. Tables B. Named File Formats C. Schemas D. Shares E. Databases F. Users Tables, Schemas and Databases I know but can we also clone Users or Named File Format?
Nicky
  • 125
  • 1
  • 9
-2
votes
1 answer

Snowflake timestamp conversion in java 8

I am getting date from snowflake in below format 2021-07-13 00:00:00.000 -0600 What string format should i use to convert it to java 8 LocalDateTime?
-2
votes
2 answers

Resize a warehouse from x-small to medium in snowflake

Resize a warehouse from x-small to medium in snowflake is for what purpose.Kindly clarify. To accommodate more queries or to accommodate more user or to optimized workload or to optimized complex workload.
Prats
  • 11
  • 2
-2
votes
1 answer

How to use the percent key word in the snowflake query

SELECT NBR, Customers, status FROM ( SELECT NBR, Customers, Code AS status FROM CCC AS CS INNER JOIN AAA AS AC ON CCC.B2= ACT.B1 AND CSS.B2 = ACT.B1 ) AS rst WHERE status IN ('A', 'T') ORDER BY NBR LIMIT 100 PERCENT
Akhira
  • 203
  • 2
  • 5
  • 16
-3
votes
2 answers

Using SAP BODS to load data to Snowflake it is taking a lot of time

Using SAP BODS to load data to Snowflake it is taking a lot of time. BODS job is simple source (HANA table) query then Target as Snowflake table. Just to load 6 records with 10 columns it is taking more than half an hour. I analyzed where most of…
-3
votes
1 answer

Snowflake Virtual Warehouse usage consideration

When should you consider disabling auto-suspend for a Virtual Warehouse? A. When users will be using compute at different times throughout a 24/7 period B. When managing a steady workload C. When the compute must be available with no delay or lag…
Nicky
  • 125
  • 1
  • 9
-3
votes
2 answers

Snowflake transformations

We will have flat files loaded into snowflake tables in staging schema from AWS S3. Now we need to perform simple transformations like aggregations, mapping, calculation etcetc. I know we can use informatica, other tools but we really do not have…
1 2 3
59
60