Questions tagged [snowflake-schema]

**Snowflake schema** is a special case of the database Star schema, in which one or more dimension tables are normalized. This can improve performance in some cases of the Star schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

A snowflake schema is a logical arrangement of tables in a multidimensional database in which a dimension is expressed by more than one table. The dimension tables that relate directly to the fact table represent the smallest granularity of data analysis. The other tables in the same dimension represent aggregations of the dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Snowflake schemas are used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or more dimension tables, leaving the other dimensions unchanged.
  • Partially or fully normalize all dimension tables.

When to use

Query performance is of the highest importance in a data warehouse, and snowflaking can hamper it. Snowflaking is nevertheless useful when some attributes, such as demographic attributes, are browsed more often than the whole customer table itself. Another case is precalculated summaries derived from a fact table; these avoid aggregating over and over again, since that operation is expensive.
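
The precalculated-summary case can be sketched as follows. This is a minimal, hypothetical example using Python's `sqlite3` as a stand-in for any SQL engine; the table and column names (`sales_fact`, `sales_by_product`) are invented for illustration, not taken from any particular warehouse:

```python
# Sketch: derive a small summary table from a large fact table once,
# so repeated queries read a handful of aggregated rows instead of
# re-running the expensive aggregation over every fact row.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fact table: one row per sale.
cur.execute("CREATE TABLE sales_fact (product_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO sales_fact VALUES (?, ?)",
    [(i % 100, 10.0) for i in range(10_000)],  # 10,000 facts over 100 products
)

# Derive the summary once; later queries hit 100 rows, not 10,000.
cur.execute(
    "CREATE TABLE sales_by_product AS "
    "SELECT product_id, SUM(amount) AS total FROM sales_fact GROUP BY product_id"
)

# Each product appears 100 times at 10.0 each, so its total is 1000.0.
total = cur.execute(
    "SELECT total FROM sales_by_product WHERE product_id = 7"
).fetchone()[0]
print(total)  # 1000.0
```

The trade-off is the usual one for materialized aggregates: the summary must be refreshed whenever the fact table changes.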

Advantages

Savings in storage space. Normalized structures have no insertion, deletion, or update anomalies, so they are easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Browsing through the contents is more difficult for users. The additional joins can worsen query performance in some cases.

Example

Consider a fact table called "Sales" and a dimension table called "Product", and assume there are 500,000 product dimension rows falling under 10 brands. A query over the products that is not indexed on brand will have to scan all 500,000 rows to find every brand.

If, on the other hand, the "Product" table is partially normalized, with the brands separated out into another table "Brand", the same initial query needs to go through just 10 rows.
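
The example above can be sketched concretely, using Python's `sqlite3` as a stand-in engine and a scaled-down, hypothetical 5,000-row Product dimension (the table names mirror the example; nothing here is Snowflake-specific):

```python
# Sketch: compare a denormalized (star) Product dimension with a
# snowflaked one where Brand is normalized into its own table.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: brand is a plain column on the Product dimension.
cur.execute(
    "CREATE TABLE product_star (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT)"
)
cur.executemany(
    "INSERT INTO product_star VALUES (?, ?, ?)",
    [(i, f"product_{i}", f"brand_{i % 10}") for i in range(5000)],  # 5,000 rows, 10 brands
)

# Snowflake schema: brands normalized out into a separate table.
cur.execute("CREATE TABLE brand (brand_id INTEGER PRIMARY KEY, brand TEXT)")
cur.execute(
    "CREATE TABLE product_snow (product_id INTEGER PRIMARY KEY, name TEXT, "
    "brand_id INTEGER REFERENCES brand)"
)
cur.executemany("INSERT INTO brand VALUES (?, ?)", [(b, f"brand_{b}") for b in range(10)])
cur.executemany(
    "INSERT INTO product_snow VALUES (?, ?, ?)",
    [(i, f"product_{i}", i % 10) for i in range(5000)],
)

# Listing all brands: the star schema must scan and deduplicate 5,000
# product rows; the snowflaked schema reads just the 10 brand rows.
star_brands = cur.execute("SELECT DISTINCT brand FROM product_star").fetchall()
snow_brands = cur.execute("SELECT brand FROM brand").fetchall()
print(len(star_brands), len(snow_brands))  # 10 10
```

Both queries return the same 10 brands; the difference is the number of rows the engine has to touch to produce them.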


893 questions
0 votes, 2 answers

Dynamic LIKE in Where Clause statement

I've been trying to build a dynamic query for the LIKE statement from the SQL query below, instead of writing an ILIKE for every value, since the list may grow large. I cannot rewrite the query again and again for each new table_name value. I can store…
Kar
0 votes, 2 answers

Unable to load csv file into Snowflake

I am getting the below error when I try to load a CSV from my system into a Snowflake table: Unable to copy files into table. Numeric value '"4' is not recognized File '@EMPP/ui1591621834308/snow.csv', line 2, character 25 Row 1, column "EMPP"["SALARY":5]…
0 votes, 2 answers

Can we grant select or insert to a user on a table without creating a role in Snowflake?

Can we grant direct select or insert access (without creating a role) to a user on a table?
0 votes, 1 answer

Concatenation issues with in-built functions

Can someone please explain why it returns only two records instead of three? I want the same result when I hard-code the delimiter as when I use the in-built SQL function(s). SELECT 'HASH(IFNULL(COST_CENTER, '''') || IFNULL(MATCH_DATE,…
0 votes, 1 answer

Work table, error table and log table in Snowflake?

What are the equivalents of the following Teradata tables in Snowflake: work tables (WT), error tables (ET), UV tables (another error table that stores data with uniqueness violations), log tables (LT)? These tables get populated with Teradata TPT, is…
Vishrant
0 votes, 3 answers

Snowflake CASE statement using file format?

The following is a Teradata CASE statement that converts false, False, F, f to 0 and true, True, T, t to 1, else keeps the my_col value. I would like to do the same using a Snowflake file format. CASE WHEN my_col ='\N' THEN NULL WHEN my_col = 'false' OR my_col =…
Vishrant
0 votes, 1 answer

Changed Data Capture (CDC) - periodically Sync datasets between S3 Staging File & Snowflake Tables

For example: Source: S3. Scheduled update: 2 times/day. Target: Snowflake table. Streaming: no. On Day 0, I have a customer table created & loaded in the Snowflake DB. The S3 files are updated 2 times a day and they need to be reflected in the Snowflake table as…
0 votes, 2 answers

Is there any way to build the below using a UDF in Snowflake instead of flattening?

I have the tables below. table1: Payload (column) { "list": "212=1.00,214" } Table 2 looks like below. I want a result like below using a UDF instead of using flatten: { "test13": { "code": "212", "desc": "success", "value": "1.00" }, …
Manu
0 votes, 1 answer

How to load 533 columns of data into snowflake table?

We have a table with 533 columns, including a lot of LOB columns, that has to be moved to Snowflake. Since our source transformation system has an issue managing 533 columns in one job, we have split the columns into 2 jobs. The first job will…
0 votes, 1 answer

How to create a subset from a variant column in snowflake?

For example, suppose my variant column "xyz" consists of data like this: { "post_new_visits": "Repeat", "post_new1_week": "Thursday", "post_new2_appt": "Weekday", "post_new3_site": "12:50AM", "post_new4_channel": "5.0", "pre_new2_appt":…
0 votes, 1 answer

Standard load (ingestion) process for snowflake DW?

For an on-prem -> S3 -> Snowflake ETL solution, I currently use these tools: SnowSQL and the Snowflake UI. Currently I follow this load process: create warehouse, create indexes (DDL), create tables (DDL), copy files into the user S3 stage, copy S3 stage files into…
0 votes, 1 answer

Snowflake integration with Jumpcloud throws http 400 error

I have enabled AWS PrivateLink to access Snowflake and there is no issue with the link. When integrating with SSO using Jumpcloud, after login it just throws a 400 error. To troubleshoot I have tried some things, but they didn't work…
0 votes, 1 answer

How to put files from a staged folder into a table

I uploaded a zipped folder, which contained about 200 subfolders with about 3000 XML files per subfolder, to an internal stage in Snowflake. I'm having trouble putting the XML files into a table I created. I am running the command below to try and…
Jeff
0 votes, 0 answers

Need an efficient query to calculate the sum of work days in Snowflake (without using a function)

I have a date dimension which has a date column (holding dates from 2010 to 2020) along with an IsWorkday bit column (which is either 0 or 1). All the holidays are calculated and pre-populated in the dimension. Now, I have a table which gives the…
0 votes, 1 answer

Database change detection process in Snowflake

I'm trying to design a process to capture database change detection and then push it to GitHub. I'm trying to explore free options or write custom code. A few options that I have thought about: create a hierarchy folder structure like…