Questions tagged [snowflake-schema]

**Snowflake schema** is a special case of the database Star Schema in which one or more dimension tables are normalized. This is often done to improve performance in some uses of the Star Schema. When the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact tables in the middle.

Snowflake schema is a logical arrangement of tables in a multidimensional database in which the dimensions are expressed by more than one table. The dimension tables that relate directly to the fact table represent the smallest granularity of data analysis. The other tables in the same dimension represent aggregations of that dimension's individual elements. When a dimension is aggregated by several independent criteria, the diagram resembles the shape of a snowflake. Snowflake schemas are used in data warehouses.

Methods for creating Snowflake Schema out of Star Schema:

  • Partially or fully normalize one or many dimension tables, leaving other dimensions unchanged.
  • Partially or fully normalize all dimension tables.
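As a minimal sketch of the first method, the snippet below normalizes a single dimension using Python's built-in `sqlite3` module. The table and column names (`product_star`, `product`, `brand`) and the sample rows are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star-schema dimension: the brand name is repeated on every product row.
cur.execute(
    "CREATE TABLE product_star ("
    "product_id INTEGER PRIMARY KEY, product_name TEXT, brand_name TEXT)"
)
cur.executemany(
    "INSERT INTO product_star VALUES (?, ?, ?)",
    [(1, "Widget", "Acme"), (2, "Gadget", "Acme"), (3, "Gizmo", "Globex")],
)

# Snowflaked dimension: brands move to their own table...
cur.execute(
    "CREATE TABLE brand (brand_id INTEGER PRIMARY KEY, brand_name TEXT UNIQUE)"
)
cur.execute(
    "INSERT INTO brand (brand_name) "
    "SELECT DISTINCT brand_name FROM product_star ORDER BY brand_name"
)

# ...and each product keeps only a foreign key to its brand.
cur.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, product_name TEXT, "
    "brand_id INTEGER REFERENCES brand(brand_id))"
)
cur.execute(
    "INSERT INTO product "
    "SELECT p.product_id, p.product_name, b.brand_id "
    "FROM product_star p JOIN brand b ON b.brand_name = p.brand_name"
)

brand_count = cur.execute("SELECT COUNT(*) FROM brand").fetchone()[0]

# Joining the two tables reproduces the original denormalized rows.
rebuilt = cur.execute(
    "SELECT p.product_id, p.product_name, b.brand_name "
    "FROM product p JOIN brand b USING (brand_id) ORDER BY p.product_id"
).fetchall()
```

The join at the end is the price of normalization: the original rows are still recoverable, but only by combining both tables.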

When to use

Query performance is the top priority in a data warehouse, and snowflaking can hamper it. Snowflaking is nevertheless useful when some attributes, such as demographic attributes, are browsed more often than the customer table as a whole. Another case is precalculated summaries derived from a fact table, which avoid repeating an expensive aggregation over and over.
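The precalculated-summary case can be sketched as follows, again with `sqlite3`. The `sales` fact table, its columns, the sample amounts, and the `sales_monthly_summary` table are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fact table with one row per sale.
cur.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, product_id INTEGER, "
    "sale_month TEXT, amount REAL)"
)
cur.executemany(
    "INSERT INTO sales (product_id, sale_month, amount) VALUES (?, ?, ?)",
    [
        (1, "2019-11", 10.0),
        (1, "2019-11", 5.0),
        (2, "2019-11", 7.5),
        (1, "2019-12", 20.0),
    ],
)

# Aggregate once and store the result, so later queries read the
# summary table instead of re-aggregating the fact table.
cur.execute(
    "CREATE TABLE sales_monthly_summary AS "
    "SELECT product_id, sale_month, "
    "SUM(amount) AS total_amount, COUNT(*) AS sale_count "
    "FROM sales GROUP BY product_id, sale_month"
)

summary = cur.execute(
    "SELECT product_id, sale_month, total_amount, sale_count "
    "FROM sales_monthly_summary ORDER BY product_id, sale_month"
).fetchall()
```

A summary table like this has to be refreshed when the fact table changes; how and when to do that depends on the warehouse and is not shown here.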

Advantages

Savings in storage space. Normalized structures have no insertion, deletion or update anomalies. Easier to update and maintain.

Disadvantages

Schemas are less intuitive, with higher complexity and abstraction. Browsing through the contents is more difficult for users. Additional joins lead to worse query performance in some cases.

Example

Consider a fact table called "Sales" and a dimension table called "Product". Assume there are 500000 product dimension rows, and that these products fall under 10 brands. A query that lists the brands, when "Product" is not indexed on brand, has to search through all 500000 rows to find them.

On the other hand, if the "Product" table is partially normalized, with the brands separated into another table "Brand", the initial search query needs to go through just 10 rows.
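The difference can be reproduced at a smaller scale. This sketch assumes 5000 products instead of 500000 to keep it quick, and uses hypothetical table names `product_flat` and `brand`. Table row counts stand in for the number of rows each brand lookup has to read.

```python
import sqlite3

N_PRODUCTS = 5000  # assumption: scaled down from the 500000 rows in the example
N_BRANDS = 10

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Separate "Brand" table: one row per brand.
cur.execute("CREATE TABLE brand (brand_id INTEGER PRIMARY KEY, brand_name TEXT)")
cur.executemany(
    "INSERT INTO brand VALUES (?, ?)",
    [(i, f"brand_{i}") for i in range(N_BRANDS)],
)

# Denormalized "Product" table: brand name stored on every row, no index on it.
cur.execute(
    "CREATE TABLE product_flat (product_id INTEGER PRIMARY KEY, brand_name TEXT)"
)
cur.executemany(
    "INSERT INTO product_flat VALUES (?, ?)",
    [(i, f"brand_{i % N_BRANDS}") for i in range(N_PRODUCTS)],
)

# Rows each approach has to look at to list the brands.
flat_rows = cur.execute("SELECT COUNT(*) FROM product_flat").fetchone()[0]
brand_rows = cur.execute("SELECT COUNT(*) FROM brand").fetchone()[0]

# Both queries return the same brands; only the amount of data read differs.
brands_from_flat = cur.execute(
    "SELECT DISTINCT brand_name FROM product_flat ORDER BY brand_name"
).fetchall()
brands_from_dim = cur.execute(
    "SELECT brand_name FROM brand ORDER BY brand_name"
).fetchall()
```

Both lookups yield the same ten brand names, but the first has to deduplicate every product row while the second reads only the small `brand` table.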


893 questions
0
votes
1 answer

Can we encrypt stored procedures in Snowflake similar to what we do in SQL Server?

Can we encrypt stored procedures in Snowflake similar to what we do in SQL Server? For example, similar to CREATE PROCEDURE #EncryptSP WITH ENCRYPTION AS QUERY HERE GO
0
votes
1 answer

Snowflake Cloud data warehouse - Database best practice limitations

Is there a limitation on how many schemas and security objects you can have in one Snowflake database and will there be performance degradation with thousands of these objects? Will splitting the data into multiple databases help performance?
0
votes
1 answer

Retrieve value from Snowflake JSON column

I have a JSON column with each record as [ { "DATE": 20190910 "NAME": "Tom" }, { "DATE": 20190504 "NAME": "Bob" } ] The above is one record in the column. I need to retrieve the record as Tom,Bob in a single record…
0
votes
1 answer

Snowflake Not Accepting File Format In Bulk Load

I am creating some new ETL tasks for our data pipeline. We currently have several hundred loading data from various S3 buckets. So it would go like this: create or replace stage ETL_STAGE url='s3://bucketname/' file_format = csv_etl; create…
0
votes
1 answer

Snowflake pass dynamic value binding and constant value

I have to pass a dynamic value as the column name to my table from a loop, along with a constant value. My code snippet looks something like this: While(res.next()){ var columnname= res.getColumnValue(1); var stmt= insert into…
0
votes
0 answers

Is it possible to load and index genomic data (Sequence Reads / VCF output etc.,) on Snowflake?

I am looking for possibilities to load sequence data and VCF output data on Snowflake. The Snowflake website does indicate that it is possible to transform VCF format to semi-structured format like JSON or CSV before loading them but does not give…
0
votes
1 answer

incremental scan: Scan only the data which is not processed?

I have a table AB which was updated as of 20191206, and the same table was updated again as of 20191209. I am working on another table CD, where I need to scan AB and load into CD. So, I scanned AB on 20191206; it retrieved 2 records and loaded them into CD. If I…
0
votes
2 answers

Is there any function or way to retrieve table names from Snowflake in the order of referential integrity (FK) dependencies?

I would like to retrieve table names from a given schema in the sorted order based on their foreign key dependencies. For example, if I have three following tables created in Snowflake CREATE TABLE TAB_X ( COL_A CHAR(18), COL_B CHAR(18), …
0
votes
1 answer

How to pass dynamic values into snowflake query?

I have a query to find potential SSN in a table using regex pattern. db_name.schema_name.Table name: db_name.schema_name.ABC Column name with Sensitive data: senstve_col select regexp_substr(senstve_col, '\\b[0-9]{3}[ -][0-9]{2}[ -]{4}\\b') as…
0
votes
2 answers

Set file name for unloaded file from Snowflake

I am unloading snowflake data into external AWS S3 stg using the below command, copy into '@ext_stg/path/file_name' from schema.table file_format = (type=csv field_delimiter= '~' compression='gzip' null_if=('','NULL', 'null',' ')…
0
votes
1 answer

Can I get the database name for a given schema name in Snowflake?

Hi, I am trying to find the names of the databases by specifying the name of the schema in Snowflake.
0
votes
1 answer

How to load .jsonl into a snowflake table variant?

How do I load .jsonl into a Snowflake table's variant column as JSON? create or replace table sampleColors (v variant); insert into sampleColors select parse_json(column1) as v from values ( '{r:255,g:12,b:0} {r:0,g:255,b:0}…
0
votes
1 answer

Call Snowflake Procedure from Azure Function App

I have a procedure in Snowflake and would like to call it from my Timer Triggered Azure Function App. That procedure expects a parameter of type string. Following is my code snippet to connect to Snowflake and call that procedure with…
0
votes
0 answers

Can Snowflake import metadata of a 3rd party ETL platform and integrate it with its own to show end-to-end lineage?

I am working with a client that wants to import metadata into Snowflake, and I want to know if my 3rd party tool will 1) work and 2) align with the data strategy of the existing architecture. I collect data about users that browse my manufacturing…