Questions tagged [data-warehouse]

A data warehouse (DW) is a database specifically structured/designed to aid in querying, analyzing and reporting (generating reports out of) of current and historical data. DWs are central repositories of integrated data from one or more disparate sources. Basic difference between a data warehouse and a set of DB tables is how the data is organized/structured.

A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data used for querying, analyzing and reporting for decision-support purposes.

Data Mart is the access layer of the data warehouse. It serves a particular department like Marketing, HR, etc. Dedicated to business function and unit specifications, data marts make the information more focused and faster to find.

Some differences between Data mart and Data warehouse:

Data Warehouses have multiple subject areas with more detailed information. They are integrating all sources of data. Dimensional modelling is not necessary, but it feeds dimensional models
Data Marts usually hold one subject area with not that detailed information - often summarized. Concentrate on integrating information from one subject area or source system. Built on dimensional models like star-schema.

There are many products readily available that provide data warehousing capability e.g. MSAccess, Essbase (Hyperion, now Oracle), Cognos, Business Objects, MicroStrategy, ...

Basics of Data Warehousing:

Dimensional Modelling - Consist of the identify the measurements, or facts, that are given the context by their related dimensions. The grain of the fact table describes the level of detail at which the facts are recorded.

Main steps of relational modelling:

Choose the business process
Declare the grain
Identify the dimensions
Identify the fact

Online Analytical Processing (OLAP) and it's types (ROLAP, MOLAP, HOLAP, ...): Describes basics of the DB designs and pros/cons of each way. - A variety of different design patterns are used in a data warehouse environment. Some common approaches include: Normalized (5NF); DataVault; Anchor Modelling; Dimensional (5,6); other temporal (e.g. 6NF). - SQL: Describes how a Data Warehouse can be queried. Following is a list of basic keywords that every data warehouse developer must know: - JOIN - GROUPBY

At a high level the Data Warehousing can be divided into:

Tools (IBM Cognos, Microsoft Business Intelligence, Oracle Business Intelligence Enterprise, dition(OBIEE), Business Objects Enterprise XI, Jaspersoft, Talend Open studio, Pentaho, Qlikview etc) readily available and how to use them. Used for small to medium sized data sets. This usually requires [at least] knowledge of tool's:
- data model and
- user interface
Building your own data warehouse for specific usecases. Used when dealing with really huge data sets (e.g. the data collected by Google, Yahoo, Facebook or a couters/performance-management-data from a large telecommunication network. This usually requires [at least] knowledge of:
- scalability, high availability and clustering concepts.
- data warehouse (schema, queries, data model, ...) design.
- available databases (Oracle, Clustra, Greenplum, MySQL, DB2, ...)
- problem domain (implicit).
- relevant GUI/UI (SWING, JSP, ...) and business logic (J2EE, C++, ...) technologies

2778 questions

votes

0 answers

What is the actual difference between Data Warehouse & Big Data?

I know what is Data Warehouse & what is Big Data. But I am confused with Data Warehouse Vs Big Data. Both are same with different names or both are different(Conceptually & Physically).

database bigdata data-warehouse

asked Sep 27 '13 at 06:05

Aditya

2,299
5
32
54

votes

2 answers

Good place to start learning data warehousing?

I am interested in learning more about data warehousing. I see terms like "dimension", "snowflake schema" and "star schema" thrown about. Where would one start in learning about this stuff? Are there good books or Internet resources? ETL is in this…

resources data-warehouse

asked Jul 10 '09 at 11:45

Jordie

votes

2 answers

Implementing roleplaying dimension with friendly attribute names in SSAS?

I have a fact table that references my date dimension as both a Form Date and Finances Date. Hence the date dimension is roleplaying two different roles. This works fine, except that I can't roleplay the names of the Year or Month column. I would…

ssas data-warehouse

asked Sep 07 '11 at 20:45

AaronLS

37,329
20
143
202

votes

3 answers

What is best practice for representing time intervals in a data warehouse?

In particular I am dealing with a Type 2 Slowly Changing Dimension and need to represent the time interval a particular record was active for, i.e. for each record I have a StartDate and an EndDate. My question is around whether to use a closed…

database database-design data-warehouse

asked Nov 24 '10 at 08:57

snth

5,194
4
39
48

votes

2 answers

Anything similar to MySQL Proxy for PostgreSQL?

I am looking for something similar to MySQL Proxy. The purpose is to modify incoming queries on the server. I am not looking for alternative ways to achieve the same. My best guess at the moment is to modify GridSQL, but this adds complexity and it…

postgresql url-rewriting data-warehouse mysql-proxy

asked Oct 14 '10 at 12:37

David

4,786
11
52
80

votes

3 answers

What's the Grain in the context of DW

According to The Data Warehouse Toolkit by Kimball "The grain must be declared before choosing dimensions or facts because every candidate dimension or fact must be consistent with the grain." I'm so confused about this concept .Could some one…

sql database oracle data-warehouse dimensional-modeling

asked Aug 27 '16 at 12:15

Anyname Donotcare

11,113
66
219
392

votes

2 answers

Data Warehousing arbitrary fields

In our application, we support user-written plugins. Those plugins generate data of various types (int, float, str, or datetime), and those data are labeled with bunches of meta-data (user, current directory, etc.) as well as three free-text fields…

mysql database database-design blob data-warehouse

asked Sep 15 '10 at 23:41

bukzor

37,539
11
77
111

votes

3 answers

Oracle aggregation function to allocate amount

Suppose I have 2 tables T1 and T2 as follows T1: bag_id bag_type capacity ------|--------|-------- 1 A 500 2 A 300 3 A 100 4 B 200 5 B 100 T2: item_type…

sql oracle oracle11g aggregate-functions data-warehouse

asked Jul 15 '15 at 15:48

asinkxcoswt

2,252
5
29
57

votes

5 answers

True or False: Good design calls for every table to have a primary key, if nothing else, a running integer

Consider a grocery store scenario (I'm making this up) where you have FACT records that represent a sale transaction, where the columns of the Fact table include SaleItemFact Table ------------------ CustomerID ProductID Price DistributorID …

database-design data-warehouse

asked Mar 06 '10 at 00:31

Chad

23,658
51
191
321

votes

1 answer

SQL/SSIS DataWareHouse Fact table loading, best practices?

I am building my first datawarehouse in SQL 2008/SSIS and I am looking for some best practices around loading the fact tables. Currently in my DW I have about 20 Dimensions (Offices, Employees, Products, Customer, etc.) that are of Type 1 SCD. In my…

sql sql-server ssis lookup data-warehouse

asked Nov 01 '12 at 01:04

user1709091

votes

7 answers

In a Data Warehouse scenario is there any disadvantage to using WITH(NOLOCK)

I have a Kimball-style DW (facts and dimensions in star models - no late-arriving facts rows or columns, no columns changing in dimensions except expiry as part of Type 2 slowly changing dimensions) with heavy daily processing to insert and update…

sql-server sql-server-2005 data-warehouse nolock

asked Jun 18 '09 at 20:22

Cade Roux

88,164
40
182
265

votes

4 answers

Handling nulls in Datawarehouse

I'd like to ask your input on what the best practice is for handling null or empty data values when it pertains to data warehousing and SSIS/SSAS. I have several fact and dimension tables that contain null values in different rows. Specifics: 1)…

sql ssis ssas data-warehouse

asked Jun 10 '09 at 20:18

rrydman

votes

2 answers

One or multiple fact tables?

I am trying to build a data mart. I have lot of dimensions, and couple of measures - facts. Every measure is connected to all dimensions in term of business. There is the standard approach that there will be one big fact table with all measures.…

database-design data-warehouse business-intelligence extensibility fact-table

asked Mar 19 '12 at 13:03

John

votes

4 answers

Why NULL values are mapped as 0 in Fact tables?

What is the reason that in measure fields in fact tables (dimensionally modeled data warehouses) NULL values are usually mapped as 0?

sql-server ssis null data-warehouse dimensional-modeling

asked Nov 28 '11 at 19:09

jrara

16,239
33
89
120

votes

1 answer

What advantages does in-memory OLAP have over traditional systems with significant memory?

Do in-memory OLAP engines have advantages over the traditional OLAP engines backed by enough RAM to contain the entire cube(s)? For example, if I use a MOLAP engine (SSAS) and GB / TB of RAM where the entire cube (or even star-schema) is RAM…

data-warehouse olap business-intelligence

asked Nov 06 '11 at 02:33

kermatt

1,585
2
16
36

Prev 1 2 3

…

99 100 Next