What is Dimensional Model in Data Warehouse? - Data Warehousing Data Warehousing Reading Time: 5 minutes we will cover these topics: hide 1) What is Dimensional Model? 1.1) Fact 1.2) Dimension 1.3) Attributes 2) Steps of Dimensional Modelling 2.1) Step 1: Understanding the Business Problem 2.2) Step 2: Identify the grain 2.3) Step 3: Identify the dimensions 2.4) Step 4: Identify the Fact 2.5) Step 5: Build Schema 3) Dimensional Modelling Rules 4) Benefits of dimensional modeling 4.1) What to read next?

What is Dimensional Model?

A dimensional model is a data structure technique optimized for Data warehousing tools. Dimensional modelling is used in Data Warehouse to organize data effectively and assist analytical operations on huge volumes of data. The concept of Dimensional Modelling was developed by Ralph Kimball and is comprised of “fact” and “dimension” tables. 

Data Warehouse uses denormalized tables (flat tables). Whereas, operational database or OLTP systems use Normalized tables. 

A Dimensional model is designed to read, summarize, analyze numeric information like values, balances, counts, weights, etc. in a data warehouse. In contrast, relation models are optimized for addition, updating and deletion of data in a real-time Online Transaction System.

The relational mode, normalization and ER models reduce redundancy in data. On the contrary, dimensional model arranges data in such a way that it is easier to retrieve information and generate reports.

Hence, Dimensional models are used in data warehouse /OLAP and not a good fit for OLTP or operational databases.

Fact

Fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables

Eg. Monthly sales volume, Average Customer Balance etc…

A Fact Table contains

  1. Measurements/facts
  2. Foreign key to dimension table

Dimension

A category of information. For example, the time dimension. In simple terms, they give who, what, where of a fact. In the Sales business process, for the fact quarterly sales number, dimensions would be

Dimension table properties:

Attributes

The Attributes are the various characteristics of the dimension.In the Location dimension, the attributes can be

Attributes are used to search, filter, or classify facts. Dimension Tables contain Attributes

Steps of Dimensional Modelling

A good dimensional model determines the success of your data warehouse implementationHere are the steps in dimenstional data modelling

  1. Understanding the Business Problem
  2. Identify Grain (level of detail)
  3. Identify Dimensions
  4. Identify Facts
  5. Build Star

Step 1: Understanding the Business Problem

Identifying the actual business process a datawarehouse should cover. This could be Marketing, Sales, HR, etc. as per the data analysis needs of the organization. It is the most important step of the Data Modelling process, and a failure here would have cascading and irreparable defects

Step 2: Identify the grain

The Grain describes the level of detail for the business problem/solution. It is the process of identifying the lowest level of information for any table in your data warehouse. If a table contains sales data for every day, then it should be daily granularity. If a table contains total sales data for each month, then it has monthly granularity. During this stage, you answer questions like

Step 3: Identify the dimensions

Dimensions are nouns like date, store, inventory, etc. These dimensions are where all the data should be stored. For example, the date dimension may contain data like a year, month and weekdayExample of Dimensions:

The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis

Dimensions: Product, Location and Time Attributes: For Product: Product key (Foreign Key), Name, Type, Specifications Hierarchies: For Location: Country, State, City, Street Address, Name

Step 4: Identify the Fact

This step is co-associated with the business users of the system because this is where they get access to data stored in the data warehouse. Most of the fact table rows are numerical values like price or cost per unit, etc.Example of Facts:

The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis

The fact here is Sum of Sales by product by location by time

Step 5: Build Schema

The Database schema is designed and finalized in this step. There are two popular schemasStar Schema

The star schema architecture is easy to design. It is called a star schema because diagram resembles a star, with points radiating from a center. The center of the star consists of the fact table, and the points of the star is dimension tablesThe fact tables in a star schema which is third normal form whereas dimensional tables are de-normalizedSnowflake Schema

The snowflake schema is an extension of the star schema. In a snowflake schema, each dimension are normalized and connected to more dimension tables

Dimensional Modelling Rules

Benefits of dimensional modeling



Meet Ananth Tirumanur. Hi there 👋

I work on projects in data science, big data, data engineering, data modeling, software engineering, and system design.

Connect with me:

My Resources:

Languages and Tools:

AWS, Bash, Docker, Elasticsearch, Git, Grafana, Hadoop, Hive, EMR, Glue, Athena, Lambda, Step Functions, Airflow/MWAA, DynamoDB, Kafka, Kubernetes, Linux, MariaDB, MySQL, Pandas, PostgreSQL, Python, Redis, Scala, SQLite