Difference Between Fact Table and Dimension Table

Reading Time: 9 minutes

Fact Table: A fact table is a primary table in a dimensional model. A Fact Table contains Measurements/factsForeign key to dimension table Dimension table: A dimension table contains dimensions of a fact. They are joined to fact table via a foreign key. Dimension tables are de-normalized tables. The Dimension Attributes are the various columns in a dimension tableDimensions offers descriptive characteristics of the facts with the help of their attributesNo set limit set for given for number of dimensions The dimension can also contain one or more hierarchical relationships Difference between Dimension table vs. Fact tableParameters Fact Table Dimension Table Definition Measurements, metrics or facts about a business process. Companion table to the fact table contains descriptive attributes to be used as query constraining. Characteristic Located at the center of a star or snowflake schema and surrounded by dimensions. Connected to the fact table and located at the edges of the star or snowflake schema Design Defined by their grain or its most atomic level. Should be wordy, descriptive, complete, and quality assured. Task Fact table is a measurable event for which dimension table data is collected and is used for analysis and reporting. Collection of reference information about a business. Type of Data Facts tables could contain information like sales against a set of dimensions like Product and Date. Evert dimension table contains attributes which describe the details of the dimension. E.g., Product dimensions can contain Product ID, Product Category, etc. Key Primary Key in fact is mapped as foreign keys to Dimensions. Foreign key to the facts table Storage Helps to store report labels and filter domain values in dimension tables. Load detailed atomic data into dimensional structures. Hierarchy Does not contain Hierarchy Contains Hierarchies. For example Location could contain, country, pin code, state, city, etc. Type of factsType of facts Explanation Additive Measures should be added to all dimensions. Semi-Additive In this type of facts, measures may be added to some dimensions and not with others. Non-Additive It stores some basic unit of measurement of a business process. Some real-world examples include sales, phone calls, and orders. Types of Dimensions:Types of Dimension Definition Conformed Dimensions Conformed dimensions is the very fact to which it relates. This dimension is used in more than one-star schema or Datamart. Outrigger Dimensions A dimension may have a reference to another dimension table. These secondary dimensions called outrigger dimensions. This kind of Dimensions should be used carefully. Shrunken Rollup Dimensions Shrunken Rollup dimensions are a subdivision of rows and columns of a base dimension. These kinds of dimensions are useful for developing aggregated fact tables. Dimension-to-Dimension Table Joins Dimensions may have references to other dimensions. However, these relationships can be modeled with outrigger dimensions. Role-Playing Dimensions A single physical dimension helps to reference multiple times in a fact table as each reference linking to a logically distinct role for the dimension. Junk Dimensions It a collection of random transactional codes, flags or text attributes. It may not logically belong to any specific dimension. Degenerate Dimensions Degenerate dimension is without corresponding dimension. It is used in the transaction and collecting snapshot fact tables. This kind of dimension does not have its dimension as it is derived from the fact table. Swappable Dimensions They are used when the same fact table is paired with different versions of the same dimension. Step Dimensions Sequential processes, like web page events, mostly have a separate row in a fact table for every step in a process. It tells where the specific step should be used in the overall session.  

Datastage Expressions

Reading Time: < 1 minute

You can enter expressions at various places in a job sequence to set values. Literal strings enclosed in double-quotes or single-quotes. Numeric constants (integer and floating point). The sequence’s own job parameters. Prior activity variables (for example, job exit status). All built-in BASIC functions as available in a server job. Certain macros and constants as… Continue reading Datastage Expressions

What is IBM WebSphere DataStage?

Reading Time: 2 minutes

DataStage is a product from IBM, used as the strategic ETL tool within many organizations. It can be used for multiple purposes: Interfacing between multiple databases. Changing of data from one format to another. Example: From database to flat files, XML files, etc. Fast access to data that doesn’t change often Interacts with WebSphere MQ… Continue reading What is IBM WebSphere DataStage?

Datastage Triggers

Reading Time: < 1 minute

There are three types of trigger:   Conditional. A conditional trigger fires the target activity if the source activity fulfills the specified condition. The condition is defined by an expression, and can be one of the following types: OK. Activity succeeds. Failed. Activity fails. Warnings. Activity produced warnings. ReturnValue. A routine or command has returned… Continue reading Datastage Triggers

Datastage Activity stages

Reading Time: 2 minutes

Activity stages The job sequence supports the following types of activity: Job. Specifies a server or parallel job. Routine. Specifies a routine. This routine can be any routine in the InfoSphere® DataStage® Repository (but not transforms). ExecCommand. Specifies an operating system command to execute. Email Notification. Specifies that an email notification is sent at this… Continue reading Datastage Activity stages

Data Modelling with Erwin

Reading Time: 3 minutes

Definition: Data modeling is a process used to define and analyze data requirements needed to support the business processes. The process of data modeling involves professional data modelers working with business stakeholders, as well as potential users of the information system. Types: Conceptual data model: It is a set of technology independent specifications about the… Continue reading Data Modelling with Erwin

Datastage – CFF stage

Reading Time: < 1 minute

CFF stage or complex flat file stage Complex Flat File Decimal – S9(6)V99 COMP-3 DISPLAY_NUMERIC – PIC 9(1) CHARACTER – PIC X(1) FLOAT – COMP-2. BINARY  – COMP. What to read next? Nothing to see here. Consider joining one of our full courses..

Datastage Active vs Passive stage

Reading Time: < 1 minute

Active Stage Passive Stage Active stages “do” stuff -> change data, add columns, filter rows, summarize rows, etc Passive stages read/write data -> files, datasets, tables Active vs Passive stage What to read next? Nothing to see here. Consider joining one of our full courses..