Overview of ETL

Reading Time: 15 minutes

Extract, transform and load (ETL) is a process in database usage and especially in data warehousing that involves: * Extracting data from outside sources * Transforming it to fit operational needs (which can include quality levels) * Loading it into the end target (database or data warehouse) Extract The first part of an ETL process… Continue reading Overview of ETL

What is Dimensional Model in Data Warehouse?

Reading Time: 5 minutes

What is Dimensional Model?A dimensional model is a data structure technique optimized for Data warehousing tools. The concept of Dimensional Modelling was developed by Ralph Kimball and is comprised of “fact” and “dimension” tables. A Dimensional model is designed to read, summarize, analyze numeric information like values, balances, counts, weights, etc. in a data warehouse. In contrast, relation models are optimized for addition, updating and deletion of data in a real-time Online Transaction System. These dimensional and relational models have their unique way of data storage that has specific advantages. For instance, in the relational mode, normalization and ER models reduce redundancy in data. On the contrary, dimensional model arranges data in such a way that it is easier to retrieve information and generate reports. Hence, Dimensional models are used in data warehouse systems and not a good fit for relational systems. In this tutorial, you will learn- What is Dimensional Model? Elements of Dimensional Data Model Fact Dimension Attributes Fact Table Dimension table Steps of Dimensional Modelling Step 1) Identify the business process Step 2) Identify the grain Step 3) Identify the dimensions Step 4) Identify the Fact Step 5) Build Schema Rules for Dimensional Modelling Benefits of dimensional modeling Elements of Dimensional Data ModelFactFacts are the measurements/metrics or facts from your business process. For a Sales business process, a measurement would be quarterly sales number DimensionDimension provides the context surrounding a business process event. In simple terms, they give who, what, where of a fact. In the Sales business process, for the fact quarterly sales number, dimensions would be Who – Customer NamesWhere – LocationWhat – Product NameIn other words, a dimension is a window to view information in the facts. AttributesThe Attributes are the various characteristics of the dimension. In the Location dimension, the attributes can be StateCountryZipcode etc.Attributes are used to search, filter, or classify facts. Dimension Tables contain Attributes Fact TableA fact table is a primary table in a dimensional model. A Fact Table contains Measurements/factsForeign key to dimension table Dimension table A dimension table contains dimensions of a fact. They are joined to fact table via a foreign key. Dimension tables are de-normalized tables. The Dimension Attributes are the various columns in a dimension tableDimensions offers descriptive characteristics of the facts with the help of their attributesNo set limit set for given for number of dimensions The dimension can also contain one or more hierarchical relationships Steps of Dimensional ModellingThe accuracy in creating your Dimensional modeling determines the success of your data warehouse implementation. Here are the steps to create Dimension Model Identify Business ProcessIdentify Grain (level of detail)Identify DimensionsIdentify FactsBuild StarThe model should describe the Why, How much, When/Where/Who and What of your business process Step 1) Identify the business processIdentifying the actual business process a datarehouse should cover. This could be Marketing, Sales, HR, etc. as per the data analysis needs of the organization. The selection of the Business process also depends on the quality of data available for that process. It is the most important step of the Data Modelling process, and a failure here would have cascading and irreparable defects. To describe the business process, you can use plain text or use basic Business Process Modelling Notation (BPMN) or Unified Modelling Language (UML). Step 2) Identify the grainThe Grain describes the level of detail for the business problem/solution. It is the process of identifying the lowest level of information for any table in your data warehouse. If a table contains sales data for every day, then it should be daily granularity. If a table contains total sales data for each month, then it has monthly granularity. During this stage, you answer questions like Do we need to store all the available products or just a few types of products? This decision is based on the business processes selected for DatawarehouseDo we store the product sale information on a monthly, weekly, daily or hourly basis? This decision depends on the nature of reports requested by executivesHow do the above two choices affect the database size?Example of Grain: The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis. So, the grain is “product sale information by location by the day.” Step 3) Identify the dimensionsDimensions are nouns like date, store, inventory, etc. These dimensions are where all the data should be stored. For example, the date dimension may contain data like a year, month and weekday. Example of Dimensions: The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis. Dimensions: Product, Location and Time Attributes: For Product: Product key (Foreign Key), Name, Type, Specifications Hierarchies: For Location: Country, State, City, Street Address, Name Step 4) Identify the FactThis step is co-associated with the business users of the system because this is where they get access to data stored in the data warehouse. Most of the fact table rows are numerical values like price or cost per unit, etc. Example of Facts: The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis. The fact here is Sum of Sales by product by location by time. Step 5) Build SchemaIn this step, you implement the Dimension Model. A schema is nothing but the database structure (arrangement of tables). There are two popular schemas Star SchemaThe star schema architecture is easy to design. It is called a star schema because diagram resembles a star, with points radiating from a center. The center of the star consists of the fact table, and the points of the star is dimension tables.The fact tables in a star schema which is third normal form whereas dimensional tables are de-normalized.

How to find Netezza model?

Reading Time: 2 minutes

Find Netezza Model To find if Netezza system is a Mustang box, TwinFin or Striper use the nz_get_model command by connecting to the server using putty or query the system table: 1) Run Netezza provided support script nz_get_model eg. Below was run on Mustang 10100 model /export/home/nz->nz_get_model10100SL eg. Below was run on TwinFin 12 (newer… Continue reading How to find Netezza model?

How to check Netezza versions history?

Reading Time: < 1 minute

To find the history of NPS upgrade on your server, run below command: $more /nz/.versions/versionhistory 2017-08-22 15:17:28 -0400 (EDT)| ACTION=upgrade ORIGIN=6.0.5-0.F-1.P-10.Bld-24823 TARGET=7.0.2-0.F-1.P-5.Bld-31068 STATE=Successful 2018-02-15 22:35:36 -0500 (EST)| ACTION=upgrade ORIGIN=7.0.2-0.F-1.P-5.Bld-31068 TARGET=7.0.2-0.F-1.P-8.Bld-32960 STATE=Successful 2018-07-22 03:42:59 -0400 (EDT)| ACTION=upgrade ORIGIN=7.0.2-0.F-1.P-8.Bld-32960 TARGET= STATE=Successful What to read next? Nothing to see here. Consider joining one of our full courses..