Netezza Clustered Base Tables (CBT)

Reading Time: 4 minutes

A Netezza clustered base tables (CBT) is an table that has data which is organized using organizing keys columns (Maximum of 4). Those columns should not be a part of distribute on clause.

An organizing key is a column of the table that you specify for clustering the table records; organizing table helps Netezza to save records in same or nearby extents. You can organize the records using “ORGANIZE ON” clause. Netezza does create zone maps on organizing columns, which will accelerate the performance of queries on that particular table.

When to use Netezza Clustered Base Tables (CBT)?

CBT’s are most commonly used for fact table which has billions of records and when they are queried by a set of columns commonly. For example, if your analysts typically query the Sales table based on customer ID, date and department, these can be part of the ORGANIZE ON clause.

How Netezza Clustered Base Tables (CBT) Works?

In its unorganized form, data is organized by date/time of transaction. If you query sales table by restricting results on date/time, those query will perform well because organization matches the query restriction.

If you use the transaction type in restricting query, then you can improve the performance of query by organizing data on transaction type column. Netezza groups those transactions types in same or nearby extents.

 

Benefits of CBT

  • CBTs improve query performance by adding more zone maps for a table because the organizing key columns are also zone mapped.
  • CBTs support “multi-dimension” lookups where you can organize records by one, two, three, or even four lookup keys. This is almost like a sort without a Major and Minor precedence.
  • CBTs increase the supported data types for zone-mapped columns, thus allowing you to improve performance for queries that didn’t map certain columns that were not zone map eligible.
  • CBTs allow you to incrementally organize data within your user tables in situations where data cannot easily be accumulated in staging areas for pre- ordering before insertions or data loads. CBTs can help you to eliminate or reduce pre-sorting of new table records before a load or insert operation.
  • CBTs save disk space. Unlike indexes or materialized views, CBTs do not replicate the base table data and do not allocate additional data structures.

Selecting organizing key for Netezza Clustered Base Tables (CBT)

Netezza allows you to select one-to-four columns as organizing key. We will not use all four columns as organizing key columns, at most three key columns are used.

  • Review the column and data type of the very large table. You should also verify the type of queries that run against those tables.
  • If you typically run queries on one dimensional i.e. if you use one column, say, date to restrict data then you can load data on date column
  • If you run queries on two dimension i.e. use two columns, say ID and pin/zip to restrict data then CBT can do trick for you to increase performance
  • The organizing key must be columns that can be referenced in zone maps

Reorganizing the Table data and Copy Netezza Clustered Base Tables (CBT)

Netezza reorganize the data on specified columns when you invoke GROOM TABLE command. Groom command reorganizes data in each slice. User can perform SELECT, UPDATE, DELETE and INSERT during grooming operation.

If you copy the CBT’s by using command like CREATE TABLE AS, new table will not get organizing key columns. Instead, you should use ALTER TABLE ORGANIZE ON command to add organizing keys.

Distribution Key