Netezza’s AMPP architecture is a two-tiered system designed to handle very large queries from multiple users. The first tier is a high-performance Linux SMP host. A second host is available for fully redundant, dual-host configurations.
The host compiles queries received from client applications and generates query execution plans.
It then divides each query into a sequence of sub-tasks, or snippets, that can be executed in parallel, and distributes the snippets to the second tier for execution. The host assembles the results and returns them to the requesting application.
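The host-to-SPU flow can be sketched in a few lines. This is a minimal illustration, not Netezza's actual API: `run_on_spu`, `execute_query`, and the dict-of-slices representation are all hypothetical, and a thread pool stands in for the physical scatter to SPU nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_spu(spu_id, snippet, local_rows):
    """Each simulated SPU executes the snippet against its own data slice."""
    return [row for row in local_rows if snippet(row)]

def execute_query(snippet, spu_data):
    # spu_data maps spu_id -> that SPU's portion of the table.
    # The "host" scatters the snippet to all SPUs in parallel...
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_on_spu, sid, snippet, rows)
                   for sid, rows in spu_data.items()]
        partials = [f.result() for f in futures]
    # ...then gathers and merges the per-SPU results.
    return [row for part in partials for row in part]

spu_data = {0: [1, 5, 9], 1: [2, 6, 10], 2: [3, 7, 11]}
print(sorted(execute_query(lambda r: r > 5, spu_data)))  # [6, 7, 9, 10, 11]
```

The key point the sketch captures is that each worker touches only its own slice; the host sees only the (much smaller) filtered partials.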
The second tier consists of dozens to hundreds or thousands of Snippet Processing Units (SPUs) operating in parallel.
Each SPU is an intelligent query processing and storage node. It consists of a powerful commodity processor, dedicated memory, a disk drive, and a field-programmable disk controller with hard-wired logic that manages data flows and processes queries at the disk level.
The massively parallel, shared-nothing SPU blades provide the performance advantage of MPP.
Nearly all query processing is done at the SPU level, with each SPU operating on its portion of the database.
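For each SPU to operate independently on its portion of the database, rows must first be spread across the SPUs, typically by hashing a distribution key so the same key always lands on the same node. The sketch below is illustrative only; `spu_for`, `distribute`, and the toy hash are assumptions, not Netezza internals.

```python
NUM_SPUS = 4

def spu_for(key: str) -> int:
    # Toy deterministic hash for illustration; a real system uses a
    # fixed hash function so a given key always maps to one SPU.
    return sum(key.encode()) % NUM_SPUS

def distribute(rows, key_col):
    # Assign each row to exactly one SPU's slice based on its key.
    slices = {i: [] for i in range(NUM_SPUS)}
    for row in rows:
        slices[spu_for(row[key_col])].append(row)
    return slices

rows = [{"cust": c, "amt": a} for c, a in
        [("ann", 10), ("bob", 20), ("ann", 5), ("eve", 7)]]
slices = distribute(rows, "cust")
# Both "ann" rows land on the same SPU, so per-customer work
# can run entirely node-locally.
```

Because the slices are disjoint, a scan, filter, or per-key aggregate never needs to consult another node's disk.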
All operations that lend themselves to parallel processing, including record operations, parsing, filtering, projecting, interlocking, and logging, are performed by the SPU nodes, significantly reducing the amount of data that must be moved within the system.
Operations on sets of intermediate results, such as sorts, joins and aggregates, are executed primarily on the SPUs, but can also be done on the host, depending on the processing cost and complexity of that operation.
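The split between SPU-side and host-side work for an aggregate can be sketched as a two-phase sum: each SPU computes a partial aggregate over its slice, and the host merges the partials. Function names here (`spu_partial_sum`, `host_merge`) are hypothetical, chosen only to mirror the two tiers.

```python
from collections import defaultdict

def spu_partial_sum(local_rows):
    # Phase 1, on each SPU: aggregate only the local slice.
    partial = defaultdict(int)
    for group, value in local_rows:
        partial[group] += value
    return partial

def host_merge(partials):
    # Phase 2, on the host: combine the small per-SPU partials.
    total = defaultdict(int)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)

spu0 = [("east", 10), ("west", 3)]
spu1 = [("east", 4), ("west", 2)]
print(host_merge([spu_partial_sum(spu0), spu_partial_sum(spu1)]))
# {'east': 14, 'west': 5}
```

Only the partial sums cross the network, which is why pushing the first phase down to the SPUs pays off: the host-side merge cost depends on the number of groups, not the number of rows.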