When working with data, it's crucial to define the data you are handling. This definition is achieved by importing or creating table definitions, which can be used in your job designs.
Table definitions serve as the foundation for your DataStage project, specifying the data to be utilized at each stage of a job. These definitions are stored in the repository and are shared among all projects. A minimum of table definitions should be available for each data source and one for each data target in the data warehouse.
When developing a job, you'll usually populate your stages with column definitions from table definitions found in the repository. This is done on the Columns tab of the stage editor. If you select specific options in the Grid Properties dialog box, the Columns tab will display two additional fields: Table Definition Reference and Column Definition Reference. These show the table definition and individual columns that the columns on the tab were derived from.
You can import, create, or edit a table definition using the Designer.
The simplest way to specify a table definition is by directly importing it from the source or target database.
A new table definition will be created, and its properties will automatically fill in with the details of your data source or data target.
Table definitions can be imported from various sources, including:
To import table definitions:
In DataStage, Schemas offer an alternative method for specifying column definitions for data used by parallel jobs.
Schema format:
The following schema format is utilized to read a fixed width file:
//Schema FileΒ is used to read Input data with out specifyinh metadata in the Sequential File stage record{final_delim=end,delim=none}(CUSTOMER_SSN:STRING[11];CUSTOMER_NAME:STRING[30];CUSTOMER_CITY::STRING[40];CUSTOMER_ZIPCODE:STRING[10];)
Now we use the sequential file stage to read data from the datafile using schemafile.
To utilize schema file concept, enable runtime propagation in the job properties.
In the sequential file stage, add the Schema File option and provide the schema file name including path as demonstrated below.