DataStage performance considerations

COLUMNS and TYPE CONVERSIONS
Remove unneeded columns as early as possible within the job flow
When reading from databases, use a select list to read only the required columns rather than the entire table
Avoid propagating unnecessary metadata between stages. Use the Modify stage to drop it
Avoid forcing DataStage to perform unnecessary type conversions in transformations, as each conversion costs time and resources
Use the Copy stage instead of the Transformer stage if you are only doing a simple copy. Use the Transformer only when the data actually needs to be transformed
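The select-list tip can be shown outside DataStage with a small Python sketch. It uses the standard sqlite3 module and a hypothetical customers table (the table, its columns, and its rows are assumptions made up for the illustration, not part of any real job). It only shows the difference in what is read, not DataStage behaviour itself.

```python
import sqlite3

# Hypothetical in-memory table standing in for a wide source table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, name TEXT, address TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "Ann", "1 Main St", "x" * 1000), (2, "Bob", "2 Oak Ave", "y" * 1000)],
)

# Reads every column, including the large ones the job never uses.
all_columns = conn.execute("SELECT * FROM customers ORDER BY id").fetchall()

# Select list: reads only the columns the downstream stages need.
needed_columns = conn.execute(
    "SELECT id, name FROM customers ORDER BY id"
).fetchall()

print(len(all_columns[0]), "columns per row with SELECT *")
print(len(needed_columns[0]), "columns per row with a select list")
```

The second query moves far less data per row, which is the same effect the tip aims for when a job reads from a database.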
TRANSFORMER
Minimize the number of stage variables in a Transformer stage
If a Transformer contains complex code that requires a lot of resources, move it into a separate job.
SORTING
If the database data is not already sorted, sort it while reading from the database rather than sorting it afterwards from the input file.
If data has already been partitioned and sorted on a set of key columns, specify the "don't sort, previously sorted" option for the key columns in the Sort stage
The performance of individual sorts can be improved by increasing the memory usage per partition using the Restrict Memory Usage (MB) option of the Sort stage
The stable sort option is much more expensive than a non-stable sort and should be used only when row order must be maintained
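What "maintaining row order" means can be seen in a short Python sketch. Python's built-in sort is always stable, so this only illustrates the semantics of a stable sort and says nothing about its cost in DataStage. The sample rows are made up for the example.

```python
# Each row is (key, arrival_order). The key is the sort column.
rows = [("B", 1), ("A", 2), ("B", 3), ("A", 4)]

# A stable sort keeps rows with equal keys in their original input order.
stable = sorted(rows, key=lambda row: row[0])

print(stable)
# [('A', 2), ('A', 4), ('B', 1), ('B', 3)]
```

A non-stable sort would be free to return ('A', 4) before ('A', 2). If nothing downstream depends on that relative order, the stable option is not needed.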
SEQUENTIAL FILES
Avoid Sequential File stages for intermediate storage between jobs. Use Data Set stages instead.
Make sure unnecessary column propagation is not done
Use the Join stage instead of the Lookup stage when the reference data is large
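The reasoning behind the Join tip can be sketched in Python under some assumptions: a lookup is modelled as holding the whole reference set in memory, and a join as reading two inputs that are already sorted on the key side by side. The function names and sample data are hypothetical, and the sketch assumes unique keys on the reference side. It is not how DataStage implements either stage.

```python
def lookup_join(stream, reference):
    """Lookup style: hold the whole reference set in memory, probe per row."""
    ref = dict(reference)  # memory grows with the size of the reference data
    return [(k, v, ref[k]) for k, v in stream if k in ref]


def merge_join(stream, reference):
    """Join style: both inputs are sorted on the key and read side by side."""
    out = []
    ref_iter = iter(reference)
    ref_row = next(ref_iter, None)
    for k, v in stream:
        # Advance the reference side until its key catches up with the stream key.
        while ref_row is not None and ref_row[0] < k:
            ref_row = next(ref_iter, None)
        if ref_row is not None and ref_row[0] == k:
            out.append((k, v, ref_row[1]))
    return out


# Both inputs sorted on the key; reference keys are unique (assumption).
stream = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
reference = [(1, "one"), (2, "two"), (3, "three")]

print(lookup_join(stream, reference))
print(merge_join(stream, reference))
```

Both functions return the same matched rows, but the merge version keeps only one reference row in memory at a time, which is why a sorted join tends to suit large reference inputs.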