The Filter stage is a processing stage. It can have a single input link and a any number of output links and, optionally, a single reject link.
Depending upon the requirements input records are filtered out unmodified. this stage is use to filter the input data according to the required condition.
All records which are not filtered out by ‘Where clause’ will get routed to the reject link.
When quoting in the filter, you should use single, not double, inverted commas.
Predicates category
Use the Properties tab to specify how the Filter stage operates on the Predicates category.
Where clause
Specify a Where statement that a row must satisfy in order to be routed down this link. This is like an SQL Where clause.
The operation of the filter stage is governed by the expressions you set in the Where property on the Properties tab. You can use the following elements to specify the expressions:
Input columns.
Requirements involving the contents of the input columns.
Optional constants to be used in comparisons.
The Boolean operators AND and OR to combine requirements.
When a record meets the requirements, it is written unchanged to the specified output link. The Where property supports standard SQL expressions, except when comparing strings.
Filter stage: Options category
Output rejects
Set this to true to output rows that satisfy no Where clauses down the reject link (remember to specify which link is the reject link on the parallel job canvas).
Output rows only once
Set this to true to specify that rows are only output down the link of the first Where clause they satisfy. Set to false to have rows output down the links of all Where clauses that they satisfy.
String comparison
InfoSphere DataStage sorts string values according to these general rules:
1. Characters are sorted in lexicographic order.
2. Strings are evaluated by their ASCII value.
3. Sorting is case sensitive, that is, uppercase letters appear before lowercase letter in sorted data.
4. Null characters appear before non-null characters in a sorted data set, unless you specify the nulls last option.
5. Byte-for-byte comparison is performed.
Suppose we have below records
ID NAME
1 Russ
2 Joe
Step 1: Design job which contain stages
sequential file -> filter -> sequential file
Step 2: Consider we need to filter out input data on below conditions.
Name=Russ
Set below property
Predicate:
Where clause = Name = “Russ”
Step 3: Compile and Run the job.
You will see below output:
ID NAME
1 Russ
What is the difference between filter and external filter stage?
The External Filter stage is a processing stage. It can have a single input link and a single output link.
The External Filter stage allows you to specify a UNIX command that acts as a filter on the data you are processing.
This stage is having only one option i.e. Filter command
Filter Stage in DataStage
Reading Time: 3 minutes