Datastage Activity stages

Reading Time: 2 minutes
Activity stages

The job sequence supports the following types of activity:

Job. Specifies a server or parallel job.

Routine. Specifies a routine. This routine can be any routine in the InfoSphere® DataStage® Repository (but not transforms).

ExecCommand. Specifies an operating system command to execute.

Email Notification. Specifies that an email notification is sent at this point of the sequence (uses SMTP).

Wait-for-file. Waits for a specified file to appear or disappear.

Exception Handler. There can only be one Exception Handler in a job sequence.The Exception Handler is executed if a job in the sequence fails to run (other exceptions are handled by triggers) or if a job aborts and the Automatically handle activities that fail option is set for the sequence. Use a Terminator stage to ensure that the sequence is stopped cleanly if certain situations arise.

Nested Conditions. Allows you to branch the execution of a sequence depending on a condition.

Sequencer. Allows you to synchronize the control flow of multiple activities in a job sequence.

Terminator. Allows you to specify that, if certain situations occur, the jobs being run by a sequence are shut down cleanly.

Start Loop and End Loop. Use these two stages together to implement a For…Next or a For…Each loop within your sequence.

User Variable. Allows you to define variables within a sequence. These variables can then be used later on in the sequence, for example to set job parameters.

Datastage – Grid vs Cluster

Reading Time: < 1 minute
GridCluster
A Grid uses software to auto-gen the APT file to use a subset of resources based on compute node use. A Cluster is a group of Datastage servers where The admin/designer designs the APT file to use the compute node resources available
Under the GRID software, it will look at the utilisation of the compute nodes on the grid and pick which machines are underused and dispatch the job to these nodes. If you have a group of 25 compute nodes, the designer would write his APT file to use 4 compute nodes. If those nodes are busy, Datastage would still try and use those nodes

Join vs Merge vs Lookup – Datastage

Reading Time: < 1 minute
JoinMergeLookup
We can perform four types of Joins in Join Stage. 
Inner Join
Left Outer Join
Right Outer Join
Full Outer Join
We can perform only two types of Joins in Merge Stage.
Inner Join
Left Outer Join
We can perform only two types of Joins in Look Stage.
Inner Join
And Left Outer Join
There will be a N-Inputs ( In the case of Left, Inner, Right Outer Joins)
There will be a 2 Inputs ( In the case of Full Outer Join) 
There will be a N-InputsThere will be a N-Inputs ( In the Case of Normal Stage)
2 Inputs (In the Case of Sparse Look Up ) 
One Output LinkOne Output LinkOne Output Link
No Reject LinksN-1 Reject LinksOne Reject Link
Light Memory UsageLight Memory UsageHeavy Memory usage 
Key Column Names should be Same. That is Primary record should be same with Secondary Records.Key Column Names should be Same. That is Primary record should be same with Secondary Records.Key column names Optional.
It should be same in the case of Sparse Look Up
Primary and secondary records should be in sorted order before this stagePrimary and secondary records should be in sorted order before this stageIn Look Up stage it is Optional. That is all the primary and secondary records no need to be sorted.

Type conversion function in Datastage

Reading Time: 3 minutes
Char(65)Generates an ASCII character from its numeric code value. You can optionally specify the allow8bits argument to convert 8-bit ASCII values
The following example outputs the ASCII code 65 as the character A.
Seq(“A”)Generates a numeric code value from an ASCII character. You can optionally specify the allow8bits argument to convert 8-bit ASCII values.
The following example outputs the character A as the ASCII code 65. 
DateToString(mylink.mydate, “%dd:%mm:%yyyy”)The following example outputs the date contained in the column mylink.mydate to a string with the format dd:mm:yyyy. If mylink.mydate contained the date 18th August, 2009, then the output string would be “18:08:2009”:
StringToDate(mylink.mystring,”%dd:%mm:%yyyy”)If the column mylink.mystring contains the string ″18:08:1958″, then the following function returns the date 1958–08–18. 
DateToDecimal (mylink.basedate, “%dd%mm%yyyy”)If the column mylink.basedate contains the date 2012-08-18, then the following function stores the date as the decimal number 18082012:
DecimalToDate (mylink.basedate, “%dd%mm%yyyy”)If the column mylink.mydecdata contains the value 18082012, then the following function returns the date 2012-08-18:
DecimalToDecimal(mylink.mydec,”ceil”) If the column mylink.mydec contains the decimal number 2.5345, the following function returns the decimal number 00000002.54
DecimalToString(mylink.mydec,”suppress_zero”)If the column mylink.mydec contains the decimal number 00000004.00, the following function returns the string “0000000000000000000000000004.0000000000”.
DecimalToString(mylink.mydec,”fix_zero”)If the column mylink.mydec contains the decimal number 00012344.00, the following function returns the string “12344”.
StringToDecimal(mylink.mystring,”ceil”)If the column mylink.mystring contains the string “19982.2276”, and the target is defined as having a precision of 7 and a scale of 2, then the following function returns the decimal 19983.23.
DecimalToTimestamp(mylink.mytimestampdec, “%hh%nn%ss%yy%mm%dd”)If the column mylink.mytimestampdec contains the decimal value 200658580818, then the following function returns the timestamp 1958–08–18 20:06:58:
IsValid(“int8”,mylink.mystring)If the column mylink.mystring contains the string “1”, then the following function returns the value 1
If the column mylink.mystring contains the string “380096.06”, then the following function returns the value 0.
IsValidDate(mylink.mydate) If the column mylink.mydate contains the date 2011-09-13, then the following function returns the value 1.
If the column mylink.mydate contains the string “380096.06”, then the following function returns the value 0, because the converted string is not a valid date.
IsValidDecimal(mylink.mynum) If the column mylink.mynum contains the value 310007.65, then the following function returns the value 1.
If the column mylink.mynum contains the string “wake-robin”, then the following function returns the value 0, because the converted string is not a valid decimal.
IsValidTime(mylink.mytime)If the column mylink.mytime contains the time 23:09:22, then the following function returns the value 1:
If the column mylink.mydate contains the string “IbnKayeed”, then the following function returns the value 0, because the converted string is not a valid time.
IsValidTimestamp(mylink.mytimestamp)If the column mylink.mytimestamp contains the time 2011-09-13 23:09:22, then the following function returns the value 1:
If the column mylink.mytimestamp contains the string “one of two”, then the following function returns the value 0, because the converted string is not a valid timestamp.
StringToTime(mylink.mystring,”%(h,s):$(n,s):$(s,s)”)

Number functions in Datastage

Reading Time: < 1 minute
AsDouble(56/4.32)In the following expression, the input column mynumber contains an integer, but the function outputs a double. If mylink.mynumber contains the value 56, then the following two functions are equivalent, and return the value 1.29629629629629619E+01
AsFloat(56/4.32)In the following expression, the input column mynumber contains an integer, but the function outputs a float. If mylink.mynumber contains the value 56, then the following two functions are equivalent, and return the value 1.29629629629629619E+01
AsInteger(56/4.32)In the following expression, the input column mynumber contains a double, but the function is output an integer. If mylink.mynumber contains the value 56, then the following two functions are equivalent, and return the value 12

NULL handling functions in Datastage

Reading Time: < 1 minute
If IsNotNull(mylink.mycolumn) Then mylink.mycolumn Else “NULL” If the Derivation field for an output column contained the following code, then the Transformer stage checks if the input column named mylink.mycolumn contains a null value. If the input column does not contain a null, the output column contains the value of the input column. If the input column does contain a null, then the output column contains the string NULL
If IsNull(mylink.mycolumn) Then “NULL” Else mylink.mycolumn If the Derivation field for an output column contained the following code, then the Transformer stage checks if the input column named mylink.mycolumn contains a null value. If the input column contains a null, the output column contains the string NULL. If the input column does not contain a null, then the output column contains the value of the input column
NullToEmpty(mylink.mycolumn)Returns an empty string if the input column is null, otherwise returns the input column value
NullToZero(mylink.mycolumn)Returns zero if the input column is null, otherwise returns the input column value
NullToValue(mylink.mycolumn,42)Returns the specified value if the input column is null, otherwise returns the input column value
setnull()Assigns a null value to the target column

Mathematical functions in Datastage

Reading Time: 2 minutes
Abs(12-34)If mylink.number1 contains the number 12 and mylink.number2 contains the number 34, then the following two functions are equivalent, and return the number 22
Ceil(2355.66)If mylink.number contains the number 2355.66, then the following two functions are equivalent, and return the value 2356
Div(100,25)If mylink.dividend contains the number 100, and mylink.divisor contains the number 25, then the following two functions are equivalent, and return the value 4
Floor(203.25)If mylink.number contains the number 203.25, then the following two functions are equivalent, and return the value 203
Max(6,101)If mylink.number1 contains the number 6, and mylink.number1 contains the number 101, then the following two functions are equivalent, and return the value 101
Min(6,101) If mylink.number1 contains the number 6, and mylink.number1 contains the number 101, then the following two functions are equivalent, and return the value 6
Mod(115,25)If mylink.dividend contains the number 115, and mylink.divisor contains the number 25, then the following two functions are equivalent, and return the value 15
Neg(123)If mylink.number contains the number 123, then the following two functions are equivalent, and return the value -123
Pwr(2,3)If mylink.expression contains the number 2, and mylink.power contains the number 3, then the following two functions are equivalent, and return the value 8
Random()Returns a random number between 0 and 232-1
Sqrt(450)If mylink.number contains the number 450, then the following two functions are equivalent, and return the value 21.2132

Datastage Environment Variables

Reading Time: < 1 minute

Basically Environment variable is a predefined variable that we can use while creating DS job. We create/declare these variables in DS. While designing the job we set the properties for these variables. Environmental variables are also called as Global variables.

There are two types of environment variables:

   1. Local Variables
   2.Environmental variables/Global Variables

Local Variables: only for particular job only
Environment Variables: In any job throughout your project in this some default variables are there and also we can define some user defined variables also.
Creating project specific environment variables- Start up Data Stage Administrator.- Choose the project and click the “Properties” button.- On the General tab click the “Environment…” button.- Click on the “User Defined” folder to see the list of job specific environment variables.

Example: To connect to database you need use id , password and schema.

These are constant throughout the project so they will be created as environment variables.

Use them wherever you want with #Variable#.

By using this if there is any change in password or schema no need to worry about all the jobs. Change it at the level of environment variable that will take care of all the jobs.