In order to use a JDBC driver, you need to download the JDBC and set up the configuration files (see here). In this post, we will discuss how to add Redshift JDBC driver to DataStage server and configure it.
(1) Download Redshift JDBC driver from here. In my experience, the latest version (4.2) was not compatible. I had to use the older version (4.1).
(2) It is recommended to install the driver to the tmp folder first and move to the 3rd-party-drivers folder. In this example, I set the installation path as /opt/IBM/InformationServer/Server/DSEngine/. You need to check your installation path.
mv /tmp/RedshiftJDBC42-220.127.116.111.jar /opt/IBM/InformationServer/Server/DSEngine/3rd-party-drivers/
(3) Once the driver is moved to the 3rd-party-drivers folder, you need to edit the isjdbc.config file in the DSEngine folder. The config file requires two parameters, ClASSPATH and CLASSNAME. Both can be obtained in the Redshift JBDC driver download page here.
CLASSPATH: Add the path to the JBDC jar file as in ‘/opt/IBN/InformationServer/Server/DSEngine/3rd-party-drivers/RedshiftJDBC42-18.104.22.1681.jar’
CLASSNAME: Add the class name as in the documentation like ‘com.amazon.redshift.jdbc’
To insert or update records, make sure to tick auto-commit in the JDBC stage. If this option is not ticked, it will lock the table and the job hangs.
Inserting or updating records in Redshift through JDBC (or even ODBC) driver with DataStage is extremely slow. To achieve fast load, you need to build the jobs that do the best Redshift practice (Loading Data into Redshift), which is not an easy task. DataStage is not very compatible with the AWS ecosystem. On the other hand, Informatica provides the excellent Redshift connector that works like a charm.