IBM Infosphere DataStage provides rich connectivity to Sybase® IQ database server through Sybase Enterprise stage and Sybase IQ Bulk Load stage. But it can be a problem to bulk load data from Infosphere DataStage to Sybase® IQ version 12.X , specially if the Sybase® IQ server is remote to Infosphere DataStage server.
Sybase® IQ version 12.X uses LOAD TABLE command for bulk loading of data. The data file supplied to the LOAD TABLE command should be accessible from the Sybase® IQ 12.X server on which the command runs. When this command is executed remotely, the data file is expected to be present on the Sybase® IQ 12.X server for which the load is issued.
Note: Sybase® IQ version 15 and later has extended LOAD TABLE capability which does not have this restriction.More on this in later posts
A popular way of loading to a remote Sybase® IQ server from Infosphere DataStage is to use the Sybase IQ 12 Bulk Loader Stage and a NFS shared partition between the Infosphere DataStage server and Sybase® IQ server. This stage exports the data to a flat file and NFS sharing the flat file between the Infosphere DataStage server and the remote Sybase® IQ server makes the data file locally available to the remote IQ server. Although this method works it has some drawbacks in terms of performance as the exporting of data to a flat file and then loading the table is sequential and done one after the another. Also, the reading of NFS shared file is subject to performance and network issues.
Sybase Enterprise stage exports data and bulk loads the table parallelly . However it uses a FIFO as the data file. As the data file is a FIFO , it is not possible to share it using NFS.
It is possible to bulk load to a remote Sybase® IQ server using Sybase Enterprise stage if the remote Sybase® IQ server is installed on the same platform as the Infosphere DataStage server and hence can be made part of a node pool on the Infosphere DataStage server.
When this node pool is added as a constraint on the Sybase Enterprise stage, the Sybase Enterprise stage runs on the remote Sybase® IQ server and hence the datafile gets created such that it is accessible to the Sybase® IQ server.
Following is a step by step method to enable the Sybase Enterprise Stage to bulk load to remote Sybase® IQ server. The method assumes prior knowledge of Infosphere DataStage configuration and some Sybase IQ administration.
- Install Infosphere DataStage components on remote Sybase® IQ server and configure rsh etc as specified in InfoSphere DataStage documentation to make a grid environment for Infosphere DataStage.
- Install Sybase SDK on Infosphere DataStage server and configure the Sybase interfaces file to access remote Sybase® IQ server. [Note]: The Sybase SDK path on Infosphere DataStage server should be the same as that on the remote Sybase® IQ server. e.g if remote Sybase® IQ server has sybase installed in /home/sybase/iq, then on Infosphere DataStage server also the Sybase SDK should be installed in the same path.
- Create a new configuration file(APT_CONFIG_FILE ) on Infosphere DataStage and add the remote Sybase® IQ server as one of the nodes. Add that node to a node pool called "sybase", e.g
{
node "DataStage_Server" { fastname "DataStage_Server" pool "" "b" resource scratchdisk "/home/kartgarg/sand751/scratch" { } resource disk "/home/kartgarg/sand751/pds_files/node2" { pool "" "export" "node2" "node2a" "nodeb" "sasdataset" } resource disk "/home/kartgarg/sand751/pds_files/node2" { pool "" "export" "node2" "node2b" "nodeb" "sasdataset" } resource sas "" { } } node "DataStage_Server_node2" { fastname "Remote_IQ_Server" pool "" "b" "sybase" resource scratchdisk "/home/kartgarg/sand751/scratch" { } resource disk "/home/kartgarg/sand751/pds_files/node3" { pool "" "export" "node3" "node3a" "nodeb" "sasdataset" } resource disk "/home/kartgarg/sand751/pds_files/node3" { pool "" "export" "node3" "node3b" "nodeb" "sasdataset" } resource sas "" { } } }
- On the job , open Sybase Enterprise stage , set Write Method to IQ write and then select the Stage tab. Under node pool and resource contraint , select node pool "sybase".
- Compile and run the job.
Points to note
- The method is only applicable if Sybase® IQ server and Infosphere DataStage server are installed on the same platform.
- The Sybase® IQ server should reside in its own node pool and that pool should have one and only one server, i.e the IQ server.
- The Sybase® IQ server should be accessible from Information Server through ct-library. Hence Sybase® SDK should be installed on the Information Server. The interfaces file should have an entry for the remote Sybase® IQ server.
- Using a node map constraint instead of a node pool will not work as node map constraint are not handled by the Sybase Enterprise stage.