Monday, December 19, 2011

Enabling connectivity from Infosphere DataStage to Lotus Notes database using ODBC

IBM Lotus Notes is an integrated desktop client for accessing email and other applications on an IBM Lotus Domino server. Important information related to businesses is managed in Lotus Notes databases these days, hence many customers feel the need to process this data for business analytics. Infosphere DataStage supports connectivity to Lotus Notes Databases or Domino databases through the ODBC interface.

Lotus Notes ODBC driver is only available for windows platform, hence Infosphere DataStage installed on windows platform only can be used to connect to Lotus Notes database using ODBC interface.  If your Infosphere DataStage is on a Unix machine, you can use web services hosted on the Domino server to get data into DataStage. There is a step by step guide on the same available on IBM developerworks.
[Tools like Attunity/Iway could also possibly be used to to establish connectivity. Exploring them in later posts]

The procedure assumes Infosphere DataStage version 8.5 and Lotus Notes ODBC driver version 8.5 and is  described for ODBC Enterprise Stage. Other stages like ODBC Connector can also be used, but may have some problems. This procedure assumes that the user configuring has Administrative privileges.

Configure Lotus Notes ODBC Driver on DataStage server


  • Download and install Lotus Notes client which is a pre-requisite for the Lotus Notes ODBC driver
  • Download and install Lotus Notes ODBC driver, also known as Lotus NotesSQL driver.
  • Configure Lotus NoteSQL Authentication List Manager with the id that you want to use.

  • Also connect to the database you want to use, using Lotus Notes client , to confirm connectivity. This step seems unnecessary, but it seems that some files etc are created/updated when the Lotus client first connects to a database.

Configure DataStage server to interact with Desktop

  • Open DataStage dsrpc service in AdministrativeTools->Services and set Allow service to interact with desktop.


  • Also define environment variable DSE_ODBC_NOTESQL_BEHAVIOR and set it to 1.
  • Restart DataStage
  • Create a System DSN for the Lotus Notes ODBC driver. 
  • Create a ODBC Enterprise job with the DSN created. Set user and password to the lotus notes id and password configured in the Lotus NotesSQL Authentication List Manager.


    Trouble Shooting Tips

    • If the job fails with error "Driver's SQLAllocHandle on SQL_HANDLE_DBC failed."
      Locate notes.ini file on your system and give 'Everyone' user permission to read it.

      • For other errors , refer to Notes SQL Help at Lotus Applications->NotesSQL->NotesSQL Help
      • This procedure is in no way complete. Although it works for most cases, you may still get different errors related to security and permission depending on windows security and user privileges etc. For such errors please refer to microsoft help for resolution.

      Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The methods described here are my own and not necessarily supported by IBM. 
      IBM, Infosphere and DataStage are registered trademarks of IBM corporation.


      Monday, December 12, 2011

      Enabling connectivity to Ingres from IBM Infosphere DataStage

      Ingres is an open source RDBMS provided by Actian Corporation. Infosphere DataStage connectivity to Ingres database is provided through the ODBC interface. InfoSphere DataStage has comprehensive support for ODBC through the ODBC Stage on server canvas as well as ODBC Enterprise Stage and ODBC Connector Stage on parallel canvas.

      Following are the steps required to connect to Ingres from Infosphere DataStage. The example is for AIX platform using ODBC Enterprise Stage.

      1. Install Ingres
      2. In your Infosphere DataStage job, set the following environment variables II_SYSTEM, PATH and LIBPATH to include Ingres.
             II_SYSTEM=/opt/Ingres/IngresII
         PATH=$II_SYSTEM/ingres/bin:$II_SYSTEM/ingres/utility:$PATH
         LIBPATH=/opt/Ingres/IngresII/ingres/lib:/opt/Ingres/IngresII/ingres/lib/lp64:$LIBPATH

      3. Link libiiodbcdriver.1.a to  libiiodbcdriver.1.so
              ln -s  libiiodbcdriver.1.a libiiodbcdriver.1.so
      [Note: Infosphere DataStage uses DataDirect now Progress Software provided driver manager which looks for *.so libraries, hence the need to link]

      4. Edit the .odbc.ini to add a DSN entry as follows ( Remember to set DriverUnicodeType to 1 ):
      [TEST]
      Driver=/opt/Ingres/IngresII/ingres/lib/libiiodbcdriver.1.so
      Description=
      Vendor=Ingres Corporation
      DriverType=Ingres
      DriverUnicodeType=1
      Server=<hostname>
      Database=<database name>
      ServerType=Ingres
      PromptUIDPWD=N
      WithOption=
      RoleName=
      RolePWD=
      DisableCatUnderscore=N
      AllowProcedureUpdate=N
      UseSysTables=N
      BlankDate=N
      Date1582=N
      CatConnect=N
      Numeric_overflow=N
      SupportIIDECIMAL=N
      CatSchemaNULL=N
      ReadOnly=N
      SelectLoops=N
      ConvertThreePartNames=N
      Group=


      4. Create a  parallel job with ODBC Enterprise Stage. Enter the DSN , user and password. The user and password to use is the Ingres Administrator who owns the instance. This is set during installation. Usually defaults to ingres.

      Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The methods described here are my own and not necessarily supported by IBM. 
      IBM, Infosphere and DataStage are registered trademarks of IBM corporation.

      Monday, December 5, 2011

      Bulk loading data to remote Sybase IQ 15.X Server using IBM Infosphere DataStage

      Prior to Sybase IQ version 15.X the bulk loading of data was restricted to local systems because of  the mechanism of the LOAD TABLE command.

      With the release of Sybase IQ 15, a new feature client side load has been introduced which allows LOAD TABLE command to pull data from remote data files.

      Support for this feature has been built into the Sybase Enterprise Stage and Sybase IQ 12 Bulk Load stage to provide users of Infosphere DataStage the capability to load data from DataStage to remote Sybase IQ servers. While Sybase IQ 12 Bulk Load Stage requires minimal configuration to setup this , Sybase Enterprise Stage requires quite a few steps to enable bulk loading to remote Sybase IQ server. The advantage of Sybase Enterprise stage however, is that it uses FIFOs as data file, hence it is much faster than the Sybase IQ 12 Bulk Load stage and hence worth the configuration required.

      Enabling bulk loading to remote Sybase IQ Server 15.X using Sybase Enterprise Stage


      Pre-requisites to using Sybase Enterprise Stage to load data into table on remote Sybase IQ server.

      1. Check that the Sybase IQ server version is 15.1 ESD1 or above : The capability to allow load table from remote data files which are FIFOs is only available from Sybase IQ 15.1 ESD1 onwards. As Sybase Enterprise Stage uses data files which are FIFOs the functionality can only be supported when connecting to Sybase IQ 15.1 ESD1

      2. Install Sybase Open Client SDK 15.5 on Infosphere DataStage engine tier and configure the $SYBASE/interfaces file to add the remote Sybase IQ server ipaddress and port number : The Sybase Enterprise Stage is based on Sybase Open Client API's.However the remote client load feature is only enabled for ODBC interface to Sybase IQ. Hence the Sybase Enterprise Stage converts the supplied Sybase IQ server name to equivalent ODBC Connection String. For this it gets the CS_CONNECTED_ADDR option for the connection using ct_con_props.

      3. Install the Sybase IQ network client 15.1 ESD1 : This software installs the Sybase IQ ODBC driver and the dbisql utility, which supports the new client side load feature.

      Enable remote client load feature on Sybase IQ 15.X server

      Once the pre-requistes are met, the following needs to be done to enable the remote client
      load feature on Sybase IQ server.

      1. Enable the allow_read_client_file and allow_write_client_file options of the Sybase IQ 15.X server. Set these options once on every IQ server. Enable the allow_read_client_file server option property using the isql or dbisql utility:

         set option allow_read_client_file=on
         set option allow_write_client_file=on   
         GRANT READCLIENTFILE TO <group | user>

      Configure Sybase Enterprise Stage to perform bulk load to remote Sybase IQ 15.X server

      1.Use the script $DSHOME/../DSComponents/etc/install.sybase to change current Sybase Enterprise library version to 15. By default the Sybase Enterprise library version is 12.

      2. On the DataStage Server, define a environment variable APT_SYBASE_REMOTE_LOAD and set it to When this environment variable is set, the Sybase Enterprise Stage will try to load the given table using the new syntax 'LOAD TABLE <mytable> USING CLIENT FILE'.

      3.On the DataStage Server, set the environment variables to include Sybase IQ
      SYBASE=<Path to SDK 15.5>
      SYBASE_OCS=OCS-15_0
      PATH=$SYBASE/$SYBASE_OCS/bin:<Sybase IQ network client>/IQ-15_2/bin
      LIBPATH=$SYBASE/$SYBASE_OCS/lib:<Sybase IQ network client>/IQ-15_2/lib

      Now the Sybase Enterprise Stage is ready to load data into the remote Sybase IQ server.

      4. Create a Parallel job with Sybase Enterprise Stage as target stage . Set write Method=IQ Write.

      Enabling bulk loading to remote Sybase IQ Server 15.X using Sybase IQ 12 Bulk Load Stage

      The Sybase IQ 12 Bulk Load Stage uses flat files as data files. Hence it is able to perform bulk load to remote Sybase IQ server from version 15.0 onwards.



      1.To perform bulk load to remote Sybase IQ server, create a job with Sybase IQ 12 Bulk Load Stage as target stage and select Load Method=AutoLoad via ODBC. The client side load feature is only available for ODBC interface to Sybase IQ, hence only when Load Method is set to AutoLoad via ODBC, bulk load to remote Sybase IQ server is possible.

      2. Install  Sybase IQ network client 15.X on Infosphere DataStage and configure a ODBC DSN for the remote Sybase IQ 15.X server. The Sybase IQ 12 Bulk Load Stage checks both the ODBC driver version as well as the Sybase IQ server version it connects to before using the new client side load syntax.
      e.g DSN for AIX 64 is given below

      [iqdemo]
      Driver=/home/sybase/IQ15.1/IQ-15_1/lib64/libdbodbc11.so
      Description=SybaseIQ
      EngineName=<hostname>_iqdemo
      DatabaseName=iqdemo
      CommLinks=TCPIP{HOST=<ipaddress>:<port>;}
      UID=
      PWD=
      AutoStop=no
      AutoPreCommit=Y
      DriverUnicodeType=1


      3.The client side load feature must be enabled on the Sybase IQ server.  If the feature is disabled, Sybase IQ 12 Bulk Load will detect it and revert to normal LOAD TABLE command.
      To enable the allow_read_client_file server option property use the isql or dbisql utility:

         set option allow_read_client_file=on
         set option allow_write_client_file=on   
         GRANT READCLIENTFILE TO <group | user>






      Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The methods described here are my own and not necessarily supported by IBM. 
      IBM, Infosphere and DataStage are registered trademarks of IBM corporation.

      Tuesday, November 29, 2011

      Enabling bulk load to remote IQ server 12.X using IBM Infosphere DataStage


      IBM Infosphere DataStage provides rich connectivity to Sybase® IQ database server through Sybase Enterprise stage and Sybase IQ Bulk Load stage. But it can be a problem to bulk load data from Infosphere DataStage to Sybase® IQ version 12.X , specially if the Sybase® IQ server is remote to Infosphere DataStage server.

      Sybase® IQ version 12.X uses LOAD TABLE command for bulk loading of data.  The data file supplied to the LOAD TABLE command should be accessible from the Sybase® IQ 12.X server on which the command runs. When this command is executed remotely, the data file is expected to be present on the Sybase® IQ 12.X server for which the load is issued.
      Note: Sybase® IQ version 15 and later has extended LOAD TABLE capability which does not have this restriction.More on this in later posts

      A popular way of loading to a remote Sybase® IQ server from Infosphere DataStage is to use the Sybase IQ 12 Bulk Loader Stage and a NFS shared partition between the Infosphere DataStage server and Sybase® IQ server. This stage exports the data to a flat file and NFS sharing the flat file between the Infosphere DataStage server and the remote Sybase® IQ server makes the data file locally available to the remote IQ server. Although this method works it has some drawbacks in terms of  performance as the exporting of data to a flat file and then loading the table is sequential and done one after the another. Also, the reading of NFS shared file is subject to performance and network issues.

      Sybase Enterprise stage exports data and bulk loads the table parallelly . However it uses a FIFO as the data file. As the data file is a FIFO , it is not possible to share it using NFS.
      It is possible to bulk load to a remote Sybase® IQ server using Sybase Enterprise stage if the remote Sybase® IQ server is installed on the same platform as the Infosphere DataStage server and hence can be made part of a node pool on the Infosphere DataStage server. 
      When this node pool is added as a constraint on the Sybase Enterprise stage, the Sybase Enterprise stage runs on the remote Sybase® IQ server and hence the datafile gets created such that it is accessible to the Sybase® IQ server. 

      Following is a step by step method to enable the Sybase Enterprise Stage to bulk load to remote Sybase® IQ server. The method assumes prior knowledge of Infosphere DataStage configuration and some Sybase IQ administration.


      • Install Infosphere DataStage components on remote Sybase® IQ server and configure rsh etc as specified in InfoSphere DataStage documentation to make a grid environment for Infosphere DataStage.
      • Install Sybase SDK on Infosphere DataStage server and configure the Sybase interfaces file to access remote Sybase® IQ server. [Note]: The Sybase SDK path on Infosphere DataStage server should be the same as that on the remote Sybase® IQ server. e.g if remote Sybase® IQ server has sybase installed in /home/sybase/iq, then on Infosphere DataStage server also the Sybase SDK should be installed in the same path.
      • Create a new configuration file(APT_CONFIG_FILE ) on Infosphere DataStage and add the remote Sybase® IQ server as one of the nodes. Add that node to a node pool called "sybase", e.g
      { 
      node "DataStage_Server"
        {
          fastname "DataStage_Server"
          pool "" "b"
          resource scratchdisk "/home/kartgarg/sand751/scratch" { }
          resource disk "/home/kartgarg/sand751/pds_files/node2" 
          { pool "" "export" "node2" "node2a" "nodeb" "sasdataset" }
          resource disk "/home/kartgarg/sand751/pds_files/node2" 
          { pool "" "export" "node2" "node2b" "nodeb" "sasdataset" }
          resource sas "" { }
        }
        node "DataStage_Server_node2"
        {
          fastname "Remote_IQ_Server"
          pool "" "b" "sybase"
          resource scratchdisk "/home/kartgarg/sand751/scratch" { }
          resource disk "/home/kartgarg/sand751/pds_files/node3" 
          { pool "" "export" "node3" "node3a" "nodeb" "sasdataset" }
          resource disk "/home/kartgarg/sand751/pds_files/node3" 
          { pool "" "export" "node3" "node3b" "nodeb" "sasdataset" }
          resource sas "" { }
        }
      }
      
      • On the job , open Sybase Enterprise stage , set Write Method to IQ write and then select the Stage tab. Under node pool and resource contraint , select node pool "sybase". 
      •  Compile and run the job.

        Points to note

        •  The method is only applicable if Sybase® IQ server and Infosphere DataStage server are installed on the same platform.
        • The Sybase® IQ server should reside in its own node pool and that pool should have one and only one server, i.e the IQ server.
        • The Sybase® IQ server should be accessible from Information Server through ct-library. Hence Sybase® SDK should be installed on the Information Server. The interfaces file should have an entry for the remote Sybase® IQ server.
        • Using a node map constraint instead of a node pool will not work as node map constraint are not handled by the Sybase Enterprise stage. 

        Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The methods described here are my own and not necessarily supported by IBM. 
        IBM, Infosphere and DataStage are registered trademarks of IBM corporation.