Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool that lets the user integrate various data sources and targets in an enterprise environment. Data sources and targets can be database tables, flat files, data sets, CSV files, etc. The basic design paradigm is built around a unit of work called a DataStage job. Multiple jobs can be controlled and conditionally sequenced using 'Sequences'.

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.

InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for IBM InfoSphere BigInsights, Cloudera, Apache and Hortonworks Hadoop Distributed File System (HDFS).
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Services (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused over the enterprise.

  • Can simultaneously support high-speed, high reliability requirements of transactional processing and the large volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

609 questions
0
votes
0 answers

How to catch or skip error lines in DataStage when inserting lines in ODBC

I'm using DataStage 11.7 and want to insert lines into my ODBC table. While inserting, I want to skip or catch any line that fails, so that my flow won't go down and will continue inserting even if…
aze
  • 39
  • 9
0
votes
1 answer

How to remove double quotes in DataStage using a Transformer stage?

We are receiving input data like “VENKATA,KRISHNA” and I want output like VENKATA,KRISHNA. Can anyone help me with this?
Krishna
  • 25
  • 4
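
The transformation asked for in the question above can be illustrated outside DataStage. This is a minimal Python sketch, not a Transformer stage expression: the function name `strip_double_quotes` is hypothetical, and it assumes that both straight and curly double quotes should be dropped while everything else is kept.

```python
def strip_double_quotes(value: str) -> str:
    """Remove straight and curly double quotes from a string.

    Assumption: only the characters ", \u201c and \u201d count as double quotes.
    """
    quotes = '"\u201c\u201d'
    # Map each quote character to None so str.translate deletes it.
    return value.translate({ord(ch): None for ch in quotes})


if __name__ == "__main__":
    print(strip_double_quotes("\u201cVENKATA,KRISHNA\u201d"))
```

Inside a job the same idea would be expressed with whatever string function the stage offers; the sketch only pins down the expected input and output.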
0
votes
1 answer

Call AWS POST API with AWS Signature Version 4

Has anyone already called an AWS REST API with AWS Signature Version 4? I'm not sure how to generate this using the Hierarchical stage when calling the REST API. Thank you.
Gius
  • 21
  • 1
  • 7
0
votes
1 answer

DataStage column mapping logic based on column value (or any SQL function to do the same)

I'm trying to find a solution in DataStage (or in SQL), without having to use a bunch of if/else conditions, where I can map the value of one column based on the value of another column. Example - Source File…
0
votes
2 answers

DataStage BASIC - How to suppress "previously undefined" warning message in transform function

I have a DataStage routine transform function that does something a bit more complicated than the following (it takes in Arg1, which can be null): Ans = Len(Arg1). In certain situations I am calling this transform function in a Transformer stage and for…
David Rogers
  • 2,601
  • 4
  • 39
  • 84
0
votes
1 answer

How to give a job parameters previously calculated by another job?

I'm working with DataStage 11.3 with parallel jobs. My input is a date and I use the "DateOffsetByComponents" function in a Transformer stage to obtain 4 dates with different rules; these 4 results end up in different Sequential File stages. The…
PinkAndi
  • 1
  • 1
0
votes
1 answer

Is there any way to get the DataStage job's path from the command line?

I have gone through all the dsjob commands and their output to see if any command returns the DataStage job's path, but had no luck. If the DataStage job is in /Jobs/Oracle/Export, I would like to get the path name by…
Naveen Reddy CH
  • 799
  • 2
  • 13
  • 23
0
votes
1 answer

DataStage: Split string by every tenth delimiter

I have a parameter which is a long comma-delimited string (~1500 values), like 1,2,3,4,5, etc. This parameter is passed to a parallel job for processing, 1 value at a time. I need to pass not 1 value but 10 at once. I tried with the help of…
r1cken
  • 1
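
The grouping described in the question above (10 values per batch instead of 1) can be illustrated outside DataStage. This is a minimal Python sketch under stated assumptions: the function name `chunk_delimited` and its parameters are hypothetical, the values contain no embedded commas, and a final batch shorter than 10 is acceptable.

```python
from typing import List


def chunk_delimited(values: str, size: int = 10, sep: str = ",") -> List[str]:
    """Split a delimited string into sub-strings holding `size` values each.

    The last chunk may hold fewer than `size` values.
    """
    if not values:
        return []
    items = values.split(sep)
    # Step through the items `size` at a time and re-join each slice.
    return [sep.join(items[i:i + size]) for i in range(0, len(items), size)]


if __name__ == "__main__":
    sample = ",".join(str(n) for n in range(1, 26))
    for chunk in chunk_delimited(sample):
        print(chunk)
```

Each returned string could then be handed to the job as one parameter value, so a list of about 1500 values would need about 150 invocations rather than 1500.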
0
votes
1 answer

How to connect to Microsoft SQL Server (IaaS) using an API from IBM DataStage 11.7.1.2

We are trying to connect to Microsoft SQL Server installed in an Azure VM (IaaS) from DataStage using an API. Currently, we are using the JDBC connector to connect to Microsoft SQL Server (IaaS) with a service account and its password. But, on a new…
Mohan
  • 21
  • 1
0
votes
1 answer

How can I use cpdctl dsjob for DaaS (DataStage as a Service)?

I would like to execute "cpdctl dsjob" as explained in the document at the URL below. https://dataplatform.cloud.ibm.com/docs/content/dstage/dsnav/topics/cli.html I downloaded cpdctl from the URL below but cannot use dsjob…
0
votes
1 answer

Teradata Connection Error in IBM DataStage

Whenever I run the DS job I get the following error: "Error loading connector library cctera12.dll. The specified module could not be found. (CC_LoadedConnector::loadLibrary, file CC_ConnectorFactory.cpp, line 1,536)" But cctera12.dll this library…
SMValiAli
  • 85
  • 1
  • 6
0
votes
1 answer

Which connector in DataStage can be used to connect to AWS Oracle RDS?

Our on-premises database is migrating to Oracle on AWS RDS. I am struggling to find which DataStage connector can be used.
keshav
  • 1
0
votes
1 answer

How to identify and convert a series of numbers in DataStage?

I need to identify the values in a field that are a series of 9s and greater than or equal to 99999, i.e. at least the first 5 bytes of the field are 9s, and convert them to 0s in DataStage. Here are some examples to explain the situation better: 999.00 - Don't…
0
votes
1 answer

Update table in Snowflake from DataStage taking a long time to complete

I am updating a table from a DataStage job with write mode "Update". The job runs for a long time, taking approx. 10 hours to complete. I have 4,694,233 records in the table and am trying to load approx. 15k records from a data set. Not sure why it is taking that…
0
votes
1 answer

How could I set yesterday's date as the default value for a numeric parameter in DataStage?

I have a numeric datatype parameter in DataStage. *parameter name: VAR_ETL_DATE *format: YYYYMMDD *ex: 20210612 How could I set yesterday's date as the default value for this parameter? Format: YYYYMMDD (numeric) *Example: Today: 20210824 *When I run a…
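
The value this question is after (yesterday's date as a numeric YYYYMMDD) can be computed outside DataStage, for example in a wrapper script that starts the job. This is a minimal Python sketch under that assumption: the function name `yesterday_yyyymmdd` is hypothetical, and how the result is passed to the VAR_ETL_DATE parameter is left open.

```python
from datetime import date, timedelta
from typing import Optional


def yesterday_yyyymmdd(today: Optional[date] = None) -> int:
    """Return the day before `today` as a YYYYMMDD integer.

    If `today` is omitted, the current system date is used.
    """
    if today is None:
        today = date.today()
    # Subtracting one day handles month and year boundaries correctly.
    return int((today - timedelta(days=1)).strftime("%Y%m%d"))


if __name__ == "__main__":
    print(yesterday_yyyymmdd())
```

Using real date arithmetic rather than subtracting 1 from the number avoids invalid results such as 20210300 on the first day of a month.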