Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool that lets users integrate various data sources and targets in an enterprise environment.

Data sources and targets can be database tables, flat files, data sets, CSV files, and so on. The basic design paradigm is built around a unit of work called a DataStage job. Multiple jobs can be controlled and conditionally sequenced using 'Sequences'.
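The idea of conditionally sequencing jobs can be illustrated with a small conceptual sketch. It is not DataStage code and uses no DataStage API: the `run_sequence` helper and the `extract`, `transform`, and `load` functions are hypothetical stand-ins, and the only rule modeled is "run the next job only if the previous one succeeded".

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a job: a callable that returns True on success.
Job = Callable[[], bool]


def run_sequence(steps: List[Tuple[str, Job]]) -> List[str]:
    """Run jobs in order and stop at the first failure."""
    finished = []
    for name, job in steps:
        if not job():
            break  # downstream jobs are skipped when a job fails
        finished.append(name)
    return finished


def extract() -> bool:
    return True


def transform() -> bool:
    return False  # simulate an aborted job


def load() -> bool:
    return True


completed = run_sequence(
    [("extract", extract), ("transform", transform), ("load", load)]
)
print(completed)
```

Here `load` never runs because `transform` reports a failure, which is the kind of conditional control a sequence provides over individual jobs.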

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.


InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for IBM InfoSphere BigInsights, Cloudera, Apache and Hortonworks Hadoop Distributed File System (HDFS).
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Services (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused over the enterprise.

  • Can simultaneously support high-speed, high reliability requirements of transactional processing and the large volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

609 questions
1
vote
3 answers

Datastage list jobs

I am trying to execute a command from a DataStage job to list all the jobs in a project that are in aborted/crashed/stopped status. The command I am using is given below. /opt/IBM/InfoSphere/Server/DSEngine/bin/dsjob -domain XXXXX:9445 …
GIN
  • 111
  • 1
  • 10
1
vote
0 answers

Recursive query in Hive/IBM DataStage

I'm new to Hive and DataStage development. I'm trying to do a recursive query like we can do in Oracle with CONNECT BY or a WITH clause, but Hive doesn't support recursive queries. Is there any way to do this in Hive? Or to do the logic through IBM…
1
vote
3 answers

How to Throttle DataStage

I work on a project where a number of DataStage sequences can be run in parallel. One in particular performs poorly and takes a lot of resources, impacting the shared environment. A performance tuning initiative is in progress but will…
MattH
  • 4,166
  • 2
  • 29
  • 33
1
vote
0 answers

Get "Job Wave Number" in DataStage Using "XMETA"

Is there a way to get the Job Wave Number of a job in DataStage - using an SQL statement on the DataStage Repo DB "XMETA"? I was able to get some details of a job using the below statement but I couldn't find a way to get the Job Wave…
Xtigyro
  • 391
  • 2
  • 5
  • 13
1
vote
1 answer

How to solve the below scenario using transformer loop or anything in datastage

My data is like below in one column coming from a file. Source_data---(This is column name) CUSTOMER 15 METER 8 METERStatement 1 READING 1 METER 56 Meterstatement 14 Reading 5 Reading 6 Reading 7 CUSTOMER 38 METER 24 METERStatement 1 READING…
Sean
  • 37
  • 4
1
vote
3 answers

Datastage issue in loading zero

Using DataStage 11.5.0.2, jobs fail when they try to load the value "0" into a DATE field (DB2). In the source DB the column is VARCHAR, whereas in the target it is a DATE field. The only source value that fails to load is 0. How can I resolve this? Any idea…
Vignesh
  • 375
  • 1
  • 2
  • 13
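One common way to handle a placeholder such as "0" is to map it to NULL before the date conversion. The sketch below shows that logic in Python rather than in DataStage. It assumes that "0" means "no date", that the target column accepts NULL, and that valid source values use the `YYYY-MM-DD` format; none of this is confirmed by the question. The function name `varchar_to_date` is hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def varchar_to_date(value: Optional[str], fmt: str = "%Y-%m-%d") -> Optional[date]:
    """Convert a VARCHAR value to a date, treating '0' and blanks as NULL."""
    if value is None:
        return None
    text = value.strip()
    if text in ("", "0"):
        return None  # assumption: "0" stands for a missing date
    # Anything else must match fmt; strptime raises ValueError otherwise.
    return datetime.strptime(text, fmt).date()


print(varchar_to_date("0"))
print(varchar_to_date("2017-11-30"))
```

The same check could be expressed as a conditional derivation in whatever stage performs the type conversion.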
1
vote
1 answer

Bulk insert in datastage using teradata connector

I am new to DataStage. I created a simple job to read data from a .ds file and load it into Teradata using the Teradata connector. In the properties of the Teradata connector I set access_method=Bulk, max_session=2, min_session=1,load_type=load,…
KeenLearner
  • 685
  • 1
  • 8
  • 25
1
vote
1 answer

how to concatenate or append current value with existing value in Datastage

I need to achieve the below requirement, i.e. Input -- at very first time Order value 1111 aaa 222 bbb 333 ccc in the target (Insert) I will have Order value Order value 1111 aaa 222 bbb 333 ccc ----------Input -- at second…
1
vote
1 answer

Datastage invoked through PuTTY and Java REST web service

I am trying to invoke from a Java REST web service an instance of IBM Datastage (IBM Information Server). I write the following code using PuTTY and the job sequence is started correctly.…
GGG
  • 49
  • 1
  • 10
1
vote
0 answers

Datastage job to read an empty file

What changes should I make in a DataStage job so that it runs successfully even with an empty input file? I have a job that reads a file as input, applies some transformations, and produces an output file. I want to make the job run…
veena g
  • 15
  • 4
1
vote
0 answers

What are the options to load data from Datastage ETL tool into Google Cloud Storage and Google BigQuery

What are the options to load data using the DataStage ETL tool into Google Cloud Storage and Google BigQuery? I see that a third party provides 'Simba', which enables ODBC/JDBC connectors for querying data from BigQuery. But I am looking to do a direct…
1
vote
0 answers

DataStage Parsing Input String to Date Format

I have a string input in a format such as 'Thu, 30 Nov, 2017'. I need to transform this into a more Oracle-database-friendly date format, something like '11-30-2017' or '11/30/2017'. I started in the path of Convert('…
Nk.Pl
  • 131
  • 2
  • 16
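The conversion described in this question can be sketched outside DataStage with Python's standard `datetime` module. This only illustrates the parsing logic and is not a DataStage transformer expression. The function name `reformat_date` is hypothetical, and the sketch assumes English day and month abbreviations (the default C locale).

```python
from datetime import datetime


def reformat_date(text: str, out_fmt: str = "%m-%d-%Y") -> str:
    """Parse a string like 'Thu, 30 Nov, 2017' and re-emit it in out_fmt."""
    # %a = abbreviated weekday, %d = day, %b = abbreviated month, %Y = year
    parsed = datetime.strptime(text, "%a, %d %b, %Y")
    return parsed.strftime(out_fmt)


print(reformat_date("Thu, 30 Nov, 2017"))              # 11-30-2017
print(reformat_date("Thu, 30 Nov, 2017", "%m/%d/%Y"))  # 11/30/2017
```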
1
vote
1 answer

How to convert Unicode ebcdic string to ascii string in datastage?

I am reading EBCDIC-format mainframe data as a Unicode string for some reason. When I write that data out after some transformation, I need the corresponding ASCII data.
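A minimal Python sketch of the underlying character conversion follows. It rests on two assumptions that the question does not confirm: each EBCDIC byte was read as one Unicode character with the same code point (as if the data were Latin-1), and the data uses EBCDIC code page 037. The function name `ebcdic_misread_to_text` is hypothetical, and this is not a DataStage function.

```python
def ebcdic_misread_to_text(misread: str, codepage: str = "cp037") -> str:
    """Recover readable text from EBCDIC bytes that were read one char per byte."""
    raw = misread.encode("latin-1")  # recover the original byte values
    return raw.decode(codepage)      # interpret those bytes as EBCDIC


# Build a sample by simulating the mis-read: EBCDIC bytes viewed as Latin-1.
sample = "HELLO".encode("cp037").decode("latin-1")
print(ebcdic_misread_to_text(sample))
```

If the data uses a different EBCDIC code page, pass it as `codepage` (for example `"cp500"`); packed-decimal or binary fields need separate handling and are not covered here.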
1
vote
2 answers

Connecting DataStage and SAS

I'm using DataStage 11.3 and I need to call a SAS process from DataStage. My question is: do DataStage and SAS need to be installed on the same server? What if these tools are not installed in the same place? Thanks! PS: sorry for my English :s
aroa
  • 21
  • 3
1
vote
1 answer

DataDirect Azure ODBC Connection Refused

I'm trying to establish a connection to my Azure database using DataDirect ODBC driver but I'm getting this error. Src_ODBC_Unld_iMIS_Name_All: ODBC function "SQLConnect" reported: SQLSTATE = 08001: Native Error Code = 0: Msg = [IBM(DataDirect…
raginggoat
  • 3,570
  • 10
  • 48
  • 108