Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool for integrating data sources and targets across an enterprise environment.

Sources and targets can be database tables, flat files, datasets, CSV files, and so on. The basic unit of work is a DataStage job. Multiple jobs can be controlled and conditionally sequenced using 'Sequences'.

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.

InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for IBM InfoSphere BigInsights, Cloudera, Apache and Hortonworks Hadoop Distributed File System (HDFS).
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Service (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused over the enterprise.

  • Can simultaneously support the high-speed, high-reliability requirements of transactional processing and the large-volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

609 questions
1
vote
1 answer

Append Files in Unix - Cat doesn't work

I am trying to append files with a Unix command in DataStage and it's not working. The Unix commands do work. For example, if there are 5 files in a directory like /a/file1.txt /a/file2.txt /a/file3.txt /a/file4.txt /a/file5.txt Second file is not…
eskay
  • 21
  • 1
1
vote
0 answers

Getting error for DataStage compare command line tool

I am using a utility provided in DataStage 9.1, diffapicmdline.exe, to compare two jobs from different environments. I am using the following batch script code to read the job names from a text file in a loop: @echo off SET var= for /f "delims=" %%i in…
dna
  • 483
  • 3
  • 10
  • 32
1
vote
0 answers

How to pass an empty string into an XML file using DataStage?

This is the XSD used in the job to create the XML output file from records fetched from tables:
1
vote
0 answers

How to ETL CLOB data type using DataStage

I have a table to be fetched. One of the columns in this table contains HTML data stored as a CLOB; the length is '1048576'. In my ETL job, I have replaced CLOB with LongVarChar of the same size (1048576), as CLOB is not defined in DataStage, but…
andruX
  • 117
  • 1
  • 1
  • 13
1
vote
1 answer

I need to check range of mixed substring

I have a string and want to check its last 3 characters against a range. If the string is like sasdaX01, I need to check that the last three characters are between X01 and X50. Any help would be highly appreciated.
Ram Esh
  • 11
  • 2
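The range check described in this question can be sketched outside DataStage as follows. This is only an illustration in Python, not a DataStage derivation: the function name `suffix_in_range` and its parameters are hypothetical, and it assumes the suffix is always one fixed letter followed by exactly two digits.

```python
import re


def suffix_in_range(value: str, prefix: str = "X", low: int = 1, high: int = 50) -> bool:
    """Return True if the last three characters are prefix + two digits within [low, high]."""
    # Take the last three characters and require the exact shape "<prefix><digit><digit>".
    match = re.fullmatch(rf"{re.escape(prefix)}(\d{{2}})", value[-3:])
    if match is None:
        return False
    # Compare the numeric part against the inclusive range.
    return low <= int(match.group(1)) <= high


print(suffix_in_range("sasdaX01"))  # True
print(suffix_in_range("sasdaX51"))  # False
```

Comparing the numeric part as an integer avoids relying on string ordering, which only behaves correctly when every suffix has the same width.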
1
vote
1 answer

Datastage - run user defined sql query file using odbc connector

Using DataStage, I have to read a sequential file that contains one SQL statement, run that SQL statement, and output the results to a sequential file. This is what I've tried: using an Oracle connector, I simply set the option to "Read Selected…
LatinCanuck
  • 454
  • 2
  • 10
  • 29
1
vote
3 answers

How to get only one record for each duplicate row of the id in DataStage

I have

**Table**
Name,RNo,M1,M2,M3,M4
Raj,1,25,26,Null,Null

**File**
Name,RNo,M,T
Raj,1,100,M3
Raj,1,200,M4

If I join the table with the file, the output needed is

Name,RNo,M1,M2,M3,M4
Raj,1,25,26,100,200

As the data is coming from the file, I cannot get…
Bobby
  • 320
  • 5
  • 23
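One way to read the requirement in this question is "pivot the file rows by their T column, then join once per key". The sketch below shows that logic in Python using the sample rows from the excerpt. It is not a DataStage job design, and the function name `merge_rows` is hypothetical.

```python
from collections import defaultdict

# Sample data taken from the question excerpt.
table_rows = [
    {"Name": "Raj", "RNo": 1, "M1": 25, "M2": 26, "M3": None, "M4": None},
]
file_rows = [
    {"Name": "Raj", "RNo": 1, "M": 100, "T": "M3"},
    {"Name": "Raj", "RNo": 1, "M": 200, "T": "M4"},
]


def merge_rows(table_rows, file_rows):
    # Pivot: for each (Name, RNo) key, collect {target_column: value} from the file.
    pivoted = defaultdict(dict)
    for row in file_rows:
        pivoted[(row["Name"], row["RNo"])][row["T"]] = row["M"]

    # Join: emit exactly one output row per table row, filling in the pivoted values.
    merged = []
    for row in table_rows:
        out = dict(row)
        out.update(pivoted.get((row["Name"], row["RNo"]), {}))
        merged.append(out)
    return merged


result = merge_rows(table_rows, file_rows)
print(result)
```

Pivoting before the join is what keeps the output at one row per key. Joining first would produce one row per matching file record.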
1
vote
2 answers

DataStage job gets stuck with warnings

I am trying to stage a dataset from the source to my server. When I run my job in DataStage, it gets stuck with no errors. All I see is a warning which says: When checking operator: When binding output interface field "DRIVERS" to field "DRIVERS":…
andruX
  • 117
  • 1
  • 1
  • 13
1
vote
2 answers

Reducing data with DataStage

I've been asked to reduce an existing data model using DataStage ETL. It's more of an exercise and a way to get to know this program, which I'm very new to. Of course, the data shall be reduced following some functional rules. Table: MEMBERSHIP…
1
vote
2 answers

Job parameters in IBM DataStage job

IBM DataStage is a new tool for me and I'm unable to find any good pictorial, step-by-step tutorials for it. I'm having trouble using job parameters in DataStage. Can anyone please help me with how to use IBM DataStage job parameters and…
Bint-e-Adam
  • 13
  • 1
  • 1
  • 7
1
vote
2 answers

Datastage Designer and Sequential File Location

Having been thrown into the deep end with DataStage/QualityStage, I'm working on an IBM tutorial on Parallel Jobs and having a hard time getting traction. I think some of the problem is that I'm used to Microsoft conventions. I am even finding the…
Echo Train
  • 99
  • 11
1
vote
0 answers

How to import a function and a custom type into DataStage?

I've been provided an Oracle package as origin for an ETL process. The package contains one function that takes a date as parameter and returns a collection of a custom type. My question is, how can I import this into a DataStage job? Importing the…
Léster
  • 1,177
  • 1
  • 17
  • 39
1
vote
2 answers

How to remove spacing in SQL

I have data in DB2 that I want to insert into SQL. The DB2 data looks like this: select char('AAA ') as test from Table_1 But when I select in SQL after the insert, the data becomes like this: select test from Table_1 result…
fushia03
  • 11
  • 1
1
vote
0 answers

How to read Unicode characters in DataStage that is non-NLS

I have Unicode characters in a column with varchar2 as the datatype. The table is in an Oracle DB, which has AL32UTF8 set. But my DataStage engine is installed without NLS. Is there any way I can read these Unicode characters without…
user3635916
  • 41
  • 1
  • 8
1
vote
1 answer

File Splitting with DataStage (8.5)

I have a job that successfully produces a sequential file (CSV) output with some hundred million rows. Can someone provide an example where the output is written to a hundred separate sequential files, each with a million rows? What does the…
lojkyelo
  • 121
  • 3
  • 15
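The splitting described in this question can be illustrated outside DataStage with a short Python sketch that writes a stream of rows into numbered CSV files of a fixed maximum size. The function name `split_rows`, the `part_NNN.csv` naming scheme, and the default of one million rows per file are assumptions made for the example. It does not show how a DataStage job would be configured.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path


def split_rows(rows, out_dir, rows_per_file=1_000_000, stem="part"):
    """Write rows to numbered CSV files holding at most rows_per_file rows each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = iter(rows)
    paths = []
    while True:
        # Pull the next chunk from the stream; an empty chunk means the input is exhausted.
        chunk = list(islice(rows, rows_per_file))
        if not chunk:
            break
        path = out_dir / f"{stem}_{len(paths) + 1:03d}.csv"
        with path.open("w", newline="") as handle:
            csv.writer(handle).writerows(chunk)
        paths.append(path)
    return paths


# Small demonstration: 10 rows split into files of at most 4 rows.
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = split_rows(([i, f"row{i}"] for i in range(10)), tmp, rows_per_file=4)
    print([p.name for p in demo_paths])  # ['part_001.csv', 'part_002.csv', 'part_003.csv']
```

Each chunk is held in memory before it is written, which keeps the sketch short. For very wide rows, writing row by row and rotating the file on a counter would use less memory.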