Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool that lets users integrate various data sources and targets in an enterprise environment.

Data sources and targets can be database tables, flat files, data sets, CSV files, and so on. The basic design paradigm is a unit of work called a DataStage job. Multiple jobs can be controlled and conditionally sequenced using 'Sequences'.

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.

InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for IBM InfoSphere BigInsights, Cloudera, Apache and Hortonworks Hadoop Distributed File System (HDFS).
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Service (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused across the enterprise.

  • Can simultaneously support the high-speed, high-reliability requirements of transactional processing and the large-volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

609 questions
0 votes, 1 answer

Extract the SQL query that was run in the Oracle connector from a DataStage job in order to add it to a file/table

I have a parallel DataStage job that uses a particular SQL query with some parameters. Once the job is running, I can see in the Director log the exact SQL query that was triggered on the database. My question is: is there any way I can get this SQL…
Criss
  • 7
  • 4
0 votes, 1 answer

How to solve an Oracle connector error in DataStage?

I am facing a sudden issue in my DataStage job. When I run the job, it eventually aborts and gives the error below. This was not happening earlier: I ran the job multiple times and it completed successfully. I have not made any changes. I am…
Salva
  • 81
  • 1
  • 9
0 votes, 1 answer

Is there documentation on DataStage Designer/Director command-line options?

I connect to several different environments (services tier: dev, test, staging, prod) with several different usernames and several different projects. I have figured out that running {dsdesign,director}.exe host.com/projectname will fill the 'Attach…
harleypig
  • 1,264
  • 8
  • 25
0 votes, 2 answers

How to capture the records with null CustName values, along with their CustID, in one file and the remaining CustIDs in another file

CustID  CustName
10      Ally
20      null
30      null
40      Liza
50      null
60      Mark
Roopa
  • 1
  • 2
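
The routing logic this question describes can be sketched outside DataStage. The following Python example is only an illustration under stated assumptions, not a DataStage job design: the function name `split_by_null_name`, the output file names, and the treatment of the literal text "null" (or an empty value) as a missing name are all hypothetical choices made for the sketch.

```python
import csv


def split_by_null_name(rows, null_token="null"):
    """Split (cust_id, cust_name) pairs into (missing, present) lists.

    Assumption: a name counts as missing when it is None, empty, or equal
    to `null_token` (case-insensitive).
    """
    missing, present = [], []
    for cust_id, cust_name in rows:
        name = (cust_name or "").strip()
        if name == "" or name.lower() == null_token:
            missing.append((cust_id, cust_name))
        else:
            present.append((cust_id, cust_name))
    return missing, present


def write_rows(path, rows):
    """Write the pairs to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["CustID", "CustName"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Sample data taken from the question.
    sample = [("10", "Ally"), ("20", "null"), ("30", "null"),
              ("40", "Liza"), ("50", "null"), ("60", "Mark")]
    missing, present = split_by_null_name(sample)
    write_rows("null_names.csv", missing)    # hypothetical file name
    write_rows("valid_names.csv", present)   # hypothetical file name
```

The same idea is a single pass with one condition deciding which of two outputs receives each record.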
0 votes, 2 answers

DataStage sequence job: how to process one file at a time when the files are in 7 different folders

DataStage - There are 7 folders in a path, and each folder contains 2 files. For example, the 2 files are in the following format: filename = test_s1_YYYYMMDD.txt, test_s1_YYYYMMDD.done. The path for these files are…
sreerag
  • 11
  • 2
0 votes, 1 answer

How to use the Customize window to add third-party applications in DataStage

How can the Customize window be used to add third-party applications in DataStage? My team is researching how to integrate DataStage with Bitbucket and wants to know how the Custom feature in DataStage can be used for this. Appreciate if anyone…
Jojames
  • 11
  • 1
0 votes, 2 answers

Do restartable sequence jobs in DataStage also rerun job activities that aborted for reasons other than sequence failure?

I want to know whether restartable sequence jobs in DataStage also rerun the job activities that were aborted, but not because of a sequence failure.
Shivam
  • 3
  • 4
0 votes, 1 answer

How to use a parameterized query in After SQL in DataStage?

I have to create a table in DB2 and read the query from a file in the Before/After SQL tab in DataStage. I am using the DB2 connector for this. I have also parameterized the query, but I am getting the error below: an unexpected token was found '/'. create table Temp…
Salva
  • 81
  • 1
  • 9
0 votes, 1 answer

How can I loop through a record one character at a time?

How can I loop through a record one character at a time? I want to interrogate records in an ASCII file one character at a time, looking for and replacing non-printable characters. I tried using the Loop Condition with no luck. Thanks in advance…
tbtcust
  • 65
  • 6
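
As an illustration of the character-by-character check this question asks about, here is a minimal Python sketch. It is not a Transformer Loop Condition and makes no claim about how DataStage would do it; the function name `replace_non_printable`, the choice of a space as the replacement, and the definition of "printable" as ASCII 0x20 through 0x7E are assumptions made for the example.

```python
def replace_non_printable(record: str, replacement: str = " ") -> str:
    """Return `record` with every character outside printable ASCII
    (0x20 through 0x7E) swapped for `replacement`.

    Assumption: tabs and other control characters count as non-printable.
    """
    cleaned = []
    for ch in record:  # visit the record one character at a time
        if 0x20 <= ord(ch) <= 0x7E:
            cleaned.append(ch)
        else:
            cleaned.append(replacement)
    return "".join(cleaned)


if __name__ == "__main__":
    # A NUL and a BEL character embedded in otherwise printable text.
    print(replace_non_printable("AB\x00C\x07D"))
```

Passing a different `replacement`, such as an empty string, removes the characters instead of substituting them.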
0 votes, 1 answer

Detect how many columns have come in via RCP

Is there a way to detect how many columns have come in via RCP? I have a Sequential File stage using RCP. The next stage is a Transformer stage. In the Transformer stage I want to know/detect the total number of columns coming from the…
tbtcust
  • 65
  • 6
0 votes, 1 answer

DB transaction (commit and rollback) in DataStage

Is there a way to implement a transaction in a DataStage job, so that all upserts are rolled back when, for example, my job aborts? If not, is there a standard practice or workaround to simulate a commit/rollback system? Thanks in advance
sangi
  • 511
  • 3
  • 13
  • 25
0 votes, 1 answer

Using the Data Set Stage to read the file as a single record

I have a DataStage Data Set file, and the Data Set has 120 columns. I’m trying to use the Data Set stage to read each record into one field. Is this possible? Thanks in advance for any help.
tbtcust
  • 65
  • 6
0 votes, 1 answer

Capturing Reject Records in a File in DataStage

I am trying to create a DataStage job where I want to capture the rejected records in a file. The problem is that when there are no rejected records, the reject file still gets created. It is working fine when I have rejected…
Nikhil
  • 57
  • 8
0 votes, 1 answer

Using the CFF Stage to read an EBCDIC file as a single record

I have an EBCDIC file that was created on z/OS and SFTPed to a midrange Linux system. The EBCDIC file has 20 fields. I’m trying to use the CFF stage to read each record into one field. Is this possible? Thanks in advance for any help. The COBOL copybook has 01 …
tbtcust
  • 65
  • 6
0 votes, 1 answer

How to update a source table after copying data to another database in DataStage?

I have a simple ETL job copying data from MS SQL to DB2 using DataStage. I need to update a column in MS SQL, "SenttoDB2" once I have successfully copied the data to DB2. I figured that I just need to create another stage after DB2 and pass the…
user2315860