Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool that lets users integrate various data sources and targets in an enterprise environment; sources and targets can be database tables, flat files, datasets, CSV files, and so on. The basic design paradigm is a unit of work called a DataStage job; multiple jobs can be controlled and conditionally sequenced using 'Sequences'.

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high-performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.


InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for the Hadoop Distributed File System (HDFS) in the IBM InfoSphere BigInsights, Cloudera, Apache Hadoop, and Hortonworks distributions.
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Services (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused over the enterprise.

  • Can simultaneously support high-speed, high reliability requirements of transactional processing and the large volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

609 questions
2 votes, 1 answer

Insert into a Kudu table from DataStage

I am writing to enquire about a problem in my process: I have a Kudu table, and when I try to insert a new row longer than 500 characters from DataStage (11.5 or 11.7) using the Impala JDBC driver, I receive this error: Fatal Error:…
2 votes, 1 answer

Steps to generate OAuth2 token in hierarchical stage in Datastage

How do I generate an OAuth2 token in a Hierarchical Data stage in a DataStage job? What are the steps to do this?
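For context, the token call that the Hierarchical Data stage's REST step performs is a standard OAuth2 client-credentials request (RFC 6749, section 4.4). A minimal sketch of the request it would send, with placeholder endpoint and credentials:

```python
# Illustrative only: the Hierarchical Data stage configures its token
# call through the REST step GUI, but the underlying OAuth2
# client-credentials request looks like this. The URL, client_id, and
# client_secret below are placeholders.
from urllib.parse import urlencode

def build_token_request(token_url, client_id, client_secret):
    # Standard OAuth2 client-credentials grant: POST a form-encoded
    # body to the token endpoint over HTTPS.
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return token_url, headers, urlencode(body)

url, headers, payload = build_token_request(
    "https://auth.example.com/oauth2/token", "my-id", "my-secret")
```

In the stage itself, the same pieces map onto the REST step's URL, header, and body fields; the returned access token is then passed to subsequent steps as a bearer header.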
Rishabh Tyagi
2 votes, 1 answer

Assigning an SQL result to a Job Parameter in DataStage

I just started using DataStage (version 11.5) and I am trying to assign the value of a simple SQL query (select max(date_col) from Table) to a Job Parameter so that I can use it as part of a file produced by the job. Can anyone point out a…
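One common approach (a sketch, not the only option): run the query outside the job, then hand the result to the job through the dsjob command line's -param flag. The project, job, and parameter names below are placeholders for illustration.

```python
# Sketch, not a turnkey solution: fetch MAX(date_col) with any DB-API
# driver (connection code omitted), then start the job with the value
# bound to a job parameter. "MaxDate", "MyProject", and "MyJob" are
# placeholder names.

def build_dsjob_command(max_date, project="MyProject", job="MyJob"):
    # dsjob -run -param name=value <project> <job> passes a value into
    # a job parameter when the job is launched from the engine host.
    return [
        "dsjob", "-run",
        "-param", f"MaxDate={max_date}",
        project, job,
    ]

cmd = build_dsjob_command("2024-01-31")
# On a DataStage engine host: subprocess.run(cmd, check=True)
```

Inside the job, the MaxDate parameter can then be referenced wherever job parameters are allowed, e.g. in the output file name.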
2 votes, 1 answer

Reverse engineering DataStage code into Pig (for Hadoop)

I have a landscape of DataStage applications which I want to reverse engineer into Pig, rather than having to write fresh Pig code and try to replicate the DataStage functionality. Has anyone had experience of doing something similar? Any tips on…
Steve
2 votes, 2 answers

IBM datastage integration with java

We have DataStage jobs and want to use a Java class which reads a file and returns some data. Can someone explain the steps needed to do this?
user509755
2 votes, 1 answer

Sparse lookup for tracking used records

I have a scenario with a claim application table and an application claimant table. I need to do a sparse lookup for the application claimant ID from the application claimant table, using SSN as the key. The problem is there are multiple application…
blr20
2 votes, 1 answer

DataStage Error: The OCI function OraOCIEnvNIsCreate:OCI_UTF16ID returned status -1

I am trying to perform a simple connection test in DataStage 8.7. I have an Oracle_Connector inside a parallel job. I know the credentials are good, as I can connect with them using something like SQL Developer. However, I am seeing the following…
Wes
2 votes, 1 answer

Cross Project Compare option in DataStage 9.1

There is a utility called Cross Project Compare in the DataStage Designer. Using the Cross Project Compare utility, I can compare two jobs (e.g. two parallel jobs) from different environments (e.g. dev vs. prod). I wondered if there is any…
dna
2 votes, 1 answer

DataStage parallel job export options

I am aware that in DataStage, parallel jobs (.pjb) or any other jobs can be exported to .dsx and .isx files. I wondered if I can simply export a .pjb file as is?
dna
2 votes, 1 answer

DataStage 11.3 Assembly Editor flash popup

Our organisation is in the process of upgrading from DataStage 9.1 to 11.3. Problem: The DataStage 11.3 Assembly Editor fails to display, and falls over with an error. Backend OS: Red Hat Enterprise Linux Server release 6.6 (Santiago) Linux …
Bruce Smith
2 votes, 3 answers

How to write DataStage performance stats to a DB2 table?

My DataStage version is 8.5. I have to populate a DB2 table with DataStage performance data, something like job_name, start_time, finish_time and execution_date. There is a master sequence with A LOT of jobs. The sequence itself runs once a…
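One common route is to capture each job's runtime info from the engine command line (e.g. dsjob -jobinfo) and parse it before loading DB2. A sketch of the parsing step, under an assumption: dsjob prints "label : value" lines, but the exact labels vary by version, so the sample text below is a made-up stand-in, not real dsjob output.

```python
# Sketch only: SAMPLE mimics an assumed "label : value" layout from
# dsjob -jobinfo; verify the real labels on your own 8.5 installation
# before relying on them.
import re

SAMPLE = """\
Job Status      : RUN OK (1)
Job Start Time  : 2024-01-31 02:00:05
"""

def parse_jobinfo(text):
    # Split each "label : value" line into a dict entry.
    info = {}
    for line in text.splitlines():
        m = re.match(r"([A-Za-z ]+?)\s*:\s*(.+)", line)
        if m:
            info[m.group(1).strip()] = m.group(2).strip()
    return info

stats = parse_jobinfo(SAMPLE)
# stats can then feed a parameterised DB2 insert, e.g.
# INSERT INTO AUDIT.JOB_STATS (job_name, start_time) VALUES (?, ?)
```

Running this parser once per job from an after-sequence script avoids touching each of the many jobs in the master sequence individually.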
LeandroHumb
2 votes, 0 answers

Unlock DataStage job

I am planning to disconnect the user session (to unlock the job) through a Unix script, provided a session ID as input. Here is the manual procedure I found on the IBM website. IBM procedure starts http://www-01.ibm.com/support/docview.wss?uid=swg21439971 In…
user3686069
2 votes, 4 answers

Run database query in datastage with no inputs or outputs

Relatively new to DataStage, so quite possibly a stupid question: from DataStage, I want to run a database query against a SQL Server database. The query is a delete query with a hardcoded WHERE clause (not my decision). What I cannot figure out is…
James Dean
2 votes, 4 answers

After insert trigger - SQL Server 2008

I have data coming in from DataStage that is being put into a table in our SQL Server 2008 database: stg_table_outside_data. The outside source puts data into that table every morning. I want to move the data from stg_table_outside_data to…
Azzna
2 votes, 1 answer

Dumping dataset (.ds) file contents to a text file

At work we use DataStage, which uses dataset (.ds) files. I can view the contents of a file from within our UNIX environment by using: orchadmin dump -name This only dumps the contents of the file to the screen. What I would like…
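The usual answer is plain stdout redirection: at the shell, `orchadmin dump -name mydata.ds > mydata.txt` sends the records to a file instead of the screen. A Python equivalent of that redirection, sketched with a placeholder dataset name:

```python
# Minimal sketch: run a command and redirect its stdout into a text
# file, which is all the shell redirect above does. orchadmin itself
# must be run on a DataStage engine host; "mydata.ds" is a placeholder.
import subprocess

def capture_output_to_file(cmd, out_path):
    # Open the target file and point the child process's stdout at it,
    # so nothing is echoed to the screen.
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)

# Intended usage (engine host only):
# capture_output_to_file(["orchadmin", "dump", "-name", "mydata.ds"],
#                        "mydata.txt")
```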
mlevit