Questions tagged [pentaho-data-integration]

Tag to be used for Pentaho Data Integration (all versions). Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights.

It provides intuitive drag-and-drop data integration coupled with data-agnostic connectivity, spanning flat files and RDBMS to Hadoop and beyond.

Features:

  • Graphical extract-transform-load (ETL) designer to simplify the creation of data pipelines
  • Rich library of pre-built components to access, prepare, and blend data from relational sources, big data stores, enterprise applications, and more
  • Powerful orchestration capabilities to coordinate and combine transformations, including notifications and alerts
  • Agile views for modeling and visualizing data on the fly during the data preparation process
  • Integrated enterprise scheduler for coordinating workflows and debugger for testing and tuning job execution
825 questions
0
votes
1 answer

I want to improve Pentaho performance for data loading

I have 4 million records that need a daily load from source to target, and we truncate every day. It takes about 9 hours, as there are about 10 tables loading 4 million records every day. Could you please tell me how do I…
Ujjwal Chowdary
0
votes
3 answers

Pentaho | CentOS

I am working in Pentaho Data Integration. We have developed the transformations and jobs in Spoon. We want to move our code to a server, and the server runs CentOS. On CentOS, we are getting errors while installing the Pentaho UI. We are able to install…
0
votes
2 answers

Pentaho:- How to run .kjb files in PHP

I have the following requirement. We want to create a PHP page and run Pentaho .kjb files from it. If we click the RUN button, PHP should make a call to Pentaho and the .kjb files should then execute. Can someone guide me on how to achieve this?
0
votes
1 answer

Duplicating row in database with list in input

I want to duplicate a row in database input based on a list. Input: I have a JSON string which is currently getting sorted in the db as per fields. "product": [{"startDate": "2015-02-01T00:00:00Z", "modifiedOn": "2015-03-17T14:12:46.758Z", "parts": ["65",…
0
votes
2 answers

Pentaho Data Integration dynamic connection (read connection from database)

Pentaho Data Integration: CE 6.1.0.1-196. I am a newbie in Pentaho Data Integration. I need to run the same query in multiple databases. I created a table in the master database to store the connection information from other databases that need to be…
0
votes
0 answers

Pentaho mail :- Do not want to display attachment data in mail body. Just want to display static message

Good afternoon, I am using the Mail step of Pentaho DI. I am able to receive mail with attachment files, but I can see the attachment files' data and info in the mail body, which I do not want. Can anyone suggest how to NOT include attachment content…
Nilesh Patil
0
votes
1 answer

Data Types and Indexes

Is there some sort of performance difference for inserting, updating, or deleting data when you use the TEXT data type? I went here and found this: Tip: There is no performance difference among these three types, apart from increased storage…
0
votes
1 answer

Pentaho:-CSV file input

I am very new to Pentaho DI. My requirement: in my CSV file input step, I don't want to select files from the browser. I want to pass the file through a variable or in a dynamic way. Let's say I have a file in the "Download" folder and the file name changes daily. So, in…
Nilesh Patil
0
votes
0 answers

"USE DB TO GET SEQUENCE" step is not working in ADD SEQUENCE in pentaho

"USE DB TO GET SEQUENCE" step is not working in ADD SEQUENCE in Pentaho. Even though the connection and schema are correct and the sequence is available, 'SEQUENCE NAME' shows that no sequence is available. Is there any other way to get the max value of an…
Ms13
0
votes
0 answers

Hive connectivity error in Pentaho DI

I am getting the error below while trying to test a connection to Hive on localhost in Pentaho DI. Error connecting to database [HiveConn] : org.pentaho.di.core.exception.KettleDatabaseException: Error occurred while trying to connect to the…
Naveen Kumar
0
votes
2 answers

Strange error produced by Kettle

It is just a very simple transformation developed in Kettle 5.4. The transformation extracts data from MSSQL and inserts it into another MSSQL database without any other operations. I enabled the "Use batch update for inserts" check box. Usually it runs successfully without any error…
Janus
0
votes
0 answers

Pentaho Kettle (Spoon) - Delete Records

I'm trying to delete records in my target table based on whether the records exist in the source table. I tried using the 'Delete' step but then realized that this step is based on a conditional clause. My condition is quite simple "if the…
0
votes
1 answer

Pentaho Kettle - Retrieving Data from different database

I have a scenario where I'm fetching data from one database (Postgres) and loading the data into a table in a different database (Redshift). Is there any way in Kettle to schedule this job? It's a simple insert into redshift select * from postgres
0
votes
1 answer

Pentaho DI - How to use "all" results from prior step in the next step as an "IN" query

I have input from a tableA in database A that I would like to join to another tableB in database B. These were my two options: Use Database Join: For each input from table in database A, run the join query in database B. Use two Input tables…
SRS
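
For readers unfamiliar with the idea in this question's title, here is a rough sketch of it outside of Kettle: collect all keys from a table in one database, then query a second database with a single parameterized "IN" clause per batch instead of one lookup per row. This is not a PDI step configuration. The table names (`table_a`, `table_b`), the columns (`id`, `label`), the function names, and the `chunk_size` value are all hypothetical, and two in-memory SQLite databases stand in for database A and database B.

```python
import sqlite3


def fetch_matching_rows(conn_a, conn_b, chunk_size=500):
    """Read every key from table_a, then look the keys up in table_b
    using one IN (...) query per chunk rather than one query per row."""
    keys = [row[0] for row in conn_a.execute("SELECT id FROM table_a ORDER BY id")]
    results = []
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        # One "?" placeholder per key keeps the query parameterized.
        placeholders = ", ".join("?" for _ in chunk)
        query = (
            "SELECT id, label FROM table_b "
            f"WHERE id IN ({placeholders}) ORDER BY id"
        )
        results.extend(conn_b.execute(query, chunk).fetchall())
    return results


def build_demo_databases():
    """Create two separate in-memory databases with hypothetical sample data."""
    conn_a = sqlite3.connect(":memory:")
    conn_a.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY)")
    conn_a.executemany("INSERT INTO table_a (id) VALUES (?)", [(1,), (3,), (5,)])

    conn_b = sqlite3.connect(":memory:")
    conn_b.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY, label TEXT)")
    conn_b.executemany(
        "INSERT INTO table_b (id, label) VALUES (?, ?)",
        [(1, "one"), (2, "two"), (3, "three"), (4, "four")],
    )
    return conn_a, conn_b


if __name__ == "__main__":
    db_a, db_b = build_demo_databases()
    print(fetch_matching_rows(db_a, db_b))
```

Chunking is used here because most databases limit the number of bound parameters in one statement; the right batch size depends on the target database.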
0
votes
1 answer

Spoon connectivity issue to PostgreSql

Error message is thrown while connecting to database from Spoon. Selected the View option that appears in the upper-left corner of the screen, right-clicked on the Database connections option, and selected New. Under Connection Type, selected the…
Dex