Questions tagged [pdi]

Pentaho Data Integration (PDI), also known as Kettle, provides extraction, transformation, and loading (ETL) capabilities.

PDI (Pentaho Data Integration), formerly known as Kettle, is a data integration project. It delivers powerful extraction, transformation, and loading (ETL) capabilities, using a groundbreaking, metadata-driven approach.

440 questions
1
vote
1 answer

Pentaho Data Integration - Kafka Consumer

I am using the Kafka Consumer Plugin for Pentaho CE and would appreciate your help with its usage. I would like to know whether any of you have been in a situation where Pentaho failed and you lost messages (based on the official docs there's no way to read…
André De La O Campos
  • 1,027
  • 1
  • 8
  • 19
1
vote
2 answers

How to deal with 1 to many SQL (Table inputs) in Pentaho Kettle

I have a situation wherein I have the following tables: Employee - emp_id, emp_name, emp_address; Employee_assets - emp_id (FK), asset_id, asset_name (1-many for employee); Employee_family_members - emp_id (FK), fm_name, fm_relationship (1-many for…
Sushant kunal
  • 341
  • 1
  • 2
  • 9
1
vote
2 answers

How to mask selected values in a JSON field - PostgreSQL 9.3 and PDI

I have an input field change_event (json datatype) that looks something like [ { "fieldName":"address", "oldValue":{ "addressLine1":"36 ABC St", "addressLine2":"Suite…
DUnkn0wn1
  • 401
  • 1
  • 8
  • 23
1
vote
2 answers

Passing data from one Pentaho transformation to another in a job?

Fairly straightforward question I think, I just haven't been able to find a clear example. I have a very complex transformation that I'm breaking down into a job. Having never created a job before, I'm struggling to send the data from one…
Shaun Johnson
  • 21
  • 1
  • 2
1
vote
2 answers

Pentaho PDI (Spoon): MySQL table output very slow (~2000 rows/s)

My table output step is terribly slow (~2.000 rows/second), compared to the input (100.000-200.000 rows/second). The MySQL server is not the problem, using native MySQL, e.g. with the "Execute SQL script" step, I get something in the…
Juergen
  • 312
  • 3
  • 18
1
vote
0 answers

Pentaho PDI: row listener for remote transformations

I already know how to implement a Row Listener for a local transformation using Java (http://wiki.pentaho.com/display/EAI/Executing+a+PDI+transformation). Since neither the Spoon UI nor the Carte API provides any mechanism for continuous preview of data…
Claudio
  • 10,614
  • 4
  • 31
  • 71
1
vote
1 answer

Pentaho PDI Repository connection

Can you explain the difference between the different types of repositories in Pentaho PDI, and what is the use of having these different repositories? What is the benefit of the JNDI and OCI database connection wizard, and how do I configure these two? Thanks for…
suraj08
  • 119
  • 2
  • 11
1
vote
1 answer

Get specific values from column to create new lines

I have a problem using Kettle/PDI. I need to separate specific values from columns, but I need to create new lines with just these values, one row for each column. I tried to use Row Normaliser but it didn't work. Can someone please help me? Thanks…
thiagofred
  • 197
  • 10
1
vote
2 answers

Transformation taking too long to execute

With reference to my previous post (here is the link), I have 130000 records in my source. When I tried running the transformation, it was still running after 16 hours. Will increasing the memory heap in the spoon.bat script file help reduce the…
Deepesh
  • 820
  • 1
  • 14
  • 32
1
vote
1 answer

Dynamic XML parsing in Pentaho Kettle

I have been using the StAX parser in Pentaho Kettle for quite a long time, but I have suddenly run into a weird situation. Earlier, the XML files had predefined levels like:
Vikas Kumar
  • 87
  • 2
  • 18
1
vote
1 answer

Applying Pivot in Pentaho Kettle

I'm using Pentaho Kettle version 5.2.0. I'm trying to pivot my source data; here is the structure of my source: Billingid sku_id qty 1 0 1 1 0 12 1 0 6 1 0 1 …
Deepesh
  • 820
  • 1
  • 14
  • 32
1
vote
2 answers

Error passing data from Pentaho HTTP Client to Json Input

I am trying to load JSON data from the following link http://swapi.co/api/films/ into Pentaho. I used 3 steps: Generate Rows, HTTP Client and Json Input. Generate Rows step: Limit: 1 Name: movies Type:…
tottihope
  • 69
  • 1
  • 11
1
vote
1 answer

Parsing JSON file for PDI

I am trying to process some uneven JSON files using PDI (Pentaho) and after trying a lot with the native tools, I figured out that I need to parse the JSON files before they are processed. This is an example for just two rows: [{ "UID":…
1
vote
2 answers

Datatype conversion error in Pentaho

Hi, I have a table column named "sku" which is of type integer and another column "total_sku" which is also of type integer. When I try to calculate the percentage (100 * sku/total_sku) using the Calculator step, I'm expecting an integer but its…
Deepesh
  • 820
  • 1
  • 14
  • 32
1
vote
0 answers

Does Pentaho Data Integration aka Kettle 6.0 support failover clustering?

Can the Pentaho Data Integration server (community edition) support any type of failover via active/passive or active/active clustering? We are new to Pentaho and want to use it for importing inbound ETL files in a continuous production environment.…