Questions tagged [pdi]

PDI (Pentaho Data Integration), formerly known as Kettle, is a data integration tool. It delivers powerful Extraction, Transformation, and Loading (ETL) capabilities, using a groundbreaking, metadata-driven approach.

440 questions

2 votes, 1 answer

Creating a JSON structure in PDI without blocks

I'm trying to get a simple JSON output value in PDI from a field that was defined in an earlier step. The field is id_trans, and I want the result to look like {"id_trans":"1A"} when id_trans value is 1A. However, when using the JSON Output step…
cdel • 21 • 5
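The single-object output the question describes can be sketched outside PDI; a minimal Python illustration of the target shape (the field name and value come from the question, everything else is a hypothetical stand-in, not the JSON Output step itself):

```python
import json

# Hypothetical stand-in for the id_trans field produced by an earlier step.
id_trans = "1A"

# Serialize a single flat object with compact separators, so the result
# is exactly {"id_trans":"1A"} with no surrounding array or data block.
result = json.dumps({"id_trans": id_trans}, separators=(",", ":"))
print(result)  # {"id_trans":"1A"}
```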

2 votes, 1 answer

PDI - How to monitor Kettle transformations and jobs?

I'm trying to create a web app to monitor my transformations and jobs. I will show all the status info (begin datetime, run duration, finish datetime, status, etc.) on the web app live (it will refresh automatically to check the status). Is there any…
Rio Odestila • 125 • 2 • 19

2 votes, 2 answers

Pentaho and Hadoop

I am sorry if this question seems naive, but I am new to the data engineering field and a self-learner right now. My question is: what are the differences between ETL products like Pentaho and Hadoop? When do I use one instead of the other? Or I may…
user6514731

2 votes, 1 answer

Generate one Excel file with two tabs from two input files in Pentaho

I am trying to develop a job that is able to generate one Excel file with two tabs. Basically, what I want to achieve is: tab 1 is based on input file 1, and tab 2 is based on input file 2. I have two input files which contain different queries, but the end result…

2 votes, 1 answer

Pentaho Kettle connect to Hadoop Cluster

I'm trying to connect to Hadoop Cluster running on a Linux system using Pentaho Data Integration (Kettle) which is running on Windows 10. While testing the connection I receive the following error: "Hadoop File System Connection - Unable to connect…
aveek • 188 • 6

2 votes, 1 answer

Pentaho table input step goes to idle state

I have a similar table structure in two different schemas of MySQL. I am trying to get the data from one schema using a Table input step and insert the same data into a different schema using an Insert / Update step. When I run the ktr, it goes idle…
Subhrajit • 55 • 1 • 11

2 votes, 1 answer

How to pass conditioned date parameters to Table input step?

I need to base the SQL SELECTs in several transformations in a job on date parameters, and I'm having problems making it work. The plan is to have a begin_date and an end_date parameterized on the following condition: if it's the 1st of any month, the…
Daniel Souza • 430 • 4 • 11

2 votes, 1 answer

PDI/Kettle: avoid file creation or mapping (sub-transformation) execution

It's clear by now that all steps from a transformation are executed in parallel and there's no way to change this behavior in Pentaho. Given that, we have a scenario with a switch task that checks a specific field (read from a filename) and decides…
jfneis • 2,139 • 18 • 31

2 votes, 1 answer

Unexpected Error when using JSON URL input in Spoon

I am trying to import a JSON file in Spoon. It works just fine with a .json file, but when I try it from a URL I get an Unexpected Error, followed by a Java NullPointerException, when executing the transformation. I get the same error with "JSON…
Wallee • 21 • 5

2 votes, 1 answer

PDI: Simultaneous Unwind of two arrays from MongoDB

Within Spoon I have used the MongoDB Input step. For a given document of the form {"Number": ["4700100004"], "Random": ["unknown"], "List_Of_Vals1": ["3", "2", "1"], "List_Of_Vals2": ["1", "2", "3"]} I am…
Tyler Cowan • 820 • 4 • 13 • 35
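What a simultaneous unwind of the two arrays would yield can be sketched in Python; pairing elements by index with `zip` is an assumption about the intended result, and the output field names are hypothetical (this illustrates the data shape, not the MongoDB Input step itself):

```python
# Document shaped like the one quoted in the question.
doc = {
    "Number": ["4700100004"],
    "Random": ["unknown"],
    "List_Of_Vals1": ["3", "2", "1"],
    "List_Of_Vals2": ["1", "2", "3"],
}

# Unwinding both arrays together pairs their elements by position,
# producing one output row per index instead of a cross product.
rows = [
    {"Number": doc["Number"][0], "Val1": v1, "Val2": v2}
    for v1, v2 in zip(doc["List_Of_Vals1"], doc["List_Of_Vals2"])
]
print(rows[0])  # {'Number': '4700100004', 'Val1': '3', 'Val2': '1'}
```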

2 votes, 1 answer

How to extract an email Attachment with Pentaho Data Integration?

Let me start with what I want to accomplish: I get 20 emails with reports from clients daily, and I have to extract the .xls file attached to each one and do some simple transformations depending on who sent the file. With Pentaho Data Integration,…
IvanVarela • 71 • 3 • 6

2 votes, 1 answer

Pivots using ETL Metadata Injection

It's quite simple to use the Row Denormaliser step to achieve pivots when we have a few records that can be written manually in the denormaliser step, but what about when there are hundreds of thousands of records? I tried using the ETL Metadata Injection step, but I was unable to…
Deepesh • 820 • 1 • 14 • 32

2 votes, 0 answers

Caused by: java.lang.ClassNotFoundException: org.pentaho.reporting.libraries.formula.FormulaContext

I am invoking a Pentaho Kettle job using Java. The transformations were developed using PDI CE version 5.4.0.1-130. I have added the below Maven dependencies in pom.xml: pentaho-kettle
Bhavya Ben • 21 • 2

2 votes, 1 answer

PDI's "Replace in String" step with regex doing replace twice

What I'm trying to do is just concatenate arguments to my existing string "./executable.sh", so that the output row set would look like this: ./executable.sh argument1, ./executable.sh argument3, ./executable.sh argument2, … Below is the…
ilya_i • 333 • 5 • 14
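One common reason a regex replace fires twice is a pattern loose enough to match its own output; anchoring the pattern prevents that. A small Python sketch of the idea (the pattern and the argument value here are illustrative assumptions, not taken from the question's transformation):

```python
import re

line = "./executable.sh"   # existing string field
argument = "argument1"     # hypothetical per-row argument field

# Anchor the pattern to the end of the input so the substitution can
# only match once, even though the replacement text contains ".sh" itself.
result = re.sub(r"\.sh$", ".sh " + argument, line)
print(result)  # ./executable.sh argument1
```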

2 votes, 2 answers

Pentaho Data Integration Input / Output Bit Type Error

I am using Pentaho Data Integration for numerous projects at work. We predominantly use Postgres for our databases. One of our older tables has two columns that are set to type bit(1) to store 0 for false and 1 for true. My task is to synchronize a…
John B • 159 • 3 • 14