Questions tagged [pdi]

Pentaho Data Integration (PDI), also known as Kettle, provides extraction, transformation, and loading (ETL) capabilities.

PDI (Pentaho Data Integration), formerly known as Kettle, is a data integration project. It delivers powerful extraction, transformation, and loading (ETL) capabilities, using a groundbreaking, metadata-driven approach.


440 questions
1
vote
1 answer

PDI - Can I implement Exist logic in pentaho? Or is there any way?

I wonder whether there is any way to implement EXISTS logic using PDI, like in the query below: EXISTS (SELECT a.product FROM store a, struck b WHERE b.product = a.product), to check whether the data exists in files (CSV). I know PDI provides design tools like…
Rio Odestila
  • 125
  • 2
  • 19
1
vote
3 answers

Pentaho - move files on FTP not working

I have a job that has to move a CSV file to a "processed" folder on the FTP server with date and timestamp attached to the file name. I have the following job right now: In the "Move Files" option my source and destination addresses are of the…
Anand Srinivasan
  • 450
  • 5
  • 20
1
vote
2 answers

IsNull() function not recognized in pentaho modified java script

I am trying to assign a value to an output field based on an input field inside the Modified JavaScript step. I have coded it as: if ( !(person_id.isNull()) ) person_nm = substr(another_field,1,10) else person_nm = ""; When I run this I get the…
KKISHORE
  • 55
  • 1
  • 1
  • 6
1
vote
1 answer

What is the scope of result rows in PDI Kettle?

Working with result rows in Kettle is the only way to pass lists internally in the program. But how does this work exactly? This topic has not been well documented and there are a lot of questions. For example, a job containing 2 transformations can…
Phoenexus
  • 15
  • 4
1
vote
1 answer

Pentaho DI - Getting all files from a folder where name come from previous step

Using the Get File Names component in Kettle (Pentaho Data Integration), I'm trying to get all file names (regardless of extension) from a folder whose name comes from a previous step's field, let's say folderName. I'm not getting any content from the…
mokolop
  • 11
  • 1
  • 3
1
vote
1 answer

Filter rows based on a field and create csvs on the filtered result set

I have a table tb_rawcsvdata with columns plant, employeenumber, term_dt. I need to create a CSV file per unique plant using Pentaho. What I did was create a transformation which fetches unique plants and puts them in a result set. Then I create…
Balaji
  • 150
  • 4
  • 16
1
vote
1 answer

What is the difference between ${Internal.Transformation.Filename.Directory} and ${Internal.Entry.Current.Directory}?

Which one should I use as a relative path name in a transformation: ${Internal.Transformation.Filename.Directory} or ${Internal.Entry.Current.Directory}? What is the difference?
Nae
  • 14,209
  • 7
  • 52
  • 79
1
vote
1 answer

Kettle PDI: better lookup and insert update or insert update + lookup

In Kettle, aka Pentaho Data Integration, I read an xls with some products linked to some categories and I insert them in a db. The relationship category-product is 1:n (one category has many products, each product belongs to one category). I do the insert…
Daniele Licitra
  • 1,520
  • 21
  • 45
1
vote
1 answer

Opening transformation in PDI from another version renders blank screen

I'm trying to use Spoon 7.1 (a.k.a. Kettle, PDI, Pentaho Data Integration) to open transformations and jobs exported from a previous version (3.2.0). However, if I try to either import, drag and drop or open the file, I end up with a new tab and a…
Renato Back
  • 706
  • 1
  • 10
  • 21
1
vote
1 answer

Count Filtered Rows in Kettle Transformation

Probably, there is an easy solution -- I just can't find it: In a kettle transformation rows are read from some DB table, passed on through some filtering steps and finally written to some other DB table. The filtering steps eliminate non-matching…
SirZ
  • 13
  • 4
1
vote
1 answer

Pentaho Data Integration: Get Auth Code from Oauth2

First of all, I've searched for an existing topic but couldn't find one. Here's my problem: I'm working on a PDI (Pentaho Data Integration) transformation, which should get data from the Google Search Console v3 API. A URL which gets the…
Woody
  • 23
  • 1
  • 8
1
vote
2 answers

How to Pass parameter in table input?

I have one job with two transformations in it. The first transformation gets a list of data which is passed to the second transformation, which executes once for each row passed from the first. In the second transformation I have used "get row from result" ->…
Sudha Bisht
  • 1,525
  • 3
  • 11
  • 12
1
vote
1 answer

Adding "pdi-google-spreadsheet-plugin-master" to pentaho's Kettle Spoon

I want to add the plugin "pdi-google-spreadsheet-plugin-master" to Pentaho's Kettle Spoon. I have downloaded "pdi-google-spreadsheet-plugin-master" and unzipped it to "C:\Pentaho\data-integration\plugins" but I don't know from where I can use this plugin…
Ethan
  • 35
  • 1
  • 8
1
vote
2 answers

Pentaho PDI/ Kettle read multiple lines from text file

I have a SQL file with multiple SQL statements and I need to read them from a text file using Kettle / Pentaho PDI 6.1.0. All the statements are separated by a semicolon, however each statement may span multiple lines: CREATE TABLE…
Carlos Sousa
  • 75
  • 1
  • 7
1
vote
4 answers

error org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed] on centos 7

I am new to CentOS 7. I installed Pentaho PDI 7 and ran ./spoon.sh on CentOS 7, and this error popped up: org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed] at org.eclipse.swt.SWT.error(Unknown Source) at…
Dian.Y
  • 105
  • 1
  • 5
  • 15