Questions tagged [kettle]

Kettle is the code name for Pentaho Data Integration (PDI) Community Edition. It is an open source, GUI-based ETL (Extraction, Transformation, and Loading) tool that uses a metadata-driven approach.

https://help.pentaho.com/Documentation/8.2/Products/Data_Integration

1387 questions
4
votes
0 answers

Pentaho Kettle ETL + RabbitMQ plugin (Input & Output)

I am using Kettle to sync an old relational database running on several client PCs. After skimming through the book "Enterprise Integration Patterns" I became convinced that I should use a message queue (MQ) as the communication channel between…
4
votes
2 answers

Limit no. of rows in mongodb input

How can I limit the number of rows retrieved by the MongoDB Input step in a Kettle transformation? I tried the queries below in the MongoDB Input query, but none of them work: {"$query" : {"$limit" : 10}} or {"$limit" : 10} Please let me know where I am…
Deepthi
  • 79
  • 1
  • 7
4
votes
5 answers

Not able to run spoon.bat or any other batch file in Pentaho Data Integration (Kettle)

I tried pdi-ce-4.1.0-stable and pdi-ce-4.2.0-stable. My machine: Windows 7 64-bit. When I run Spoon.bat, a command-line window appears and disappears, and then nothing happens. When I tried to run it from the command line, I get DEBUG: Using JAVA_HOME DEBUG:…
utsavanand
  • 329
  • 1
  • 3
  • 15
4
votes
2 answers

Where is Pentaho Kettle's architecture?

Where can I find Pentaho Kettle architecture? I'm looking for a short wiki, design document, blog post, anything to give a good overview on how things work. This question is not meant for specific "how to" starting guides but rather a good view at…
ripper234
  • 222,824
  • 274
  • 634
  • 905
4
votes
2 answers

Kitchen getting killed

I am using Pentaho Data Integration for ETL. I am running the job on an Ubuntu server as a shell script. It runs for some time and then gets killed without throwing any error. Please help me find the problem and tell me if I am…
Cynosure
  • 161
  • 2
  • 3
  • 9
4
votes
1 answer

kettle mongo _id

I'm trying to upsert by using the _id field in Mongo. I first tried to retrieve the _id with a Json Input step, but had no luck with $._id or $._id.$oid. Does anyone know how to upsert by _id?
4
votes
1 answer

How to do loop in pentaho for getting file names?

I have 100 000 files. I want to get the names of those files and put them in a database. I have to do it like this: get 10 file names; update/insert the names into the database; move those 10 files to another directory; and loop these three…
Cynosure
  • 161
  • 2
  • 3
  • 9
3
votes
3 answers

Is there a way to load a Gzipped file from Amazon S3 into Pentaho (PDI / Spoon / Kettle)?

Is there a way to load a Gzipped file from Amazon S3 into Pentaho Data Integration (Spoon)? There is a "Text File Input" that has a Compression attribute that supports Gzip, but this module can't connect to S3 as a source. There is an "S3 CSV Input"…
misterbee
  • 5,142
  • 3
  • 25
  • 34
3
votes
1 answer

Kettle Load csv data into multiple tables

I need to load 2 database tables from a single csv file containing mixed data. I also want to maintain parent child relations using foreign key relation. Below is example of input csv file, ,,<department>,<location> John,Developer,IT,…
csv kettle
asked Mar 23 '11 at 17:34
Sam Keith
  • 339
  • 1
  • 5
  • 5
3
votes
2 answers

Missing table input task in kettle GUI

I'm using Kettle 7.0. In the design view I'm unable to find table input task. Does this require a plugin? Is it a paid for functionality?
pentaho kettle
asked Dec 21 '16 at 10:36
Kshitiz Sharma
  • 17,947
  • 26
  • 98
  • 169
3
votes
2 answers

Pentaho Kettle - Error Writing to Log File

We have a Pentaho job which is working fine in our local environment but we get an error writing to the log file after deploying it and running the job using Kettle. The error occurs in a job which has the setting 'Execute for every input row?'…
logging pentaho kettle
asked Oct 05 '16 at 14:40
Jeff Fol
  • 1,400
  • 3
  • 18
  • 35
3
votes
1 answer

PDI Kettle - How to specify ObjectId for query match in MongoDB Output

Using PDI Kettle MongoDB Output, I am trying to update a mongodb document, by querying the _id (ObjectId) field. If i pass the _id variable as String to the MongoDB Output step, the final query that gets created looks like Modifier update…
mongodb mongodb-query kettle pdi
asked Jul 31 '16 at 09:04
Mahesh
  • 123
  • 2
  • 2
  • 7
3
votes
2 answers

How to create an ArrayList object in an User Defined Java Class in Kettle?

I am trying to declare an ArrayList object in an User Defined Java Class object in pentaho kettle. I am trying a simple code inside the User Defined Java Class: import java.util.List; import java.util.ArrayList; List<String> where = new…
java pentaho kettle pentaho-spoon
asked Jun 29 '16 at 05:46
psr
  • 2,619
  • 4
  • 32
  • 57
3
votes
1 answer

How to get Field value in User Defined Java Class in kettle?

I am trying to get the Link field in the User Defined Java Class step from my below transformation. Here is the code which I have written in User Defined Java Class: private String link; public boolean processRow(StepMetaInterface smi,…
pentaho kettle pentaho-spoon
asked Jun 28 '16 at 06:01
psr
  • 2,619
  • 4
  • 32
  • 57
3
votes
2 answers

Pentaho Kettle conversion from String to Integer/Number error

I am new to Pentaho Kettle and I am trying to build a simple data transformation (filter, data conversion, etc). But I keep getting errors when reading my CSV data file (whether using CSV File Input or Text File Input). The error is: ... couldn't…
csv pentaho etl kettle data-integration
asked Jun 18 '16 at 08:05
user2552108
  • 1,107
  • 3
  • 15
  • 30