
What is the proper, best-practice way to configure my transformations?

In other words, imagine I have a large ETL solution based on Kettle that connects to several different data sources. I would like to define these data sources in one centralized location and have each transformation look them up every time it needs to connect somewhere.

SSIS has package configurations for this. What is the equivalent in Pentaho?

PS: I do not want to install any third-party framework.

Thank you

CoolStraw

1 Answer


This can be done in various ways.

  1. Parameterise the database connections and set the values in kettle.properties. That file can live in a shared location so that every transformation reads the same settings.

  2. As above, but populate the connection parameters by reading credentials from a database. This has to be hand-crafted, but it can be made to work, with some caveats.

  3. If you use a repository, the database connections are stored centrally anyway. So if you have a dev and a prod repository, promote the jobs and transformations but not the database connection itself. This is trickier than it sounds, though.
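To make option 1 concrete, here is a minimal sketch of what the shared kettle.properties might contain. The variable names and values are hypothetical, and I'm assuming the usual setup where Kettle reads the file from the .kettle folder under the user's home directory, or under KETTLE_HOME if that environment variable is set:

```properties
# kettle.properties (sketch; all names and values below are made up)
# One copy per environment (dev, prod, ...), same keys, different values.
DWH_DB_HOST=db-dev.example.internal
DWH_DB_PORT=5432
DWH_DB_NAME=warehouse
DWH_DB_USER=etl_user
DWH_DB_PASSWORD=changeme
```

In the database connection dialog you would then enter ${DWH_DB_HOST}, ${DWH_DB_PORT} and so on in place of literal values, so the transformation itself never changes between environments. Pointing KETTLE_HOME at a shared directory is one way to get the "shared area" effect, though you should check how that behaves on your own install.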

In any case, the new 4.4(?) release should have proper lifecycle management, which should make all of this a lot easier to deal with.

Codek
  • Meaning I don't have this yet out of the box, that's too bad. Thanks – CoolStraw Jun 07 '12 at 07:14
  • Don't let that put you off though. Kettle is incredibly powerful, and I've certainly found it to be a very valuable tool. – Codek Jun 08 '12 at 08:29
  • I am not off at all :) just would have loved to have something outta box since it's like a standard thing; dynamic configuration.. :) thx – CoolStraw Jun 08 '12 at 09:01