4

I want to ask a question about Pentaho. I'm using Pentaho data Integration (Kettle). I want to use parameter in the select statement. I have searched various forums and didn't find any solution. My problem is that I want to get result from different databases.How can I use parameter to get data from 2 or more databases? I have two transformation.

  1. select catalog_name from information_schema.schemata where catalog_name like "%db"
  2. select data_name from ${catalog_name}.dbo.database_name

There is error that says Syntax error or access violation?

  • if you are specifying alias name, then your query must be like, select schemata.catalog_name from information_schema.schemata where schemata.catalog_name like "%db" or if not then select catalog_name from information_schema where catalog_name like "%db" – Helping Hand.. May 19 '14 at 08:38
  • thanks @WorkingHard.. I'm also having problem with variable name used. – user3651704 May 19 '14 at 09:16

2 Answers2

2

The following solution mimics your task by calling the first SQL statement and using the elements of the result set of that statement as parameter to call a second SQL statement.

Transformation and details of first table input

In the second table input the ? is used as a placeholder for the parameter. Kettle seems to map incoming fields sequentially. In this case the field schema_name is the first field so it will be assigned to the first parameter. Any additional incoming field would be mapped accordingly. Note that the number of incoming fields and the number of used parameters in the SQL statement must match. Otherwise a run-time error will be raised. Also note that the ? in my solution may be specific to the MySQL syntax. From the naming pattern in your SQL statement I would assume that you are using a MS SQL Server. So the actual placeholder may be named differently.

The second table input must be configured as follows:

Configuration Details of table input 2

Note that in this case it is not necessary to mark Replace variables in script since the parameters are not regarded as variables. The entries Insert data from step and Execute for each row, however, must be set appropriately.

Marcus Rickert
  • 4,138
  • 3
  • 24
  • 29
0

It sounds like you've already set things up properly. You have a job that has your first transform for query 1, and sends the output to the job results (row results). You then run transform #2 once for each schema found passing the schema name down as a parameter or variable.

If any of this is incorrect, please specify what it is you're actually doing.

I'm guessing the only thing you missed is checking the Replace variables in script box for the table input in transform 2.

Again if this is not the case, please add more detail.

Brian.D.Myers
  • 2,448
  • 2
  • 20
  • 17