I am using Talend to do some ETL on tables from my database.

I need to run the same tMap operation between the same tables three times, each time joining on different fields.

Since the lookup table is big (100 million records), I am wondering whether there is a way to load it just once and use the same lookup table for the three tMap components.

Thanks.

Valerio Storch

1 Answer

You can read the table once, write it to a tHashOutput component, and then use tHashInput components to read the data back from tHashOutput.

Below is a simple job design:

[image: job design]

Since I do not have a database connection, I am using static input from a tFixedFlowInput component. Below is the input data I am using:

[image: input data]

  • I store it in the tHashOutput_1 component.
  • Then I read the same data from tHashOutput_1 using three tHashInput components.
  • In each tMap component, I join on a different field, as shown below:

[image: tMap join settings]

This approach should solve your problem.
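To make the idea concrete, here is a minimal plain-Java sketch of the pattern: read the data once, keep it in memory, and build a separate lookup index per join field. This is only an analogy, not the code Talend generates. `LookupRow`, `loadOnce`, and `indexBy` are hypothetical names, and the three sample rows stand in for the real table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SharedLookupSketch {

    // Hypothetical lookup row; the field names are illustrative only.
    static final class LookupRow {
        final int id;
        final String code;
        final String name;

        LookupRow(int id, String code, String name) {
            this.id = id;
            this.code = code;
            this.name = name;
        }
    }

    // Stands in for the single expensive read (the role tHashOutput plays in the job).
    static List<LookupRow> loadOnce() {
        List<LookupRow> rows = new ArrayList<>();
        rows.add(new LookupRow(1, "A", "alpha"));
        rows.add(new LookupRow(2, "B", "beta"));
        rows.add(new LookupRow(3, "C", "gamma"));
        return rows;
    }

    // Builds one index over the cached rows for a given join key.
    // Each index is a cheap in-memory pass; the source is not read again.
    static <K> Map<K, LookupRow> indexBy(List<LookupRow> cache, Function<LookupRow, K> key) {
        Map<K, LookupRow> index = new HashMap<>();
        for (LookupRow row : cache) {
            index.put(key.apply(row), row); // on duplicate keys the last row wins
        }
        return index;
    }

    public static void main(String[] args) {
        List<LookupRow> cache = loadOnce(); // one read

        // Three lookups on different fields, all backed by the same cached rows.
        Map<Integer, LookupRow> byId = indexBy(cache, r -> r.id);
        Map<String, LookupRow> byCode = indexBy(cache, r -> r.code);
        Map<String, LookupRow> byName = indexBy(cache, r -> r.name);

        System.out.println(byId.get(2).name);
        System.out.println(byCode.get("C").name);
        System.out.println(byName.get("alpha").id);
    }
}
```

The three indexes share the same row objects, so the extra cost per join is one in-memory pass rather than another query against the database.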

Note: If you cannot find the tHashOutput and tHashInput components in your palette, you can follow these steps.

Viki888
  • Hi. I tried your solution but it does not seem to work. The lookup tables keep loading separately, one after another. ![image](http://imgur.com/HDzTBqM) – Valerio Storch Apr 20 '17 at 09:10
  • Why does the row count vary between the two `tHashInput` components? – Viki888 Apr 20 '17 at 09:16
  • The rows are different because it is still running, in fact the line is blue. – Valerio Storch Apr 20 '17 at 09:19
  • No. My question is this: you have saved the data in the `tHashOutput` component, so when you read the same data in the `tHashInput` components, the number of rows should be the same. But in your case, `tHashInput_1` has the same number of rows as `tHashOutput_1`, while `tHashInput_2` does not. – Viki888 Apr 20 '17 at 09:21
  • Sorry, I don't understand your point. **tHashInput_1** had finished loading all the rows, **20782613** of them, exactly the same as **tHashOutput**. **tHashInput_2** was still loading at the time of the screenshot, and thus the number is different. – Valerio Storch Apr 20 '17 at 09:25
  • Oh, OK, fine. I misunderstood. So what is the issue now? – Viki888 Apr 20 '17 at 10:02
  • It seems like the lookups are reloading when I run the job. However, this way they are much faster, so I used it. – Valerio Storch Apr 28 '17 at 16:09
  • If it solves your problem, please accept it as the correct answer, which may help others. – Viki888 May 17 '17 at 15:09
  • Done! Sorry I forgot it ;) – Valerio Storch May 22 '17 at 14:42