
I need to run five child jobs in Talend in parallel. My select query returns five different IDs, and for each ID I need to run one instance of the job. The problem with the tParallelize component is that it does not let me pass context variables (the ID, in this case) to each subjob.

select id from table limit 5; ----> five different instances of the same job, each with a different id as its parameter

Any help would be highly appreciated.

Thanks.

I Bajwa PHD
Nitish Sharma

1 Answer


I'm not sure I fully understand what you're doing here, but if you break out each of those IDs and store them as five separate context variables, each job can read its own context variable holding the right ID.

So I would start with your database input component (just select the IDs you want) and feed that into a tFlowToIterate. Connect this via an iterate link to a tFixedFlowInput component and create two fields in its schema, "key" and "value". Use the inline table to set "key" to ((Integer)globalMap.get("tFlowToIterate_1_CURRENT_ITERATION")) and "value" to ((String)globalMap.get("row1.SupplierPartNumber")). SupplierPartNumber is the column from my example; use your own ID column there.
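To make the two expressions concrete, here is a minimal sketch of what they evaluate to on one iteration. It uses a plain `HashMap` as a stand-in for Talend's `globalMap`; the class name, the helper `keyAndValue`, and the sample values are assumptions for illustration, not anything Talend generates.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalMapLookupSketch {

    // Evaluates the same two lookups the tFixedFlowInput inline table uses
    // and returns them as a {key, value} pair.
    static String[] keyAndValue(Map<String, Object> globalMap) {
        Integer key = (Integer) globalMap.get("tFlowToIterate_1_CURRENT_ITERATION");
        String value = (String) globalMap.get("row1.SupplierPartNumber");
        return new String[] { String.valueOf(key), value };
    }

    public static void main(String[] args) {
        // Stand-in for what tFlowToIterate would have stored (sample values are made up).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFlowToIterate_1_CURRENT_ITERATION", 1);
        globalMap.put("row1.SupplierPartNumber", "ABC-123");

        String[] pair = keyAndValue(globalMap);
        System.out.println(pair[0] + " -> " + pair[1]);
    }
}
```

Note that the `(String)` cast only works if the column really is a String; an integer ID column would presumably need an `(Integer)` cast instead.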

[Screenshot: iterating through the returned IDs, storing them in the globalMap, and retrieving them]

I'd then pass this into a tMap component, where I'd put "ContextNumber" + row2.key into the mapped key column, just to make the context name more obvious than a bare iteration number, and then feed that directly into a tContextLoad.

[Screenshot: mapping the iteration value]

From there you can connect OnSubjobOk to your tParallelize component and link all your jobs to it. Configure each job to use its own context variable.
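As a rough picture of the whole flow, the sketch below mimics it in plain Java: build one "ContextNumber<n>" entry per ID, then run one task per entry in parallel, each reading only its own key. This is not Talend-generated code. The class and method names are hypothetical, `runChildJob` is a stand-in for a real child job, and numbering the keys from 1 is an assumption about how the iteration counter behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ContextPerJobSketch {

    // Mirrors tFlowToIterate -> tFixedFlowInput -> tMap -> tContextLoad:
    // one "ContextNumber<n>" key per ID (numbering from 1 is an assumption).
    static Map<String, String> buildContext(List<String> ids) {
        Map<String, String> context = new LinkedHashMap<>();
        int iteration = 1;
        for (String id : ids) {
            context.put("ContextNumber" + iteration, id);
            iteration++;
        }
        return context;
    }

    // Hypothetical stand-in for a child job that reads its own context variable.
    static String runChildJob(String contextKey, Map<String, String> context) {
        return contextKey + " processed id " + context.get(contextKey);
    }

    // Mirrors tParallelize: start every job at once, then wait for all of them.
    static List<String> runAllInParallel(Map<String, String> context) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, context.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String key : context.keySet()) {
                futures.add(pool.submit(() -> runChildJob(key, context)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // collected in submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException("child job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> context =
                buildContext(Arrays.asList("101", "102", "103", "104", "105"));
        for (String line : runAllInParallel(context)) {
            System.out.println(line);
        }
    }
}
```

The context map is filled completely before any job starts and is only read afterwards, which is why sharing it between threads is safe in this sketch.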


ydaetskcoR
  • A much simpler solution is to point the iterate link at the tRunJob (set how many threads you want to have), then pass the context parameter to the child job from row1.param1. This way you can run it single-, double-, or multi-threaded. – Balazs Gunics Jan 25 '14 at 09:35
  • @BalazsGunics If we do not already know how many threads we want (the number of tuples selected from the data source), is there a way to use the parallel execution feature of the iterate link? – Raphael Royer-Rivard Jul 09 '14 at 20:26
  • You can do something like: "select id from table limit " + context.threadCount and then use the same context.threadCount on the iterate link :) – Balazs Gunics Jul 10 '14 at 18:55
  • I connected tParallelize to five tLoops. I want to perform more stuff after the parallelization work. How do I place a connecting OnSubJobOk? – pitchblack408 Sep 07 '18 at 00:05
  • @pitchblack408 you should ask a new question, linking to this one if you feel like it's appropriate. I haven't touched Talend in years so not the right person to answer anymore either. – ydaetskcoR Sep 07 '18 at 06:53
  • https://stackoverflow.com/questions/52213647/i-have-to-perform-more-stuff-after-the-parallelization-work-using-talend-studio/52218047#52218047 – pitchblack408 Oct 26 '18 at 20:59
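The simpler approach from the comments (a parallel iterate link onto a tRunJob, with the thread count shared between the query and the link) can be pictured with the plain-Java sketch below. It is only an analogy for how a fixed number of threads works through the IDs. The class and method names are hypothetical and `threadCount` stands in for context.threadCount.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelIterateSketch {

    // Same string concatenation as in the comment, with threadCount
    // standing in for context.threadCount.
    static String buildQuery(int threadCount) {
        return "select id from table limit " + threadCount;
    }

    // One task per ID, run on at most threadCount threads, each task
    // receiving its ID directly as a parameter (like row1.param1 -> child context).
    static List<String> runChildJobs(List<String> ids, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threadCount));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String id : ids) {
                futures.add(pool.submit(() -> "child job ran with id=" + id));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // collected in submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException("child job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threadCount = 2; // 1 = single-threaded, 2 = double, and so on
        System.out.println(buildQuery(threadCount));
        for (String line : runChildJobs(Arrays.asList("7", "8", "9"), threadCount)) {
            System.out.println(line);
        }
    }
}
```

Because each ID is handed straight to its task, no per-job context variable needs to be prepared in advance, which is what makes this variant shorter than the tContextLoad route.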