I'm stuck and out of ideas on how to implement a parent-child relationship in Talend.

Problem Statement:

I have a feed file with data in the following format:

MemberCode|LastName|FirstName
A|SHINE|MICHAEL 
B|SHINE|MICHELLE 
C|SHINE|ERIN 
A|RODRIGUEZ|DAMIAN 
A|PAVELSKY|STEPHEN        
B|PAVELSKY|TERESA

(There are many more columns and rows; these few rows are just for reference.) LastName and FirstName are self-explanatory. MemberCode denotes the relationship: A is the parent, B or C is a child. For a given employee the data is always sequential, meaning the parent row and all of its child rows appear in consecutive rows, with the parent first.

Expected Result:

The above data needs to be output in the following format:

MemberCode|MemberLastName|MemberFirstName|DependentLastName|DependentFirstName
A         |SHINE         |MICHAEL        |                 |                  
B         |SHINE         |MICHAEL        |SHINE            |MICHELLE          
C         |SHINE         |MICHAEL        |SHINE            |ERIN              
A         |RODRIGUEZ     |DAMIAN         |                 |                  
A         |PAVELSKY      |STEPHEN        |                 |                  
B         |PAVELSKY      |STEPHEN        |PAVELSKY         |TERESA            

What I have tried so far:

The Talend job has these components: tFileInputDelimited -> tMap -> tLogRow. The tMap logic (screenshot: tMap configuration) produces the following output:

MemberCode|MemberLastName|MemberFirstName|DependentLastName|DependentFirstName
A         |SHINE         |MICHAEL        |                 |                  
B         |              |               |SHINE            |MICHELLE          
C         |              |               |SHINE            |ERIN              
A         |RODRIGUEZ     |DAMIAN         |                 |                  
A         |PAVELSKY      |STEPHEN        |                 |                  
B         |              |               |PAVELSKY         |TERESA

How can I replicate the MemberLastName and MemberFirstName values from the MemberCode A row onto the following rows that have MemberCode B or C? Thanks in advance.

Platform: Talend Open Studio for Data Integration Version: 6.5.1

Abhishek

2 Answers

Here's the solution I put together:

(Screenshot: job layout.) You need to split your rows into parents and children based on their MemberCode. Write the parent rows to the file with DependentLastName and DependentFirstName left empty, and at the same time save the parent's names to global variables (ParentLastName and ParentFirstName) in a tSetGlobalVar.

When you move to the next row, which is a child row, your parent has already been saved as it's always the first in the group. So you can retrieve its first and last name using the global variables in the children output, and write this to the same physical file.

Both tFileOutputDelimited components have identical settings: they are in append mode, and the option "Custom the flush buffer size" is set to 1 (this is important for keeping the rows in their original order).
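For readers who would rather see the logic than the job design, here is a minimal plain-Java sketch of the same idea. It is not Talend-generated code: the class name `GlobalVarSketch` and the method `process` are made up for illustration, and a plain `HashMap` stands in for Talend's `globalMap`. The keys `ParentLastName` and `ParentFirstName` are the ones named above. It assumes every row has at least three pipe-separated fields and that a parent row always comes before its children.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalVarSketch {
    // Stand-in for Talend's globalMap (assumption: a plain HashMap, so the sketch runs on its own).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Takes input rows (without the header line) and returns the output rows in the same order.
    static List<String> process(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] f = row.split("\\|", -1);
            String code = f[0].trim();
            String last = f[1].trim();   // the sample feed has trailing spaces, so trim
            String first = f[2].trim();
            if ("A".equals(code)) {
                // Parent flow: remember the names, write the row with empty dependent columns.
                globalMap.put("ParentLastName", last);
                globalMap.put("ParentFirstName", first);
                out.add(String.join("|", code, last, first, "", ""));
            } else {
                // Child flow: read back the names saved by the most recent parent row.
                String parentLast = (String) globalMap.get("ParentLastName");
                String parentFirst = (String) globalMap.get("ParentFirstName");
                out.add(String.join("|", code, parentLast, parentFirst, last, first));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "A|SHINE|MICHAEL ", "B|SHINE|MICHELLE ", "C|SHINE|ERIN ",
            "A|RODRIGUEZ|DAMIAN ", "A|PAVELSKY|STEPHEN        ", "B|PAVELSKY|TERESA");
        process(rows).forEach(System.out::println);
    }
}
```

Because the rows are handled strictly one at a time, the child always sees the names of the parent directly above it. That is the same reason the flush buffer size matters in the real job.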

Ibrahim Mezouar

The solution provided by @iMezouar works just fine. Here is an alternative approach.

Job Layout:

(Screenshot: job layout.)

The approach is to capture the parent's LastName and FirstName from the earlier row, store them in variables inside the tMap, and then use those variables in the output row.

(Screenshot: tMap configuration.)
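Since the screenshot of the tMap may not be visible, here is a rough plain-Java sketch of the carry-forward logic. It is an illustration only, not the exact expressions from my job: the class `CarryForwardSketch`, its fields, and the methods `mapRow` and `run` are hypothetical names. The fields play the role of tMap variables that keep their value from one row to the next, and the ternary expressions are the kind of thing you would put in the variable and output columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CarryForwardSketch {
    // Play the role of tMap variables whose values survive from one row to the next.
    private String parentLastName = "";
    private String parentFirstName = "";

    // Maps one input row to one output row, updating the carried parent names first.
    String mapRow(String code, String lastName, String firstName) {
        boolean isParent = "A".equals(code);
        // On a parent row take the new names; on a child row keep the previous ones.
        parentLastName = isParent ? lastName : parentLastName;
        parentFirstName = isParent ? firstName : parentFirstName;
        // Dependent columns are filled only for child rows.
        String dependentLast = isParent ? "" : lastName;
        String dependentFirst = isParent ? "" : firstName;
        return String.join("|", code, parentLastName, parentFirstName,
                dependentLast, dependentFirst);
    }

    // Runs the mapping over pipe-delimited rows (header line excluded), in order.
    static List<String> run(List<String> rows) {
        CarryForwardSketch mapper = new CarryForwardSketch();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] f = row.split("\\|", -1);
            out.add(mapper.mapRow(f[0].trim(), f[1].trim(), f[2].trim()));
        }
        return out;
    }

    public static void main(String[] args) {
        run(Arrays.asList("A|PAVELSKY|STEPHEN        ", "B|PAVELSKY|TERESA"))
            .forEach(System.out::println);
    }
}
```

The order of the assignments matters: the carried variables have to be updated before the output columns are built, otherwise a parent row would print the previous parent's name.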

Abhishek