
If I have an Excel file with rows like this:

val1 | val2 | val3 | val4
val5 | val6 | val7 | val8

then I need the result to be this:

val1 | val2 | val3 | val4
val1 | val2 | val3 | val4
val5 | val6 | val7 | val8
val5 | val6 | val7 | val8

Is this possible with Talend?

EDIT: Notice the order of the rows. I need them to maintain order.

mastercool

2 Answers


For a pure duplication, the easiest approach is to use a tHashOutput to store the values coming from your Excel file.

Then you can read them back twice, through two tHashInput components linked to that tHashOutput, and join the two flows with a tUnite.


If you need to keep the order, you can add a tJavaRow or a tMap before the tHashOutput to add an "order" column filled from a sequence. Then you can add a tSortRow after the tUnite and sort on the new column. Finally, you remove the extra column with a tFilterColumns (or any other component).


Code for the order column:

Numeric.sequence("s1",1,1);

Note: you might have to add the tHashOutput and tHashInput components to your palette, as they are not included by default.
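To make the data flow of this job easier to follow, here is a minimal plain-Java sketch of the same steps: tag each row with a sequence number, keep the rows in memory, read them twice, append the two flows, sort on the sequence, then drop the helper column. This is not Talend-generated code; the class `DuplicateRowsInOrder`, the helper `TaggedRow` and the method `duplicateKeepingOrder` are names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DuplicateRowsInOrder {

    // One row plus the helper "order" column (hypothetical name for this sketch).
    static final class TaggedRow {
        final int order;
        final String[] values;

        TaggedRow(int order, String[] values) {
            this.order = order;
            this.values = values;
        }
    }

    public static List<String[]> duplicateKeepingOrder(List<String[]> rows) {
        // Step 1: tag each row with 1, 2, 3, ... (the role played by the sequence column).
        List<TaggedRow> stored = new ArrayList<>();
        int seq = 1;
        for (String[] row : rows) {
            stored.add(new TaggedRow(seq++, row));
        }

        // Step 2: read the stored rows twice and append both flows (the tUnite step).
        List<TaggedRow> united = new ArrayList<>(stored);
        united.addAll(stored);

        // Step 3: sort on the sequence column so that duplicates end up next to each other.
        united.sort(Comparator.comparingInt((TaggedRow t) -> t.order));

        // Step 4: drop the helper column and keep only the original values.
        List<String[]> result = new ArrayList<>();
        for (TaggedRow t : united) {
            result.add(t.values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"val1", "val2", "val3", "val4"},
                new String[] {"val5", "val6", "val7", "val8"});
        for (String[] row : duplicateKeepingOrder(rows)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Running `main` prints each input row twice, in the original file order. Because the sort key is the sequence number and not the data, the order is preserved even when the file itself is not sorted.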

Carassus

Send 2 identical inputs to a tUnite to duplicate the rows. Then send the rows to a tSortRow to sort them.


The 2 tFixedFlowInput components are identical; replace them with your own input.


Click Sync columns on the tUnite. Set the columns to sort on in the tSortRow.


Output:

.---------+----------+----------+----------.
|                tLogRow_1                 |
|=--------+----------+----------+---------=|
|newColumn|newColumn1|newColumn2|newColumn3|
|=--------+----------+----------+---------=|
|val1     |val2      |val3      |val3      |
|val1     |val2      |val3      |val3      |
|val5     |val6      |val7      |val8      |
|val5     |val6      |val7      |val8      |
'---------+----------+----------+----------'
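As a rough illustration of what this job does to the data, the plain-Java sketch below appends two identical inputs and sorts the result column by column. It is not Talend code; `UniteAndSort`, `compareRows` and `uniteAndSort` are hypothetical names. One thing the sketch makes visible: sorting on the data columns returns the rows in sorted order, which matches the original file order only if the file was already sorted on those columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UniteAndSort {

    // Compare two rows column by column, left to right, as plain strings.
    static int compareRows(String[] a, String[] b) {
        int shared = Math.min(a.length, b.length);
        for (int i = 0; i < shared; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static List<String[]> uniteAndSort(List<String[]> first, List<String[]> second) {
        // Append the second flow to the first one (the tUnite step).
        List<String[]> united = new ArrayList<>(first);
        united.addAll(second);
        // Sort on every column (the sort step); identical rows become adjacent.
        united.sort(UniteAndSort::compareRows);
        return united;
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
                new String[] {"val1", "val2", "val3", "val4"},
                new String[] {"val5", "val6", "val7", "val8"});
        // The same input is passed twice, standing in for the two identical inputs.
        for (String[] row : uniteAndSort(input, input)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```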
RealHowTo
  • Depending on the size of the initial file, reading it twice could be quite heavy. The same applies if there are a lot of columns and you have to sort by each of them. That said, if that's not a problem, it is indeed simpler to do it that way. – Carassus Oct 06 '20 at 12:15
  • Do I have to type out all of the columns manually like you do in your second step? @RealHowTo – mastercool Oct 06 '20 at 12:32
  • Because my input is a tFileInputExcel – mastercool Oct 06 '20 at 12:33
  • @mastercool, yes, you have to do it at least once. Make a schema (metadata/generic schema); then you will be able to reuse it in all your components by selecting `schema>Repository>Generic>@your_schema@` – RealHowTo Oct 06 '20 at 13:00
  • dang, unfortunately I have over 100 rows and 20 columns. That would take a very long time – mastercool Oct 06 '20 at 13:12
  • @mastercool, when you define a schema, you only give the column names. If you create a File Excel schema, Talend can extract the column names for you. – RealHowTo Oct 06 '20 at 13:20
  • Oh gotcha. But then I still manually have to type in the values for each row right? – mastercool Oct 06 '20 at 13:21
  • No. In your tFileInputExcel, you select the `schema>Repository>` (Generic or File Excel) schema that you previously defined, and you give the name of the Excel file that you want to process. At execution, the data read from the file is mapped to the right columns. – RealHowTo Oct 06 '20 at 13:26
  • Haven't tried it yet, but this sounds like it should work for me. Thanks! – mastercool Oct 06 '20 at 13:35