
I'm creating a Talend job that generates files combining data from tRowGenerator with data from two other sources: a SQL Server database and delimited files.

[image]

The issue is that I get duplicated rows with the same primary key. All I want is 100 records (420 rows): for each random UUID generated I should get 42 rows, and so on. Instead, I'm getting the same row 10 times.

[image]

I'm getting data from 3 sources, as shown below: [image]

To get these fields in my output file: [image] [image]

MarTech

1 Answer


If I understand correctly, you're using one of the built-in functions in tRowGenerator to get random data.
The problem is that the data generation functions that ship with Talend are not really random: they pick their values from a predefined list. You can look at the source code to verify that each list holds a hundred or so values, so you're bound to get duplicates.
To get unique values, create a Talend routine with a simple method that generates a UUID:

public class Utils {

    /**
     * getRandom: return a random UUID
     * 
     * 
     * {talendTypes} String
     * 
     * {Category} User Defined
     * 
     * {param} string("world") input: dummy input
     * 
     * {example} getRandom("world") # 01e98b98-05d6-427c-978d-1f86d0ea4712
     */
    public static String getRandom(String input) {
        return java.util.UUID.randomUUID().toString();
    }
}

You can then call this function from tRowGenerator: [image]

[image]
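To illustrate why a generator backed by a fixed list must repeat while UUIDs practically never do, here is a minimal standalone sketch. The class name `DuplicateDemo`, its methods, and the `value-N` pool are hypothetical stand-ins and not part of Talend; the pool size of 100 and the 420 draws are borrowed loosely from the numbers in this thread.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.UUID;

public class DuplicateDemo {

    // Draw `draws` values from a fixed pool of `poolSize` strings (a stand-in
    // for a generator backed by a predefined list) and count the distinct ones.
    public static int distinctFromPool(int poolSize, int draws, long seed) {
        Random rnd = new Random(seed);
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < draws; i++) {
            seen.add("value-" + rnd.nextInt(poolSize));
        }
        return seen.size();
    }

    // Draw `draws` random UUIDs and count the distinct ones.
    public static int distinctFromUuid(int draws) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < draws; i++) {
            seen.add(UUID.randomUUID().toString());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("pool: " + distinctFromPool(100, 420, 42L) + " distinct of 420");
        System.out.println("uuid: " + distinctFromUuid(420) + " distinct of 420");
    }
}
```

With 420 draws from a pool of 100 values, the distinct count can never exceed 100, so repeats are unavoidable. A collision among 420 random UUIDs is so unlikely that it can be ignored in practice.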

One more thing: I'm not sure exactly what your requirement is, but since you don't have a join key between your inputs, you are getting a cartesian join between all of them (42x298x206 rows). So you might want to define a join condition.
If you do define a join condition, make sure the tMap inputs are in the right order (you are using the tRowGenerator flow as the main connection, and the others as lookups).
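The difference between the two cases can be sketched in plain Java. This is only an illustration of the row counts, not how tMap is implemented: the class `JoinDemo`, its methods, and the two-column row layout (key in column 0) are assumptions made for the example, and the keyed variant shown is an inner-join style lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinDemo {

    // With no join key, every combination of rows is emitted,
    // so the output size is the product of the input sizes.
    public static long crossJoinSize(long... rowCounts) {
        long total = 1;
        for (long n : rowCounts) {
            total *= n;
        }
        return total;
    }

    // With a join key (assumed here to be column 0), each main row is
    // combined only with the lookup rows that share its key.
    public static List<String> joinOnKey(List<String[]> main, List<String[]> lookup) {
        Map<String, List<String>> byKey = new HashMap<>();
        for (String[] row : lookup) {
            byKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] row : main) {
            List<String> matches = byKey.getOrDefault(row[0], Collections.<String>emptyList());
            for (String match : matches) {
                out.add(row[0] + ";" + row[1] + ";" + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Row counts quoted in the answer: 42 x 298 x 206.
        System.out.println("cartesian rows: " + crossJoinSize(42, 298, 206));
    }
}
```

For the counts quoted above, the cartesian case yields 2,578,296 rows, which is why a join condition (or a deliberate decision about how the three inputs relate) matters before writing the output file.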

Ibrahim Mezouar
  • Thanks for your feedback, I really appreciate it. For the random data, I'm already using the function you mentioned to get a random UUID, but my problem is more about not finding a way to join my 3 data sources, because they have nothing in common. – MarTech May 06 '20 at 21:39
  • Ok, could you share some more detail about your input and your expected output? – Ibrahim Mezouar May 06 '20 at 21:43
  • I have 3 input sources: a delimited file, tRowGenerator, and a SQL Server DB, as shown in the pictures above, and I'm expecting to populate a CSV file from them. – MarTech May 06 '20 at 22:12
  • Yes, that part was clear enough :) I meant posting some sample input data, how it should be joined, and your expected output. – Ibrahim Mezouar May 06 '20 at 22:15
  • Yes, check my question above: I edited it to add pictures of the input and output attributes. Or do you mean data? – MarTech May 06 '20 at 22:30
  • Sample data would be great :) – Ibrahim Mezouar May 06 '20 at 22:52
  • I hope the picture containing sample data is clear :| – MarTech May 06 '20 at 23:21