I want to create a job using Talend that imports multiple xls files to a MySQL database . I don't know the schema of those files (number of columns and names) but i can suppose that the header row corresponds to the columns names. My idea is to create a folder for each xls schema that contains the xls files and a configuration file(describes the schema and the mapping to the mySQL table). All xls files must be mapped to a single table that already exists. I wonder if this is feasible with Talend open Studio ? Any other solution is more than welcome.
Asked
Active
Viewed 764 times
0
-
Are the names of the table columns the same across the schema. I mean, Is the column "xxXXyyKK" also containing the same data in the column "xxXXyyKK" from another schema – user853710 Apr 27 '16 at 08:57
1 Answers
0
There is a a component on the Talend Exchange that you could utilize for this. The Component is for Excel specific, because it is not base on A fixed schema (like Talend is used for) and works by reading row by row, But it is Cell based and reads a cell by cell.
What I would do is to have a job to identify the schema, and depending on it would generate different insert queries. I would not have a job for every schema, but have the same job for all of them.
1. First job would go cell by cell through the first row (containing the header names) and populate them into a flow. (Row:1, Column:B, Value: "FullName")
2. second part of the job would be to go through cell by cell for the values and get them in a Key-Value way (Row:3, Column:B, Value: "Peter Jackson") into another flow
3. Denormalising the second flow could give you all of the values int the same Excel row.
4. A tMap woud give you the possibility to exchange the Column B with the Fullname attribute and have a concatenation going on that would give you the insert statement.
This is how I would approach it. It would take about an hour or two to do it, but should be doable

user853710
- 1,715
- 1
- 13
- 27
-
Ok . But what i need is to use an external configuration file specifying the input and output column. For exemple if i have an xls file with three columns (name,age,sex) and a config file specifying that the "name" column of this file should be mapped to "user" column of my table , can the tmap component load this dynamically? – Ghazouani Med amine Apr 27 '16 at 09:37
-
well, the first step is to load the column names, there you could pull any other config and match it by name and change what ever you need. Call it Step 1.1. If you do not want to use a column, you would simply leave it out of the config then and it would not be matched by the flow in the step 1. Joining with the flow from step 2 would resolut in only the data you want to be processed. I see no problem. – user853710 Apr 27 '16 at 10:28
-
If you have any problems, ask your boss to hire me as an consultant for a couple of hours. ;) – user853710 Apr 27 '16 at 10:29