2

Requirement is, The source files structure will be changed on daily basis / dynamically. how we can achieve in Informatica could:

For example,

Let's consider the source is a flat file with different formats like with header, without header, different metadata(today file with 4 columns and tomorrow its 7 different columns and day after tomorrow without header , another day file with count of records in file)

I need to consume all dynamically changed files in one informatica cloud mapping. could you please help me on this.

Jyo
  • 47
  • 5
  • Is there any identifier that will tell you whether the first row is a header or data in any file? Similarly for a footer record? Are fields always delimited by the same character in all files? – NickW Sep 10 '20 at 20:00
  • Nothing like that 1st row is header or data and same no information on footer record. Yes files are pipe delimited . – Jyo Sep 10 '20 at 20:39

2 Answers2

0

This is a tricky situation. I know its not a perfect solution but here is my idea-
create a source file structure having maximum number of columns of type text, say 50. Read file, apply filter to cleanup header data etc. Then use router to treat files as per their structure - may be filename can give you a hint what it contains. Once you identify the type of file, treat,convert columns according to their data type and load into correct target.
Mapping would look like Source -> SQ -> EXP -> FIL -> RTR -> TGT1, TGT2

There has to be a pattern to identify the dynamic file structure. HTH...

Koushik Roy
  • 6,868
  • 2
  • 12
  • 33
  • Thanks Koushik, I will try to implement the same .But not sure how many files are coming daily and how many types of file names are there and I have only one staging target to load all files.I should not revisit the mapping if any new file comes. Everything(file name, file structure , data) is dynamic nothing is defined as static in requirement. Any other suggestion please. – Jyo Sep 10 '20 at 18:13
  • I guess, then you need to 1. put all files in a list. 2. load them into the giant table (with max no of columns) as i said. Add a column called file name. 3. from this table load data into your table – Koushik Roy Sep 10 '20 at 18:16
  • Thanks again Koushik. I tried to put all files in one file and took source as file list but it is throwing error because of metadata is different for all files. I am using Informatica cloud. – Jyo Sep 10 '20 at 20:59
  • looks like the data type is different for all. Did you convert all data types to string ? if you convert all data type to sting in source, source qualifier it shouldnt throw error. it will load as is. – Koushik Roy Sep 11 '20 at 03:54
  • Thanks again Koushik. I am using Informactica cloud and source qualifier concept is not there. I read data as string from file. if we put all files in a list, metadata should be same for all files correct? – Jyo Sep 11 '20 at 13:05
  • I am thinking too, it's a complex situation and it seems to me your metadata(col name, length, datatype) is completely dynamic. Such unpredictable metadata can only be handled by scripts/programs. I will let you know if I have anything. – Koushik Roy Sep 12 '20 at 06:58
0

To summarise my understanding of the problem:

  1. You have a random number of file formats
  2. You don't know the file formats in advance
  3. The files don't contain the necessary information to determine their format.

If this is correct then I don't believe this is a solvable problem in Informatica or in any other tool, coding language, etc. You don't have enough information available to enable you to define the solution.

The only solution is to change your source files. Possibilities include:

  • a standard format (or one of a small number of standard formats with information in the file that allows you to programatically determine the format being used)
  • a self-documenting file type such as JSON
NickW
  • 8,430
  • 2
  • 6
  • 19
  • I don't think that will be helpful - even if you split the data how would you know what data any field contained and therefore what to do with it? – NickW Sep 11 '20 at 15:25