I hope this cut-down version of my previous question sounds much better than earlier. I have a .CSV file (frequency - weekly) whose header contains the year-week value in two of the columns which keeps changing every week. For an instance please see below.
store nbr,city,state,region,item nbr,category,201420 sales,201420 sold qty
1001,Kansas City,MO,2,30945232,7,$11,1
1003,Kansas City,KS,3,30945742,6,$40,4
2004,New York,NY,6,30995232,11,$72,12
6003,Sacramento,CA,4,3025742,3,$24,6
9003,Cleveland,OH,9,3057432,12,$44,11
6009,Los Angeles,CA,4,3098294,42,$35,7
I have to check if the data is placed under the right column and then proceed loading that data into the datamart. For this reason, I tried to create a standard header to compare it with the header of this file, which would be something like below:
store nbr,city,state,region,item nbr,category,XXXXXX sales,XXXXXX sold qty
If my file (that I am about to load) is of the format above, my process should proceed loading the data. If the columns are not in their places, lets say as below
store nbr,region,city,state,category,item nbr,201420 sold qty,201420 sales
1001,2,Kansas City,MO,7,30945232,1,$11
1003,3,Kansas City,KS,6,30945742,4,$40
2004,6,New York,NY,11,30995232,12,$72
6003,4,Sacramento,CA,3,3025742,6,$24
9003,9,Cleveland,OH,12,3057432,11,$44
6009,4,Los Angeles,CA,42,3098294,7,$35
then the process should fail. My main concern is - due to the dynamic nature of the fields 7 and 8, I am not able to create/match the data with the standard header every week. There is no date in the field name and I cannot use sysdate (as I might have to use old files with very old dates in the 7th and 8th fields). Could you please suggest me how can I achieve this either using cat/awk/sed
or by using any wildcards for the Year-week value in the header?