I am trying to replace part of each filename, based on a matching string from another file. The filenames are in the following format:

36872_20190806_00.csv  40800_20190806_00.csv  41883_20190806_00.csv  
38064_20190806_00.csv  40848_20190806_00.csv  41891_20190806_00.csv  
38341_20190806_00.csv  40856_20190806_00.csv  41923_20190806_00.csv  
40417_20190806_00.csv  40948_20190806_00.csv  44373_20190806_00.csv  
40745_20190806_00.csv  41217_20190806_00.csv  45004_20190806_00.csv 
40754_20190806_00.csv  41256_20190806_00.csv                

where the digits before the first _ are the station code, which I want to replace with the station name from another file named radiosonde.csv. For example, I want to

change 36872_20190806_00.csv to ALMATY_20190806_00.csv

change 38064_20190806_00.csv to KYZYLORDA_20190806_00.csv

Data of radiosonde is as given below:

CODE,LAT,LON,Elevation,STN_NAME
41620,31.35,69.467,1407,ZHOB
41600,32.5,74.5333,255,SIALKOT
41598,32.9333,73.7167,232,JHELUM
41594,32.05,72.667,188,SARGODHA
41571,33.6167,73.1,507,ISLAMABAD_AIRPORT
41560,33.8667,70.0833,1725,PARACHINAR
41529,34.0333,71.9333,329,PESHAWAR
41516,35.9167,74.3333,1453,GILGIT
41515,35.5667,71.7833,1464,DROSH
41506,35.9217,71.8,1499,CHITRAL
41316,17.0439,54.1022,23,SALALAH_AIRPORT
41288,20.667,58.9,19,MASIRAH
41256,23.5953,58.2983,8.4,MUSCAT_INTL_AIRPORT
41217,24.4333,54.65,16,ABU_DHABI_INTL_AIRPOR
41169,25.2731,51.6081,4,HAMAD_INTL_AIRPORT
40990,31.5,65.85,1010,KANDAHAR_AIRPORT
40948,34.55,69.2167,1791,KABUL_AIRPORT
40938,34.217,62.217,977,HERAT
40913,36.6667,68.9167,433,KUNDUZ
40911,36.7,67.2,378,MAZAR-I-SHARIF
40875,27.2167,56.3667,10,BANDARABBASS
40856,29.4667,60.8833,1370,ZAHEDAN
40848,29.5333,52.6,1484,SHIRAZ
40841,30.25,56.9667,1748,KERMAN
40821,31.9,54.2833,1238,YAZD
40811,31.3333,48.6667,20,AHWAZ
40809,32.8667,59.2,1491,BIRJAND
40800,32.5175,51.7061,1550.4,ESFAHAN
40754,35.6833,51.3167,1204,TEHRAN-MEHRABAD
40745,36.2667,59.6333,999,MASHHAD
40427,26.267,50.617,2,BAHRAIN
40417,26.45,49.8167,22,KING_FAHD_INTL_AIRPORT
40416,26.267,50.167,19,DHAHRAN
3992,10.83,106.97,11,AN_LOC
38989,35.9,62.9667,375,TAGTABAZAR
38954,37.5,71.5,2077,KHOROG
38927,37.233,67.267,310,TERMEZ
38880,37.987,58.361,211,ASHGABAT_KESHI
38836,38.55,68.783,800,DUSHANBE
38750,37.467,53.967,-22,ESENGYLY
38687,39.083,63.6,190,CHARDZHEV
38613,40.917,72.95,765,DZHALAL-ABAD
38606,40.55,70.95,499,KOKAND
38599,40.217,69.733,427,KHUDJAND
38507,40.0333,52.9833,90,TURKMENBASHI
38457,41.267,69.267,493,TASHKENT
38413,41.733,64.617,237,TAMDY
38392,41.833,59.983,87,DASHKHOVUZ
38353,42.833,74.583,760,BISHKEK
38341,42.85,71.3,652,TARAZ
38064,44.7667,65.5167,133.4,KYZYLORDA
38001,44.55,50.25,-25,FORT SHEVCHENKO
37985,38.733,48.833,-11,LANKARAN
37860,40.5333,50,27,MASHTAGA
36974,41.433,76,2041,NARYN
36872,43.3633,77.0042,662.7,ALMATY
36859,44.167,80.067,645,ZHARKENT
3369,22.77,88.37,0,BARAKPUR
3368,25.88,89.43,0,LALMANIR_HAT

I looked into this question. As suggested there, I tried :

sort -r radiosonde.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }'  | bash

It worked to some extent: it renamed some files, left a few as they were, and gave these errors:

bash: line 25: unexpected EOF while looking for matching `''
bash: line 113: syntax error: unexpected end of file

I don't understand why it behaves so strangely with some files. If I take those filenames, put them into another file, say test.csv, and run the above command again, i.e.

sort -r test.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }'  | bash

then it renames all the files that were left earlier. Is there any way to do this with a shell script? I tried the following script, but it didn't work:

for file in *00.csv ; do 
         mv $files ${files/" $1 "/" $5 "}; 
done < radiosonde.csv
Ajay
  • The suggestion you're trying to follow is a bad idea implemented poorly. Copy-paste the shell code it outputs into http://shellcheck.net and it'll tell you about **some** of the issues. It's hard enough to write shell correctly, never mind trying to write awk correctly to generate correct shell, and it's completely unnecessary to do so. Please reduce your example to, say, 3 files instead of 50 or however many that is and the same for number of lines in your CSV and add the expected output given that input so we can help you. – Ed Morton Jan 30 '21 at 16:58
  • Again, please reduce your example to, say, 3 files instead of 50 (or 17 now) or however many that is and **the same for number of lines in your CSV** and **add the expected output** given that input so we can help you. There's just no reason for us to have to wade through all of that data to try to understand your problem and we can't test a potential solution without you providing the expected output. When asking questions always post a [mcve], emphasis on **Minimal**. – Ed Morton Jan 30 '21 at 17:51
  • @EdMorton I have edited my question and mentioned the required output format. If I'll reduce number of lines in ```radiosonde.csv``` then it works well. I have to rename hundreds of files and above command behaves strangely if number of lines in ```radiosonde.csv``` are more. – Ajay Jan 30 '21 at 18:03
  • Last time: Your list of files is still needlessly 17 files when 3 files would do, your CSV is still needlessly about 50 lines long when 3 lines would do, and mentioning the output format (or just showing 2 lines of output for 17 lines of input) is not adequate - **show the actual expected output** given the input you posted. See [ask]. – Ed Morton Jan 30 '21 at 18:09
  • @EdMorton For 3 or 10 files the above command works fine, so it won't be an issue there. The expected output is given in the question: I want to change the station code (numbers before first _) to the corresponding station name from radiosonde.csv, e.g. ```36872_20190806_00.csv``` to ```ALMATY_20190806_00.csv```. Here ALMATY is the station name corresponding to station code ```36872``` in radiosonde.csv – Ajay Jan 30 '21 at 18:13
  • I tried. Good luck. – Ed Morton Jan 30 '21 at 18:14
  • ``Data of radiosonde is as given below:`` What is the filename of this ``Data of radiosonde``? – Darkman Jan 31 '21 at 01:25
  • @Darkman ```radiosonde.csv``` is the filename which has information about station names. – Ajay Jan 31 '21 at 02:47
  • Your title ``awk renamed few files and left few to renamed``. ``left few to renamed`` in what conditions? Did you mean that you wanted to rename part of the filename for all files and left nothing? – Darkman Jan 31 '21 at 02:56
  • @Darkman I want to replace station code (numbers before first _) in filename for all files with its associated station name from ```radiosonde.csv``` file. The command I used did the job for some files. It renamed some files and didn't work for some files (i.e. it didn't renamed some files). – Ajay Jan 31 '21 at 03:09

1 Answer

What about this:

Make sure that radiosonde.csv is in the same directory as all the csv files that you want to rename.

$ cd <directory of radiosonde.csv, 36872_20190806_00.csv, 38064_20190806_00.csv and so on...>
$ ls *.csv > .tmp; awk -F ',' '{name[$1]=$5}END{for(;(getline filename < ".tmp")>0;){ori=filename;sub(/_.+$/,"",filename);pre=filename;sub(/^[0-9]+/,"",ori);post=ori;if(name[pre]!="")system("mv " pre post " " name[pre] post)}} ' 'radiosonde.csv'
$ rm -f '.tmp'

Explanation:

  • ls *.csv > .tmp -> List all files in current dir and write them into .tmp
  • awk -F ',' -> Set , (comma) as the field separator for awk. Because we want to split lines like 41620,31.35,69.467,1407,ZHOB into separate fields. Then we can get them via $1, $2, $3 and so on.
  • '{ ... }END{}' -> These are awk's blocks. The first block runs for every line of the input file, and the END block is executed after all input has been read, just before awk exits.
  • 'radiosonde.csv' -> Set this as the input file for awk to read.
  • '{name[$1]=$5}' -> $1 is the first field and $5 is the fifth one. In this case $1 would be 41620, 41600 and so on, and $5 would be ZHOB, SIALKOT, etc. name is an array. When we read the first line (the header), we set name[CODE]=STN_NAME, and for the second line name[41620]=ZHOB.
  • END{} -> After we have set all the variables we need, we have to rename the files, and END{} is one of the blocks we can use for that purpose.
  • for(;(getline filename < ".tmp")>0;) {} -> This is for reading .tmp file that contains list of files that we want to rename.
  • ori=filename; -> Copy filename into another variable. sub() alters the variable it is given, and we still need the original filename to get the remaining part of it.
  • sub(/_.+$/,"",filename); -> This removes the characters we don't want, in this case everything from the first _ to the end. For example, if filename is 41620_20190806_00.csv, _20190806_00.csv will be removed and filename will become 41620.
  • pre=filename; -> Set filename to another variable called pre for clarity.
  • sub(/^[0-9]+/,"",ori); -> This will remove the leading numbers so ori will become _20190806_00.csv.
  • post=ori; -> Set ori to another variable in this case post.
  • if(name[pre]!="") -> Because radiosonde.csv is listed in .tmp and is not one of the files we want to rename, we need this if statement so that the next command does not produce an error: name["radiosonde.csv"] is empty. The same check skips any file whose station code is not in radiosonde.csv; without it such a file would be renamed to just _20190806_00.csv.
  • system("mv " pre post " " name[pre] post) -> This statement renames the file. If pre is 41620 and post is _20190806_00.csv, it translates into "mv 41620_20190806_00.csv ZHOB_20190806_00.csv".
  • rm -f '.tmp' -> Delete .tmp file because we don't need it anymore.
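
For reference, here is the same logic spread over several lines, as a minimal sketch rather than a drop-in replacement. It runs in a scratch directory made with mktemp, with a hypothetical sample of two stations and three files taken from the question. It uses a while loop instead of the empty for(;;), and it wraps the two file names in double quotes so that a station name containing a space stays a single argument to mv; that quoting is an addition and not part of the one-liner above.

```shell
#!/bin/sh
# Work in a throwaway directory so that no real data is touched.
demo=$(mktemp -d) && cd "$demo" || exit 1

# Hypothetical sample: two stations, three data files (one with an unknown code).
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
              '36872,43.3633,77.0042,662.7,ALMATY' \
              '38064,44.7667,65.5167,133.4,KYZYLORDA' > radiosonde.csv
touch 36872_20190806_00.csv 38064_20190806_00.csv 44373_20190806_00.csv

ls *.csv > .tmp
awk -F ',' '
    # Main block: runs once per line of radiosonde.csv.
    { name[$1] = $5 }                     # station code -> station name

    # END block: runs after the last line of radiosonde.csv has been read.
    END {
        while ((getline filename < ".tmp") > 0) {
            pre = filename
            sub(/_.+$/, "", pre)          # 36872_20190806_00.csv -> 36872
            post = filename
            sub(/^[0-9]+/, "", post)      # 36872_20190806_00.csv -> _20190806_00.csv
            if (name[pre] != "")          # skip radiosonde.csv and unknown codes
                system("mv \"" pre post "\" \"" name[pre] post "\"")
        }
    }
' radiosonde.csv
rm -f .tmp

ls
```

With this sample, the final ls should list ALMATY_20190806_00.csv and KYZYLORDA_20190806_00.csv, plus 44373_20190806_00.csv and radiosonde.csv unchanged, because neither of those has an entry in the name array.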

Ignore my comment below. We do need the if statement.
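
On the "can it be done with a shell script" part of the question: one possible approach, sketched here under assumptions, is to read radiosonde.csv row by row and rename whatever files start with that row's code. The scratch directory and the three sample rows are only for illustration, and the sketch assumes the CSV has Unix line endings and exactly five comma-separated columns, as in the data shown.

```shell
#!/bin/sh
# Work in a throwaway directory so that no real data is touched.
demo=$(mktemp -d) && cd "$demo" || exit 1

# Hypothetical sample taken from the question, including a name with a space.
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
              '36872,43.3633,77.0042,662.7,ALMATY' \
              '38001,44.55,50.25,-25,FORT SHEVCHENKO' > radiosonde.csv
touch 36872_20190806_00.csv 38001_20190806_00.csv 44373_20190806_00.csv

# Split each row on commas; the three middle columns are read and discarded.
while IFS=, read -r code _ _ _ name; do
    [ "$code" = "CODE" ] && continue          # skip the header row
    for file in "$code"_*.csv; do
        [ -e "$file" ] || continue            # no file for this station
        # ${file#"$code"} strips the leading code and keeps _20190806_00.csv
        mv -- "$file" "$name${file#"$code"}"
    done
done < radiosonde.csv

ls
```

The loop in the question could not work as posted: it iterates over file but expands $files, and $1 and $5 are awk field references that the shell does not fill in from radiosonde.csv. Because this sketch never generates shell code from the data, spaces or quote characters in a station name are passed to mv as plain text.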

Darkman
  • Wow...It worked flawlessly for all files. Can you please explain syntax of ```awk``` command you used. – Ajay Jan 31 '21 at 05:19
  • That ``if(name[pre]="")`` statement is probably unnecessary. Well doesn't matter I guess. – Darkman Jan 31 '21 at 09:26