I have a data file whose lines contain a large number (~5K) of dates in the format yy-mm-dd.
A typical line of the file could be:
bla bla 21-04-26 blabla blabla 18-01-28 bla bla bla bla 19-01-12 blabla
I need to apply this kind of replacement to every single date:
$ date --date="18-01-28" "+%A, %d %B %Y"
Sunday, 28 January 2018
I already solved this problem using sed (see the post scriptum for details).
I would like to use gawk instead. I came up with this command:
$ gawk '{b = gensub(/([0-9]{2}-[0-9]{2}-[0-9]{2})/,"$(date --date=\"\\1\" \"+%A, %d %B %Y\")", "g")}; {print b}'
The problem is that the date command inside gensub is never executed: the command substitution ends up in the output as literal text. In fact I obtain:
$ echo "bla bla 21-04-26 blabla blabla 18-01-28 bla bla bla bla 19-01-12 blabla" | gawk '{b = gensub(/([0-9]{2}-[0-9]{2}-[0-9]{2})/,"$(date --date=\"\\1\" \"+%A, %d %B %Y\")", "g")}; {print b}'
bla bla $(date --date="21-04-26" "+%A, %d %B %Y") blabla blabla $(date --date="18-01-28" "+%A, %d %B %Y") bla bla bla bla $(date --date="19-01-12" "+%A, %d %B %Y") blabla
I do not understand how to modify the gawk command to obtain the desired result:
bla bla Monday, 26 April 2021 blabla blabla Sunday, 28 January 2018 bla bla bla bla Saturday, 12 January 2019 blabla
Post scriptum:
As for sed, I solved it with this script:
#!/bin/bash
# pathFile is hard-coded here
pathFile='./data.txt'
# threshold to avoid the "too many arguments" error with sed
maxCount=1000
counter=0
# list of the distinct dates in the data file
dateList=($(grep -Eo "[0-9]{2}-[0-9]{2}-[0-9]{2}" "$pathFile" | sort -u))
# string used to pass multiple instructions to sed
sedCommand=''
for item in "${dateList[@]}"
do
    # one substitution per date, e.g. s/18-01-28/Sunday, 28 January 2018/g;
    sedCommand+="s/$item/$(date --date="$item" "+%A, %d %B %Y")/g;"
    (( counter++ ))
    if [[ $counter -gt $maxCount ]]
    then
        # flush the accumulated batch and start a new one
        sed -i "$sedCommand" "$pathFile"
        counter=0
        sedCommand=''
    fi
done
# apply whatever is left in the last, partial batch
[[ -n "$sedCommand" ]] && sed -i "$sedCommand" "$pathFile"