0

I have been trying to do this all afternoon and cannot figure it out. I'm running MX Linux and am trying (unsuccessfully) to batch-rename a bunch of files from the command line (I have about 500, so I don't want to do this by hand) from:

2020-August-15.pdf
2021-October-15.pdf

To:

2020-08-15.pdf
2021-10-15.pdf

I cannot find anything that does this (in a way I understand), so I am wondering: is this possible, or do I have to do this by hand?

Admittedly I'm not very good with Bash but I can use sed, awk, rename, date, etc. I just can't seem to find a way to combine them to rename my files.

I cannot find anything on here that has been of any help in doing this.

Many thanks.

EDIT:

I'm looking for a way to combine commands and ideally not have to overtly for-loop through the files and the months. What I mean is I would prefer, and was trying to, pipe ls into a command combination to convert as specified above. Sorry for the confusion.

EDIT 2:

Thank you to everyone who came up with answers, and for your patience with my lack of ability. I don't think I'm qualified to decide which answer is best; however, for my use case I have settled on the following:

declare -A months=( [January]=01 [February]=02 [March]=03 [April]=04 [May]=05
[June]=06 [July]=07 [August]=08 [September]=09 [October]=10 [November]=11 [December]=12 )

for oldname in 202[01]-[A-Za-z]*-15.pdf
do
    IFS=-. read -r y m d ext <<< "${oldname}"
    mv -- "$oldname" "$y-${months[$m]}-$d.$ext"
done

I think this offers the best flexibility. I would have liked to use the date command, but I don't know how to avoid hard-coding the file extension. I was unaware of the read command, or that you could use patterns in a for loop.

I have learned a lot from this thread so again thank you all. Really my solution is a cross of most of the solutions below as I've taken from them all.

sylph
  • If you can use sed... it's not the most efficient approach, but `newname=$(echo "$oldname" | sed -e 's/January/01/' -e 's/February/02/' -e 's/March/03/' ...)` is pretty darned simple / beginner-accessible. Can you describe the specific problem you encountered trying to do this yourself? – Charles Duffy Aug 28 '21 at 17:02
  • Linked a duplicate that describes combining sed and mv. Between that and the suggestion in the comment above, you should have everything you need. – Charles Duffy Aug 28 '21 at 17:04
  • Yeah. I was trying to do a one liner to change the month name to its equivalent number. I thought you might be able to use the `date` command **within** the `rename` command as a way to avoid overtly looping through all the filenames and using some array of month names/name index type thing. I thought since there was a command that could convert a month name to a month number it would be more efficient than having to write out all names and corresponding numbers, seems that is impossible. – sylph Aug 28 '21 at 17:32
  • Assuming you have the GNU version of date(1), you could use `date -d` to map the month names to numbers: `for f in *.pdf; do IFS=- read y m d <<<"${f%.pdf}"; mv "$f" "$(date -d "$m $d, $y" +%F.pdf)"; done` – Mark Reed Aug 28 '21 at 17:36
  • Re: looping -- you don't need to call `sed` once per filename; you can have one sed command convert _all_ the input filenames, and pair its output up with the originals. That's much more efficient than calling `date` once per filename too (which is likewise pretty darned slow; not that it would matter all that much with only 500 filenames). – Charles Duffy Aug 28 '21 at 17:40
  • Anyhow, "how do I do X in a beginner-friendly way?" and "how do I do X efficiently at scale?" are two different questions, and it's not clear which one you're asking. From the body of the question, I took it to be the former. – Charles Duffy Aug 28 '21 at 17:43
  • @MarkReed are you able to explain that? Mostly the `read` part and the `<<<` bit which I've never seen before. – sylph Aug 28 '21 at 17:59
  • @CharlesDuffy I'm sorry for the confusion. I was confused myself. Are you able to demonstrate what you mean about using `sed` to convert *all* the filenames? I did it just now like: `for oldname in $(ls); do newname=$(echo $oldname | sed -e 's/January/01/' -e 's/February/02/' -e 's/March/03/' -e 's/April/04/' ...) ; mv "$oldname" "$newname" ; done` which looks horrendous. I'm guessing it's pretty inefficient: is it not calling `sed` on each filename and testing multiple patterns against each one? – sylph Aug 28 '21 at 18:03
  • With Larry Wall's `rename`: `rename 's/August/08/; s/October/10/' *.pdf` – Cyrus Aug 28 '21 at 18:14
  • @sylph see my answer for the explanation. – Mark Reed Aug 28 '21 at 23:14
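
A minimal sketch of the single-sed idea raised in the comments above, assuming Bash (for process substitution) and file names that contain no tabs or newlines. The function name plan_renames is made up for illustration, and it only prints the mv commands rather than running them.

```shell
#!/bin/bash

# Hypothetical helper: print one "mv" command per argument whose name
# contains an English month name. sed runs once for the whole list,
# not once per file.
plan_renames() {
    local old new
    while IFS=$'\t' read -r old new; do
        # Skip names that sed left unchanged.
        [[ $old == "$new" ]] && continue
        echo mv -- "$old" "$new"
    done < <(
        # paste joins line N of the original list with line N of sed's output.
        paste <(printf '%s\n' "$@") <(
            printf '%s\n' "$@" | sed \
                -e 's/January/01/' -e 's/February/02/' -e 's/March/03/' \
                -e 's/April/04/' -e 's/May/05/' -e 's/June/06/' \
                -e 's/July/07/' -e 's/August/08/' -e 's/September/09/' \
                -e 's/October/10/' -e 's/November/11/' -e 's/December/12/'
        )
    )
}
```

Calling plan_renames *.pdf in the directory shows what would happen; removing the echo inside the loop would perform the renames.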

6 Answers

3

With just Bash built-ins, try

months=(\
    January February March April May June \
    July August September October November December)

for file in ./*; do
    dst=$file
    for ((i=0; i<${#months[@]}; ++i)); do
        # Array indices start at 0, so the month number is index + 1
        printf -v num '%02d' "$((i+1))"
        dst=${dst//${months[i]}/$num}
    done
    # Skip files whose name contains no month name
    [[ $dst == "$file" ]] || mv -- "$file" "$dst"
done

This builds up an array of month names, and loops over it to find the correct substitution.

The line printf -v num '%02d' "$((i+1))" adds zero padding for single-digit month numbers; use '%d' instead if that's undesired.

As an aside, you should basically never use ls in scripts.

The explicit loop could be avoided if you had a command which already knows how to rename files, but this implements that command. If you want to save it in a file, replace the hard-coded ./* with "$@", add a #!/bin/bash shebang up top, save it as monthrenamer somewhere in your PATH, and chmod a+x monthrenamer. Then you can run it like

monthrenamer ./*

to rename all the files in the current directory without an explicit loop, or a more restricted wildcard argument to only select a smaller number of files, like

monthrenamer /path/to/files/2020*.pdf
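
For reference, a sketch of what the saved monthrenamer file could look like, following the steps described above. The helper name convert_name and the variable dst are illustrative choices, not part of any existing tool.

```shell
#!/bin/bash
# monthrenamer: rename the files given as arguments, replacing an English
# month name in each name with its two-digit number.

months=(January February March April May June
        July August September October November December)

# Hypothetical helper: store the converted form of $1 in the variable "dst".
convert_name() {
    local i num
    dst=$1
    for ((i = 0; i < ${#months[@]}; ++i)); do
        printf -v num '%02d' "$((i + 1))"   # array index 0 is month 01
        dst=${dst//${months[i]}/$num}
    done
}

for file in "$@"; do
    convert_name "$file"
    # mv refuses to rename a file onto itself, so skip unchanged names.
    [[ $dst == "$file" ]] || mv -- "$file" "$dst"
done
```

Run without arguments, the final loop does nothing, so the script is safe to try before pointing it at real files.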

You could run date twelve times to populate the array, but it's not like hard-coding the month names is going to be a problem. We don't expect them to change (and calling twelve subprocesses at startup just to avoid that seems quite excessive in this context).
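
If you did want to go that route, a sketch might look like the following. It assumes GNU date (for -d); LC_ALL=C is there to force English month names, and the year 2000 and day 15 are arbitrary placeholders.

```shell
#!/bin/bash

# Build the months array by asking GNU date for each month's full name.
# -u keeps the calculation in UTC so local time zone rules play no part.
months=()
for n in 01 02 03 04 05 06 07 08 09 10 11 12; do
    months+=("$(LC_ALL=C date -u -d "2000-$n-15" +%B)")
done

printf '%s\n' "${months[@]}"
```

That is twelve extra processes at startup to save one line of typing, which is the trade-off mentioned above.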

As an aside, probably try to fix the process which creates these files to produce machine-readable file names. It's fairly obvious to a human, too, that 2021-07 refers to the month of July, whereas going the other way is always cumbersome (you will need to work around it in every tool or piece of code which wants to order the files by name).

tripleee
  • This could perhaps work with associative arrays, too; but this should be portable back to older versions of Bash which lack that facility. In particular, MacOS still ships with Bash 3. – tripleee Aug 28 '21 at 18:28
  • you don't need the backslash on the first line, FWIW; the open paren is enough to tell bash that the array initializer continues on the next line. – Mark Reed Aug 29 '21 at 01:30
2

Assuming you have the GNU version of date(1), you could use date -d to map the month names to numbers:

for f in *.pdf; do 
  IFS=- read y m d <<<"${f%.pdf}"
  mv "$f" "$(date -d "$m $d, $y" +%F.pdf)"
done

I doubt it's any more efficient than your sed -e 's/January/01/' -e 's/February/02/' etc, but it does feel less tedious to type. :)

Explanation:

  1. Loop over the .pdf files, setting f to each filename in turn.

  2. The read line is best explained right to left:
    a. "${f%.pdf}" expands to the filename without the .pdf part, e.g. "2020-August-15".
    b. <<< turns that value into a here-string, which is a mechanism for feeding a string as standard input to some command. Essentially, x <<<y does the same thing as echo y | x, with the important difference that the x command is run in the current shell instead of a subshell, so it can have side effects like setting variables.
    c. read is a shell builtin that by default reads a single line of input and assigns it to one or more shell variables.
    d. IFS is a parameter that tells the shell how to split lines up into words. Here we're setting it – only for the duration of the read command – to -. That tells read to split the line it reads on hyphens instead of whitespace; IFS=- read y m d <<<"2020-August-15" assigns "2020" to y, "August" to m, and "15" to d.

  3. The GNU version of date(1) has a -d parameter that tells it to display another date instead of the current one. It accepts a number of different formats itself, sadly not including "yyyy-Mon-dd", which is why I had to split the filename up with read. But it does accept "Mon dd, yyyy", so that's what I pass to it. +%F.pdf tells it that when it prints the date back out it should do so ISO-style as "yyyy-mm-dd", and append ".pdf" to the result. ("%F" is short for "%Y-%m-%d"; I could also have used -I instead of +anything and moved the .pdf outside the command expansion.)
  4. The call to date is wrapped in $(...) to capture its output, and that result is used as the second parameter to mv to rename the files.
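
The same pieces can be rearranged so that the extension is not hard-coded. This is only a sketch: it assumes GNU date and names of the form YEAR-Month-DAY.EXT, and the function name iso_name is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print the ISO-dated form of a name such as
# 2020-August-15.pdf, keeping whatever extension the file has.
iso_name() {
    local y m d ext
    # Split on both "-" and "." so the extension lands in its own variable;
    # the last variable receives everything left over (e.g. "tar.gz").
    IFS=-. read -r y m d ext <<<"$1"
    # -u makes date work in UTC, so local daylight-saving rules cannot
    # affect how the date-only input is parsed.
    printf '%s.%s\n' "$(date -u -d "$m $d, $y" +%F)" "$ext"
}

iso_name 2020-August-15.pdf    # prints 2020-08-15.pdf
iso_name 2021-October-15.txt   # prints 2021-10-15.txt
```

Inside the loop, that would become something like mv -- "$f" "$(iso_name "$f")".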

Mark Reed
2

If you are using GNU utilities and the Perl version of rename (not the util-linux version), you can build a one-liner quite easily:

rename "$(
  seq -w 1 12 |
  LC_ALL=C xargs -I@ date -d 1970-@-01 +'s/^(\d{4}-)%B(-\d{2}\.pdf)$/$1%m$2/;'
)" *.pdf

You can shorten it if you don't care about safety (or legibility)... :-)

rename "$(seq -f%.f/1 12|date -f- +'s/%B/%m/;')" *.pdf
jhnc
2

Another way with POSIX shell syntax (the date --date option itself is a GNU extension):

# Iterate over pattern that will exclude already renamed pdf files
for file in [0-9][0-9][0-9][0-9]-[!0-9]*.pdf
do
  # Remove echo if the result matches expectations
  echo mv -- "$file" "$(
    # Set field separator to - or . to split filename components
    IFS=-.
    # Transfer filename components into arguments using IFS
    set -- $file
    # Format numeric date string
    date --date "$3 $2 $1" '+%Y-%m-%d.pdf'
  )"
done
Léa Gris
  • So many questions. What does the ` -- ` do (not the one in the date command)? How can you do `$(... code...).$4` when the `code` bit inside the brackets is where the `$4` comes from (if I'm understanding the code correctly)? I think I understand the rest. – sylph Aug 28 '21 at 19:05
  • @sylph well, I should not have put $4 after the subshell forming the arguments ^^. Fixing this right now. For `set -- $file`, the `--` tells `set` to replace the positional parameters with the arguments that follow. – Léa Gris Aug 28 '21 at 19:48
  • And in `mv --` it's a safeguard in case you pass in a file name which starts with a hyphen. Quoting won't help there; `mv --randomfile elsewhere` will produce an "unknown option" error, whereas `mv -- --randomfile elsewhere` works because `--` unambiguously identifies the end of the option arguments. – tripleee Aug 29 '21 at 09:05
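
To see what set -- does with a custom IFS, here is a small sketch. The function name split_name is made up for illustration; its body runs in a subshell, so the IFS and globbing changes do not leak into the caller.

```shell
#!/bin/sh

# Hypothetical helper: split a name like 2020-August-15.pdf on "-" and "."
# and print the number of fields followed by each field on its own line.
split_name() (
    IFS=-.
    set -f        # turn off globbing so the unquoted expansion only splits
    set -- $1     # the pieces become the positional parameters $1, $2, ...
    printf '%s\n' "$#" "$@"
)

split_name 2020-August-15.pdf
```

For the sample name this prints 4, then 2020, August, 15 and pdf, which is why the answer can refer to the parts as $1, $2 and $3.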
0

What I mean is I would prefer, and was trying to, pipe ls into a command combination to convert as specified above.

Well, you may need to implement that command combination then. Here’s one consisting of a single “command”, in pure Bash without external processes. Pipe your ls output into it and, once satisfied with the output, remove the final echo:

#!/bin/bash

declare -Ar MONTHS=(
  [January]=01
  [February]=02
  [March]=03
  [April]=04
  [May]=05
  [June]=06
  [July]=07
  [August]=08
  [September]=09
  [October]=10
  [November]=11
  [December]=12)

while IFS= read -r path; do
  IFS=- read -ra segments <<<"$path"
  segments[-2]="${MONTHS["${segments[-2]}"]}"
  IFS=- new_path="${segments[*]}"
  echo mv "$path" "$new_path"
done
Andrej Podzimek
0

What is working for me on macOS 12.5 with GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21) is the following:

for f in *.pdf; do mv -- "$f" "$(echo "$f" | sed -e 's/Jan/-01-/gi' -e 's/Feb/-02-/gi' -e 's/Mar/-03-/gi' -e 's/Apr/-04-/gi' -e 's/May/-05-/gi' -e 's/Jun/-06-/gi' -e 's/Jul/-07-/gi' -e 's/Aug/-08-/gi' -e 's/Sep/-09-/gi' -e 's/Oct/-10-/gi' -e 's/Nov/-11-/gi' -e 's/Dec/-12-/gi')"; done

Note that in my case the original file names had the month expressed in three letters:

./04351XXX73435-2021Mar08-2021Apr08.pdf