
I'm trying to shorten a filename while preserving the extension.

I think cut may be the best tool to use, but I'm not sure how to preserve the file extension.

For example, I'm trying to rename abcdefghijklmnop.txt to abcde.txt

I'd like to simply lop off the end of the filename so that the total character length doesn't exceed [in this example] 5 characters.

I'm not concerned with filename clashes because my dataset likely won't contain any, and anyway I'll do a find, analyze the files, and test before I rename anything.


The background for this is ultimately that I want to mass truncate filenames that exceed 135 characters so that I can rsync the files to an encrypted share on a Synology NAS.

I found a good way to search for all filenames that exceed 135 characters: find . -type f | awk -F'/' 'length($NF)>135{print $0}'

And I'd like to pipe that to a simple cut command to trim the filename down to size. Perhaps there is a better way than this. I found a method to shorten filenames while preserving extensions, but I need to recurse through all sub-directories.

Any help would be appreciated, thank you!

Update for clarification:

I'd like to use a one-liner with a syntax like this:

find . -type f | awk -F'/' 'length($NF)>135{print $0}' | some_code_here_to_shorten_the_filename_while_preserving_the_extension

john
  • Is there any worry that your method of shortening might result in non-unique file names? – jas Mar 14 '20 at 19:33
  • Great question @jas, no I'm not worried about that. With my dataset, there's not a chance that duplicate names will be created. And anyway, I will find all of the long filenames in advance and analyze them before running the script. – john Mar 14 '20 at 21:01
  • Can you edit your question to add an explanation and/or pseudocode showing how you want to rename the files? Do you literally just want to take the first five characters of the base name + the extension? – jas Mar 14 '20 at 21:25
  • Thanks @jas I updated the question. I'm trying to make a one-liner like this: `find . -type f -iname "*" |awk -F'/' 'length($NF)>135{print $0}' | some_code_here` so that it recurses through the sub-directories and lops off the ends of all long filenames. – john Mar 14 '20 at 21:35
  • Note that `-iname "*"` has no effect; if you're not actually filtering by filename at all, you can just write `find -type f`. – ruakh Mar 14 '20 at 22:00
  • Thank you @ruakh -- I'll update my question. – john Mar 14 '20 at 22:38
  • I think I've botched this by not asking my question clearly enough... – john Mar 15 '20 at 02:19
  • you got two answers, don't you think that it's a bit rude not to give any feedback to your answerers? – oguz ismail Mar 15 '20 at 06:07

3 Answers


With GNU find and bash:

export n=10 # change according to your needs
find . -type f                      \
     ! -name '.*'                   \
       -regextype egrep             \
     ! -regex '.*\.[^/.]{'"$n"',}'  \
       -regex '.*[^/]{'$((n+1))',}' \
       -execdir bash -c '
    echo "PWD=$PWD"
    for f in "${@#./}"; do
        ext=${f#"${f%.*}"}
        echo mv -- "$f" "${f:0:n-${#ext}}${ext}"
    done' bash {} +

This performs a dry run: it prints each folder, followed by the commands that would be executed in it. Once you're happy with the output, drop the echo before mv (and the echo "PWD=$PWD" line too, if you want), and it will actually rename every file whose name exceeds n characters to a name exactly n characters long, including the extension.

Note that this excludes hidden files, and files whose extensions are equal to or longer than n in length (e.g. .hidden, app.properties where n=10).
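
To make the renaming rule easier to test on its own, here is a minimal sketch of the same parameter expansions wrapped in a function. The name shorten_name and its two arguments are my own for illustration and are not part of the script above. It only prints the new name and never touches the file system. Unlike the find filters above, it leaves a name unchanged when the extension (dot included) is already max characters or longer.

```bash
# Hypothetical helper: print NAME cut down to at most MAX characters,
# extension included. Illustrative only; it does not rename anything.
shorten_name() {
    local name=$1 max=$2
    # Same trick as in the loop above: ".txt", or empty when there is no dot.
    local ext=${name#"${name%.*}"}
    # Already short enough, or no room left for a base name: print unchanged.
    if (( ${#name} <= max || ${#ext} >= max )); then
        printf '%s\n' "$name"
        return
    fi
    # Keep as many leading characters as fit in front of the extension.
    printf '%s\n' "${name:0:max-${#ext}}${ext}"
}

shorten_name abcdefghijklmnop.txt 9     # prints abcde.txt
shorten_name 'Governor Insists.htm' 10  # prints Govern.htm
shorten_name app.properties 10          # prints app.properties (unchanged)
```

Note that max is the total length, so the abcde.txt example from the question corresponds to max=9, not 5.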

oguz ismail
  • This runs beautifully -- although it did miss one file. I've examined the results and it seems that that file has an `“` in the filename. – john Mar 15 '20 at 13:59
  • I'm trying to create a one-liner I can use in the terminal instead of a script.. I'm getting close with this, but it doesn't quite work (assuming all extensions are a dot and three characters): `find . -type f | awk -F'/' 'length($NF)>135{print $0}' | xargs -0 rename -n 's/(.*\/)(.{131}).*([.].*$)/$1$2$3/'` – john Mar 15 '20 at 14:03
  • @john Well, you can easily turn this into a one-liner (a really long one), or you can save it as a script and use it directly. As for the failure you mentioned, I could take a look if you post the filename that causes the problem – oguz ismail Mar 15 '20 at 14:22
  • Thanks @oguz .. the filename in question is: `Governor Insists “No Insult Intended,” as “Blasphemy” Trial Begins.htm` I'm examining the filenames in SublimeText and the double quotes are slanted, as opposed to standard straight double quotes. Also, I'm getting very close with `find . -type f | awk -F/ 'length($NF)>135{print $0}' | sed -r 's_(.*/+)(.{0,131}).*([.].*$)_\1\2\3_'` but alas I'm getting confused with the pipes and arguments, so researching how to get an `mv` in there. Thank you for your kind patience and help. :-) – john Mar 15 '20 at 15:00
  • I created a file named like that and the script above renames it to Govern.htm successfully. I didn't understand your script though – oguz ismail Mar 15 '20 at 16:01
  • Thanks for trying it out. That's very strange, I'm on a debian 9 system in case that makes a difference. Hmm... – john Mar 15 '20 at 18:25
  • Each `“` is 3 bytes (e2 80 9c), so a byte-based tool will count it as 3. How many characters are you truncating to? You would need a UTF-8-aware `cut`-like tool. You could, for example, first replace `“` with `"`. – KamilCuk Mar 15 '20 at 18:34
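
A quick way to check the byte-versus-character point from the last comment is to count the bytes of a name directly. This is only a sketch: byte_len is a made-up helper name, and whether the limit on the target share is measured in bytes or in characters is an assumption you would have to verify. In bash, ${#var} counts characters according to the current locale, while wc -c always counts bytes.

```bash
# Hypothetical helper: print the number of bytes in a string.
byte_len() {
    # wc -c counts bytes regardless of locale; the arithmetic expansion
    # strips the padding that some wc implementations print.
    echo $(( $(printf '%s' "$1" | wc -c) ))
}

# The left curly quote, written in octal so the script itself stays ASCII
# (octal 342 200 234 is hex e2 80 9c).
quote=$(printf '\342\200\234')

byte_len "$quote"   # prints 3
byte_len abc        # prints 3
```

Comparing byte_len "$name" with ${#name} shows whether a filename contains multibyte characters.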

Update

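Since your files are spread across a tree of directories, you can use my original approach by passing the script to a bash command via the -exec option of find. Use bash rather than sh, because ${b:0:n} is a bash substring expansion, and pass the file name as an argument instead of embedding {} in the script, so that names containing spaces or quotes are handled safely. Files whose base name is already n characters or shorter are left alone:

n=5 find . -type f -exec bash -c 'f=$1; d=${f%/*}; b=${f##*/}; e=${b##*.}; b=${b%.*}; (( ${#b} > n )) && mv -- "$f" "$d/${b:0:n}.$e"' bash {} \;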

Original answer

If the filename is in a variable x, then ${x:0:5}.${x##*.} should do the job.

So you might do something like

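n=5 # or 135, or whatever you like
for f in *.*; do
  b=${f%.*}                      # base name without the extension
  (( ${#b} > n )) || continue    # skip names that are already short enough
  mv -- "$f" "${b:0:n}.${f##*.}"
done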

Clearly this assumes that there are no clashes between the shortened names. If there are clashes, then only one would survive! So be careful.
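
If clashes are a concern, one option is to refuse to rename when the target already exists. A minimal sketch follows; safe_rename is a hypothetical wrapper of my own, not something from the loop above, and the existence check followed by mv is not atomic.

```bash
# Hypothetical wrapper around mv: refuse to overwrite an existing target.
safe_rename() {
    src=$1
    dst=$2
    # -e misses dangling symlinks, so test -L as well.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        printf 'skipping %s: %s already exists\n' "$src" "$dst" >&2
        return 1
    fi
    mv -- "$src" "$dst"
}
```

Inside a renaming loop, the mv -- "$f" ... call could be replaced by safe_rename "$f" ..., so that a clash is reported on stderr instead of silently overwriting a file.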

Enlico
  • Sorry Enrico, I wasn't quite able to make your answer work. – john Mar 15 '20 at 15:01
  • @john, I have added a possible usecase. – Enlico Mar 15 '20 at 15:46
  • I tried `find . -type f -exec sh -c 'f={}; d=${f%/*}; b=${f##*/}; echo mv -- "$f" "$d/${b:0:135}.${b##*.}"' \;` on my test set, but it returned `sh: 1: Bad substitution` – john Mar 15 '20 at 18:09
  • @john, what if you use `bash` instead of `sh`? By the way, I forgot to remove `echo` (which I used to print the commands instead of executing them, just to be safe while making it up). – Enlico Mar 15 '20 at 18:10
  • If I use `bash` instead of `sh` then it shortens the filename (including the extension), and then it appends the extension afterwards. For example `aaaaa136.txt` becomes `aaaaa136.tx.txt` – john Mar 15 '20 at 18:22
  • @john, that's because `aaaaa136.txt` is already shorter than 135 characters. It should work now. – Enlico Mar 15 '20 at 18:30

Use bash string manipulation.
Details: https://www.linuxtopia.org/online_books/advanced_bash_scripting_guide/string-manipulation.html
(scroll to "Substring Extraction").

The example below cuts the filename to 10 characters while preserving the extension:

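~ % cat test

#!/bin/bash
rawFileName=$(basename "$1")
filename="${rawFileName%.*}"   # name without the extension
ext="${rawFileName##*.}"       # extension without the dot

# Compare numerically: inside [[ ]], < is a string comparison.
if (( ${#filename} > 10 )); then
    echo "${filename:0:10}.${ext}"
else
    echo "$1"
fi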

And tests:

~ % ./test 12345678901234567890.txt
1234567890.txt
~ % ./test 1234567.txt             
1234567.txt
Tomasz Czyżak