22

I have a directory (with subdirectories) in which I want to find all files that have a ".ipynb" extension. But I want the 'find' command to return just these filenames, without the extension.

I know the first part:

find . -type f -iname "*.ipynb" -print    

But how do I then get the names without the "ipynb" extension? Any replies greatly appreciated...

Siavosh Mahboubian

9 Answers

29

To return only filenames without the extension, try:

find . -type f -iname "*.ipynb" -execdir sh -c 'printf "%s\n" "${0%.*}"' {} ';'

or (omitting -type f from now on):

find "$PWD" -iname "*.ipynb" -execdir basename {} .ipynb ';'

or:

find . -iname "*.ipynb" -exec basename {} .ipynb ';'

or:

find . -iname "*.ipynb" | sed "s/.*\///; s/\.ipynb$//"

However, invoking basename once per file can be inefficient, so @CharlesDuffy's suggestion is:

find . -iname '*.ipynb' -exec bash -c 'printf "%s\n" "${@%.*}"' _ {} +

or:

find . -iname '*.ipynb' -execdir basename -s '.ipynb' {} +

Using + means that we're passing multiple files to each bash instance, so if the whole list fits into a single command line, we call bash only once.


To print full path and filename (without extension) in the same line, try:

find . -iname "*.ipynb" -exec sh -c 'printf "%s\n" "${0%.*}"' {} ';'

or:

find "$PWD" -iname "*.ipynb" -print | sed 's/\.[^./]*$//'

To print full path and filename on separate lines:

find "$PWD" -iname "*.ipynb" -exec dirname "{}" ';' -exec basename "{}" .ipynb ';'
eastriver lee
kenorb
  • Applying `basename` would also throw away the directory component. – user1934428 Feb 01 '17 at 07:39
  • I believe this is what is asked for, list only filenames without extension. – kenorb Feb 01 '17 at 10:07
  • Executing `basename` once per file seems rather inefficient. `find . -name '*.ipynb' -exec bash -c 'printf "%s\n" "${@%.*}"' _ {} +` would just invoke one shell per batch of files, so considerably less overhead. – Charles Duffy Aug 01 '17 at 15:14
  • @CharlesDuffy Added into the list. Wouldn't invoking bash on each file be inefficient as well? Or it's executed on all files because of `+`? – kenorb Aug 01 '17 at 15:21
  • 1
    @kenorb, the `+` means we're passing multiple files to each bash instance -- if the whole list fits into a single command line, we call bash only once. – Charles Duffy Aug 01 '17 at 15:38
  • @CharlesDuffy What happens if the whole list cannot fit into a single command? – IMTheNachoMan Apr 22 '19 at 03:22
  • 1
    @IMTheNachoMan, ...in that case, `-exec ... {} +` runs the command multiple times (each with a subset of the file list), just as `xargs` does. – Charles Duffy Apr 22 '19 at 16:37
13

Here's a simple solution:

find . -type f -iname "*.ipynb" | sed 's/\.ipynb$//1'
Vercingatorix
  • There's no need for the `/1`, as the pattern cannot match more than once (assuming no embedded newlines in filenames). – Toby Speight Mar 12 '18 at 11:54
  • 2
    I used this one since it doesn't fork a process like bash or basename for each file. A bit custom, but faster. – Pysis Jul 09 '19 at 01:56
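One caveat with the sed approach: -iname matches case-insensitively, so a file named X.IPYNB is found, but s/\.ipynb$// leaves it unchanged. A possible variant, sketched below, removes whatever the final extension is instead. The function name strip_notebook_ext is hypothetical, and like the answer it assumes filenames contain no newlines.

```shell
# Hypothetical helper: list matching files with the final extension removed,
# whatever its letter case, using a single sed process for the whole stream.
strip_notebook_ext() {
    # $1 is the directory to search; defaults to the current directory.
    # The regex removes the last dot and everything after it, provided no
    # slash or further dot follows, so directory names are left untouched.
    find "${1:-.}" -type f -iname '*.ipynb' | sed 's/\.[^./]*$//'
}
```

Because find has already filtered on the extension, removing "the last extension" and removing ".ipynb" come to the same thing here.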
5

I found this bash one-liner, which simplifies the process by not using find (note that it only looks in the current directory, not in subdirectories):

for n in *.ipynb; do echo "${n%.ipynb}"; done
jjisnow
1

If you need the name with its directory but without the extension:

find . -type f -iname "*.ipynb" -exec sh -c 'f=$(basename $1 .ipynb);d=$(dirname $1);echo "$d/$f"' sh {} \;
V. Michel
  • It would be more correct to quote your expansions: `f=$(basename "$1" .ipynb);d=$(dirname "$1"); echo "$d/$f"` -- that way filenames with whitespace or glob characters are less prone to being problematic. – Charles Duffy Aug 24 '20 at 20:40
  • That said, this is pretty inefficient right now -- for each file, you're starting a new copy of `sh`, having it spawn a subshell and run the non-builtin program `/bin/basename` within it, and then another subshell invoking `/bin/dirname`. Using `-exec ... {} +` would let you share a single copy of `sh` across multiple filenames (though you'd need to iterate over them instead of hardcoding `$1`); even better would be to stream all your names through a single subprocess that does the work, with _no_ new per-name subprocesses being started at all. – Charles Duffy Aug 24 '20 at 20:42
0

find . -type f -iname "*.ipynb" | grep -oP '.*(?=[.])'

The -o flag outputs only the matched part. The -P flag matches according to Perl regular expressions. This is necessary to make the lookahead (?=[.]) work.

user1934428
0

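Perl one-liner

To print the names that match:
find . | perl -a -F/ -lne 'print $F[-1] if /.*.ipynb/g'

To print the names that do not match:
find . | perl -a -F/ -lne 'print $F[-1] if !/.*.ipynb/g'

NOTE
Perl matches with a regular expression rather than a shell glob, so the glob * is written as .* and the pattern becomes .*.ipynb. These commands print the last path component with the extension still attached.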

Shakiba Moshiri
0

If you don't know what the extension is, or there are several, you could use this:

find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'

And for a list of names with no duplicates (files that originally differed only in path or extension):

find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'|sort|uniq
Diego
0

Another easy way which uses basename is:

find . -type f -iname '*.ipynb' -exec basename -s '.ipynb' {} +

Using + will reduce the number of invocations of the command (manpage):

-exec command {} +

This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of '{}' is allowed within the command, and (when find is being invoked from a shell) it should be quoted (for example, '{}') to protect it from interpretation by shells. The command is executed in the starting directory. If any invocation with the `+' form returns a non-zero value as exit status, then find returns a non-zero exit status. If find encounters an error, this can sometimes cause an immediate exit, so some pending commands may not be run at all. For this reason -exec my-command ... {} + -quit may not result in my-command actually being run. This variant of -exec always returns true.

Using -s makes basename accept multiple filenames and remove the specified suffix from each (manpage):

-a, --multiple

support multiple arguments and treat each as a NAME

-s, --suffix=SUFFIX

remove a trailing SUFFIX; implies -a

bentocin
-1

If the string ".ipynb" never occurs in any file name other than as a suffix, then you can try this simpler way using tr:

find . -type f -iname "*.ipynb" -print | tr -d ".ipbyn"
niglesias
  • Simplest answer most useful most of the time. – Leo Oct 02 '20 at 06:17
  • 1
    This is a bad answer because `tr` doesn't care if the characters are in order; it will delete all occurrences of any single one of those characters. Example: `echo snipsnap | tr -d ".ipbyn"` => `ssa` – stefansundin Dec 09 '22 at 00:53
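To make the difference concrete, here is a small side-by-side sketch. The two function names are made up for the comparison; the inputs are plain strings, so no files are needed.

```shell
# tr -d treats its argument as a set of characters and deletes every one of
# them wherever it appears in the input.
with_tr() { printf '%s\n' "$1" | tr -d '.ipbyn'; }

# sed removes the literal string ".ipynb" only when it ends the line.
with_sed() { printf '%s\n' "$1" | sed 's/\.ipynb$//'; }

with_tr  'snipsnap.ipynb'    # prints: ssa
with_sed 'snipsnap.ipynb'    # prints: snipsnap
```

Any name containing one of the characters ".", "i", "p", "b", "y" or "n" is mangled by the tr version, which is why a suffix-anchored substitution is the safer choice.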