I want to exclude any files ending with '.ses', and any files with no extension, using the following regex pattern. It works fine on the command line but not in a shell script (bash/ksh).

Regex pattern: "\.(?!ses\$)([^.]+\$)"

File name examples:

"/test/path/test file with spaces.__1" (expected true)
"/test/path/test file with spaces.ses" (expected false)
"/test/path/test file with spaces" (expected false)
"/test/path/test file with spaces.txt" (expected true)
FILE_NAME="/test/path/test file with spaces.__1"
PATTERN_STR="\.(?!ses\$)([^.]+\$)"

if [[ "${FILE_NAME}" =~ ${PATTERN_STR} ]]; then
    Match_Result="true"
else
    Match_Result="false"
fi

echo "$Match_Result"

On the command line it returns "true", but in the shell it returns "false". Does anyone know why?

Gill
  • Which shell are you using? Is your script and your cli using the same shell? – Allan Wind Nov 23 '21 at 03:25
  • I tested your script in my shell and I got an error for using double quotes. The problem is the "!" sign in your string, this must be in single quotes. Try changing PATTERN_STR="\\\.(?!ses\$)([^.]+\$)" to PATTERN_STR='\\\.(?!ses\$)([^.]+\$)'. What I did was change the double quotes " " to single quotes ' '. – Edgar Magallon Nov 23 '21 at 03:29
  • `(?!...)` is PCRE. Bash implements "extended regular expressions" like grep -E. What was the "command line" command you were using? – glenn jackman Nov 23 '21 at 03:58
  • @Gill : I don't know what kind of regexp is accepted by ksh, but it does not look like a meaningful bash regexp. Can you demonstrate that it would work inside your script? For instance, in your question I don't see a complete script, nor do I see how you have invoked the script. – user1934428 Nov 23 '21 at 05:58
  • The correct regex would look like this: `[[ "${FILE_NAME##*/}" =~ \.ses$|^[^.]+$ ]] && exclude=yes`. You need to strip any leading path. But I would generally use a case statement, like Allan's answer. It's a lot clearer. – dan Nov 23 '21 at 06:21
  • Thanks everyone. ${PATTERN_STR} is passed as a parameter so it's a bit inflexible. I like Allan's answer if the case statement can be constructed using input parameters maybe using eval? – Gill Nov 25 '21 at 01:34
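The PCRE-versus-ERE point raised in the comments can be probed directly. The following is only a sketch: the variable names are made up for illustration, and it assumes bash, whose manual states that `[[ ... =~ ... ]]` returns status 2 when the regular expression is syntactically invalid. Whether the lookahead pattern yields 1 or 2 depends on the platform's regex library, so the script reports the status instead of assuming one.

```bash
#!/usr/bin/env bash
# Hypothetical variable names; the first pattern is the one from the question.
pcre_pattern='\.(?!ses$)([^.]+$)'   # uses a PCRE lookahead, which ERE lacks
ere_pattern='\.[^./]+$'             # plain ERE: a dot followed by an extension
file_name='/test/path/test file with spaces.__1'

# Status convention for [[ =~ ]]: 0 = match, 1 = no match, 2 = invalid regex.
if [[ $file_name =~ $pcre_pattern ]]; then
  pcre_status=0
else
  pcre_status=$?
fi

if [[ $file_name =~ $ere_pattern ]]; then
  ere_status=0
else
  ere_status=$?
fi

echo "PCRE-style pattern status: $pcre_status"
echo "ERE pattern status: $ere_status"
```

A non-zero status for the first pattern is consistent with the "false" seen in the question: the lookahead is never interpreted as a lookahead by bash.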

2 Answers

I would just use a case statement with suitable globs:

case "${FILE_NAME##*/}" in
*.ses)
    Match_Result=false
    ;;
*.*)
    Match_Result=true
    ;;    
*)
    Match_Result=false
    ;;
esac

Consider using an array instead of doing whitespace gymnastics.
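As a rough illustration of both suggestions, the case statement can be wrapped in a function and driven from an array. This is a sketch only: `has_wanted_extension`, `files` and `results` are hypothetical names, and the sample paths are the ones from the question.

```bash
#!/usr/bin/env bash
# has_wanted_extension is a hypothetical helper name, not part of the answer.
# It returns 0 (success) for names to keep and 1 for names to exclude.
has_wanted_extension() {
  local base=${1##*/}     # strip the leading directory part
  case "$base" in
    *.ses) return 1 ;;    # explicitly excluded extension
    *.*)   return 0 ;;    # any other extension
    *)     return 1 ;;    # no extension at all
  esac
}

# An array keeps each path intact, spaces included.
files=(
  "/test/path/test file with spaces.__1"
  "/test/path/test file with spaces.ses"
  "/test/path/test file with spaces"
  "/test/path/test file with spaces.txt"
)

results=()
for f in "${files[@]}"; do
  if has_wanted_extension "$f"; then
    results+=("true")
  else
    results+=("false")
  fi
done

printf '%s\n' "${results[@]}"
```

Returning an exit status instead of setting a global lets the function be used directly in `if` and `&&` chains.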

Allan Wind
  • @JohnKugelman fixed with a hack. – Allan Wind Nov 23 '21 at 03:29
  • Good answer, but you should strip any leading path. `FILE_NAME` was a full path, and OP is interested in file names only. `${FILE_NAME##*/}` or `$(basename "$FILE_NAME")` – dan Nov 23 '21 at 06:25
  • @dan updated as suggested. – Allan Wind Nov 23 '21 at 06:29
  • Thanks @AllanWind for alternative solution. Is it possible to construct case statements using input parameters like using eval? – Gill Nov 25 '21 at 03:51
  • Yes, both the case word and the pattern are pattern expanded. See also `shopt extglob`, which enables `!(pattern-list)` to match anything except one of the given patterns. – Allan Wind Nov 25 '21 at 04:46
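A sketch of what the last comment describes, with the glob passed in as a parameter and no `eval` involved. It assumes bash with `extglob` enabled; `matches_glob` and `include_pattern` are hypothetical names, and the particular pattern is one possible way to say "has an extension other than ses", not something taken from the thread.

```bash
#!/usr/bin/env bash
shopt -s extglob   # lets case understand !(pattern-list)

# matches_glob is a hypothetical helper: $1 is a path, $2 is a glob pattern.
matches_glob() {
  local base=${1##*/} pattern=$2
  case "$base" in
    $pattern) return 0 ;;   # unquoted on purpose so the value acts as a glob
    *)        return 1 ;;
  esac
}

# "anything, a dot, then a remainder that is neither 'ses' nor ends in '.ses'"
include_pattern='*.!(ses|*.ses)'

for f in \
  "/test/path/test file with spaces.__1" \
  "/test/path/test file with spaces.ses" \
  "/test/path/test file with spaces" \
  "/test/path/test file with spaces.txt"
do
  if matches_glob "$f" "$include_pattern"; then
    echo "true"
  else
    echo "false"
  fi
done
```

The extra `*.ses` alternative inside `!(...)` is there so that a name such as `a.b.ses` is not accepted by splitting at the first dot.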

You can reverse your logic and fail all strings that contain .ses at the end or do not contain a dot after the last /.

Then, you can use this script:

#!/bin/bash
declare -a arr=("/test/path/test file with spaces.__1"
"/test/path/test file with spaces.ses"
"/test/path/test file with spaces"
"/test/path/test file with spaces.txt")
# true false false true
PATTERN_STR='(/[^/.]+|\.ses)$'
for FILE_NAME in "${arr[@]}"; do
  if [[ "$FILE_NAME" =~ $PATTERN_STR ]]; then
    Match_Result="false"
  else
    Match_Result="true"
  fi
  echo "$Match_Result"
done

Output:

true
false
false
true

Details:

  • ( - start of a capturing group:
    • /[^/.]+ - / and then one or more chars other than / and .
  • | - or
    • \.ses - .ses
  • ) - end of grouping
  • $ - end of a string.

Case-insensitive matching can be switched on with shopt -s nocasematch and off again with shopt -u nocasematch (see Case insensitive regex in Bash).
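To make the effect of `nocasematch` concrete, here is a small sketch using the same PATTERN_STR. The file name `REPORT.SES` and the variables `sensitive` and `insensitive` are invented for illustration.

```bash
#!/usr/bin/env bash
PATTERN_STR='(/[^/.]+|\.ses)$'
name='/test/path/REPORT.SES'   # hypothetical upper-case example

# Default behaviour: matching is case sensitive, so ".SES" is not excluded.
if [[ $name =~ $PATTERN_STR ]]; then sensitive="excluded"; else sensitive="kept"; fi

# With nocasematch set, =~ ignores case and ".SES" is treated like ".ses".
shopt -s nocasematch
if [[ $name =~ $PATTERN_STR ]]; then insensitive="excluded"; else insensitive="kept"; fi
shopt -u nocasematch           # restore the default for the rest of the script

echo "case sensitive:   $sensitive"
echo "case insensitive: $insensitive"
```

Unsetting the option straight after the test keeps it from silently affecting later `case` and `[[ ]]` commands.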

Wiktor Stribiżew
  • Thank you Wiktor, it solved my problem! Just a question though, is it hard to get it working without reversing? – Gill Nov 25 '21 at 04:20
  • Good job including test data. Suggest removing the ! and swapping the two branches of the if statement. – Allan Wind Nov 25 '21 at 04:53
  • @Gill It is hard to do with a single POSIX ERE regex though it is possible. It might look like `/[^/.]+\.([^/s][^/][^/]|[^/][^/e][^/]|[^/][^/][^/s]|.{1,2}|.{4,})$`, see [demo](https://regex101.com/r/Ex2yo9/2). – Wiktor Stribiżew Nov 25 '21 at 07:57
  • Ahh, then it's a no-brainer, thanks! :) – Gill Nov 25 '21 at 10:48
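For completeness, the non-reversed ERE quoted in the comments can be run against the sample names from the question. This is a sketch: `KEEP_PATTERN`, `samples` and `keep_results` are hypothetical names, and the pattern is copied verbatim from the comment. Note that it begins with `/`, so it expects a path and not a bare file name.

```bash
#!/usr/bin/env bash
# Pattern copied from the comment; a match means the file should be KEPT.
KEEP_PATTERN='/[^/.]+\.([^/s][^/][^/]|[^/][^/e][^/]|[^/][^/][^/s]|.{1,2}|.{4,})$'

samples=(
  "/test/path/test file with spaces.__1"
  "/test/path/test file with spaces.ses"
  "/test/path/test file with spaces"
  "/test/path/test file with spaces.txt"
)

keep_results=()
for name in "${samples[@]}"; do
  if [[ $name =~ $KEEP_PATTERN ]]; then
    keep_results+=("true")    # has an extension other than "ses"
  else
    keep_results+=("false")   # ".ses" or no extension
  fi
done

printf '%s\n' "${keep_results[@]}"
```

The three-character alternatives spell out "any three characters that differ from ses in at least one position", which is why the reversed pattern in the answer is so much easier to read.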