
What should the regular expression look like if I want to find and remove the part that begins with a dash (-), and everything after it, in the Article number column?

I'm using BBEdit to search and replace the strings in a tab-delimited CSV file (sample below).

"Article number"    "Name"  "Third Column"
"Shorts Artic"  "Swa..."    "2018-07-28"
"Shorts Artic-1"    "Swa..."    ""
"Shorts Artic-2-1"  "Swa..."    "https://test-domain.com/..."
"Shorts Artic-2-2-1"    "Sw..." ""
"Shorts Artic-2-2-2-2-1"    "Ba..." "-asd"
"Shorts Artic-2-2-2-2-2-1"  "Nus..."
"Shorts Artic-2-2-2-2-2-1-1"    "Lek.."
"0858-1"    "Jacket Blue.."
"0858-2-1"  "Jacket Re.."
"0858-2-2-1"    "Int..."
"0858-2-2-2-1"  "In..."
"0858-2-2-2-2-1"    "Int..."
"0858-2-2-2-2-2-1"  "Int..."
"0858-2-2-2-2-2-2-1"    "Int..."
"0858-2-2-2-2-2-2-2-1"  "In..."
"0858-2-2-2-2-2-2-2-2"  "In..."
"0858-2-2-2-2-2-2-2-1"  "In..." "6 107-124 cm"
"stl 31-35-1-1-1-1-2-2-2-1-1"   "In..."

"Shorts Artic-1" would turn into "Shorts Artic"

"Shorts Artic-2-2-1" would turn into "Shorts Artic"

"0858-2-2-2-2-2-2-2-2" would turn into "0858"

petsk

1 Answer


You can use this pattern:

("[a-zA-Z 0-9]+)(?:-\d)+(?=")
  • ("[a-zA-Z 0-9]+) Capturing group. Matches the opening ", followed by one or more letters, spaces or digits.
  • (?:-\d)+ Non-capturing group. Matches a - followed by a single digit, one or more times.
  • (?=") Positive lookahead. Asserts that a " follows, without consuming it.

Replace with:

 \1

You can try the pattern here.
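If you want to check the behaviour outside BBEdit, here is a minimal Python sketch of the same find-and-replace. The function name `strip_suffix` and the choice of sample lines are only for illustration, and it assumes Python's `re` module handles these constructs the same way BBEdit's grep does.

```python
import re

# First pattern from the answer: group 1 keeps the opening quote plus the
# base article number; the trailing "-<digit>" groups are matched and dropped.
PATTERN = re.compile(r'("[a-zA-Z 0-9]+)(?:-\d)+(?=")')


def strip_suffix(line: str) -> str:
    """Return the line with the dash suffix removed (hypothetical helper)."""
    return PATTERN.sub(r'\1', line)


# A few tab-delimited lines taken from the question's sample data.
samples = [
    '"Shorts Artic-1"\t"Swa..."\t""',
    '"0858-2-2-2-2-2-2-2-2"\t"In..."',
    '"stl 31-35-1-1-1-1-2-2-2-1-1"\t"In..."',
]

for line in samples:
    print(strip_suffix(line))
```

The first two lines come out as `"Shorts Artic"` and `"0858"`. The `stl 31-35-...` line is left unchanged, because `-\d` only accepts a single digit after each dash and `-35` has two.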


For the updated text file, you can use:

^"((?:[a-zA-Z]+ ?)+|[0-9]+)(?:-?\d)+(?=")

Replacing with:

"\1

You can try the pattern here.
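The same kind of check for the anchored pattern, again as a rough Python sketch rather than anything BBEdit-specific. `strip_suffix_v2` is a made-up name, and the pattern is applied line by line so that `^` anchors at the start of each record without needing a multiline flag. The letter class is written as `[a-zA-Z]`, since the range `A-z` would also admit a few punctuation characters.

```python
import re

# Second pattern: anchored to the start of the line, so only the first
# column can match. Group 1 holds the base article number WITHOUT the
# opening quote, which is why the replacement puts the quote back.
PATTERN_V2 = re.compile(r'^"((?:[a-zA-Z]+ ?)+|[0-9]+)(?:-?\d)+(?=")')


def strip_suffix_v2(line: str) -> str:
    """Apply the anchored pattern to one line (hypothetical helper)."""
    return PATTERN_V2.sub(r'"\1', line)


# Tab-delimited lines taken from the updated sample data.
samples = [
    '"Shorts Artic"\t"Swa..."\t"2018-07-28"',
    '"Shorts Artic-2-2-1"\t"Sw..."\t""',
    '"0858-2-2-2-2-2-2-2-2"\t"In..."',
    '"stl 31-35-1-1-1-1-2-2-2-1-1"\t"In..."',
]

for line in samples:
    print(strip_suffix_v2(line))
```

Because of the `^` anchor, the date in the third column is left alone. Two things to watch for: the `stl` line comes out as `"stl "` with a trailing space inside the quotes, because the optional space is part of group 1, and a purely numeric article number with no dash suffix (for example `"0858"`, which is not in the sample) would lose its last digit, since `-?` makes the dash optional.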

Paolo
  • Very close. But it's not matching "stl 31-35-1-1-1-1-2-2-2-1-1" and if there's a date in the second/third column it's matching the date "2018-07-28". I updated my sample data. – petsk Aug 16 '18 at 21:40
  • @petsk No problem, but I don't like leaving problems unsolved :P [See the second pattern in the updated answer](https://regex101.com/r/fb5rEY/4). – Paolo Aug 16 '18 at 21:58