
I'm using regex in TextPad to clean up and prep SQL scripts. I want every line to end with a comma, which means:

  • replacing 1 or more spaces at the end of a line with a comma
  • adding a comma at the end of each line that has no trailing spaces, including the last line (i.e. "end of file")

After spending a few hours researching and iterating, I came up with the following, which is close to the desired result but not perfect. How do I edit it to get the desired result below?

Find what: ' +$|(\n)|$(?![\r\n])'

Replace with: '\1\2,'

I have data that looks like

       dog  *(2 spaces)*
        cat    *(4 spaces)*
        bird*(no space)*
       rodent *(1 space)*
      fish*(no space)*

I want the result to be

    dog,
    cat,
    bird,
    rodent,
    fish,

My result is

        dog,
         cat,
         bird
    ,     rodent,
         fish,
bobble bubble
daniellopez46

2 Answers


In TextPad or Notepad++, \s*$ will work, but it's worth mentioning that in another environment it can lead to undesired matches (regex101) and add an extra comma when there are spaces at the end of the line. The reason is that, for example, on the cat line it matches the spaces after cat (first match) and then a zero-length match at the end of the line (second match).
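As a small illustration of that double match, here is a sketch in Python, which is an assumption on my part since the question is about TextPad. It relies on Python 3.7 or later, where re.sub also replaces an empty match that directly follows a non-empty one. The helper name naive_append_comma is made up for this example.

```python
import re

# The pattern from the simple answer: any whitespace run, then end of string.
naive = re.compile(r"\s*$")

def naive_append_comma(line: str) -> str:
    """Replace every match of \\s*$ with a comma (hypothetical helper)."""
    return naive.sub(",", line)

# On "cat    " there are two matches: the four spaces, then an empty
# match at the very end of the string.
spans = [m.span() for m in naive.finditer("cat    ")]
print(spans)  # [(3, 7), (7, 7)]

with_spaces = naive_append_comma("cat    ")
without_spaces = naive_append_comma("bird")
print(repr(with_spaces))     # 'cat,,'  (one comma per match)
print(repr(without_spaces))  # 'bird,'  (only the empty match exists)
```

Lines without trailing whitespace come out fine; only lines that do have trailing whitespace pick up the second comma.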

Another potential issue with \s*$ is described in the blog post "The RegEx that killed StackOverflow".

If there are many spaces inside the text, it can lead to a lot of backtracking (regex101 demo). The demo input needs about 7k steps just to remove some spaces at the end. A workaround that reduces the steps is to consume and capture everything up to the last non-whitespace character (if there is one).

^(.*\S)?\s*$

Replace with $1, (regex101 demo), that is, the text captured by the first group followed by a comma. The group is optional so that lines containing only whitespace still match. This gets the demo down to a bit more than 100 steps.
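A minimal sketch of the same replacement outside the editor, again assuming Python: its replacement syntax uses \1 where TextPad and regex101 use $1, and since Python 3.5 an unmatched optional group expands to an empty string. The lines are processed one at a time so that \s never reaches across a line break. The name append_comma and the sample list (taken from the question's data) are only for illustration.

```python
import re

# Capture everything up to the last non-whitespace character (optional),
# then consume whatever whitespace is left before the end of the line.
pattern = re.compile(r"^(.*\S)?\s*$")

def append_comma(line: str) -> str:
    """Strip trailing whitespace and append a single comma (hypothetical helper)."""
    return pattern.sub(r"\1,", line)

# Sample lines from the question, with their trailing spaces.
sample = [
    "       dog  ",
    "        cat    ",
    "        bird",
    "       rodent ",
    "      fish",
]

cleaned = [append_comma(line) for line in sample]
for line in cleaned:
    print(repr(line))
```

Because the pattern is anchored with ^ and consumes the whole line in one match, there is no room left for a second, empty match, so every line gets exactly one comma.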

bobble bubble

I think you're overcomplicating it. Just match any number of spaces at the end of the line, and replace with comma.

Find: \s*$ Replace with: ,

Barmar
  • Just to mention, this [can be inefficient](https://mamchenkov.net/wordpress/2016/07/21/the-regex-that-killed-stackoverflow/) and, depending on the language/tool, result in [undesired matches](https://regex101.com/r/qbbXgt/1). An idea to prevent it can be to [replace `^(.*\S)?\s*$` with `$1`](https://regex101.com/r/qbbXgt/3) (capturing up to the last non-whitespace). – bobble bubble May 26 '23 at 10:57
  • You should post that as a new answer, with a full explanation of why mine fails and this is better. – Barmar May 26 '23 at 14:15