2

I have a string that follows this pattern: "abcd abcd | abcde | Degree SP | xyz abcd | abcd ABC"

I need to extract "Degree SP" using regular expressions. How can I do that? The conditions are:

  • the substring ends with "SP "
  • the substring starts after the last "|" that precedes it.

I'm trying the Google Sheets formula REGEXEXTRACT(<input string>, "[\|\s].+SP[\s\|]"). It returns " | abcde | Degree SP ". How can I restrict the match so that it starts after the last "|"?

Amir Hossain
  • 144
  • 10
  • Thanks for showing your efforts. Does `Degree SP` really have `**` before and after it? I have edited your question to change the quotes to code tags; kindly confirm, or edit your samples to make them clearer. Thank you. – RavinderSingh13 Nov 17 '21 at 06:34
  • 1
    Thank you @RavinderSingh13 for notifying me about the **, it was an error. I just edited the post and removed those. – Amir Hossain Nov 17 '21 at 13:09

2 Answers

2

If Degree SP must sit between two pipes, with a whitespace character separating it from each pipe:

\|\s([^\s|][^|]*SP)\s\|
  • \|\s Match | and a whitespace char
  • ( Capture group 1
    • [^\s|] Match a single char other than a space or |
    • [^|]*SP Match optional chars other than | and match SP
  • ) Close group 1
  • \s\| Match a whitespace char and |

Regex demo
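
To check the pattern outside Google Sheets, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way as the RE2 engine behind REGEXEXTRACT (the pattern uses no engine-specific features); `SAMPLE`, `BETWEEN_PIPES` and `extract_between_pipes` are illustrative names, not part of the question.

```python
import re

# Sample string from the question.
SAMPLE = "abcd abcd | abcde | Degree SP | xyz abcd | abcd ABC"

# Pipe + whitespace, capture text that ends in SP, then whitespace + pipe.
BETWEEN_PIPES = re.compile(r"\|\s([^\s|][^|]*SP)\s\|")


def extract_between_pipes(text):
    """Return capture group 1, or None when the pattern does not match."""
    match = BETWEEN_PIPES.search(text)
    return match.group(1) if match else None


print(extract_between_pipes(SAMPLE))  # Degree SP
```

In Google Sheets the same pattern would be used as `=REGEXEXTRACT(A1, "\|\s([^\s|][^|]*SP)\s\|")`, assuming the input is in A1. Because the pattern contains a capture group, REGEXEXTRACT returns the group rather than the whole match.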


If only the pipe after Degree SP is mandatory:

([^\s|][^|]*SP)\s*\|

Regex demo
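
A short Python sketch of the relaxed variant, under the same assumption that Python's `re` matches this pattern the way REGEXEXTRACT does. The second input string is made up for illustration: it shows a case where the target is the first field, so no pipe precedes it.

```python
import re

# Only the pipe AFTER the target is required; leading pipe is optional.
BEFORE_PIPE = re.compile(r"([^\s|][^|]*SP)\s*\|")


def extract_before_pipe(text):
    """Return the text ending in SP that precedes a pipe, or None."""
    match = BEFORE_PIPE.search(text)
    return match.group(1) if match else None


# Sample string from the question.
print(extract_before_pipe("abcd abcd | abcde | Degree SP | xyz abcd | abcd ABC"))  # Degree SP

# Hypothetical input where the target is the first field.
print(extract_before_pipe("Degree SP | xyz abcd"))  # Degree SP
```

Since `[^|]*` cannot cross a pipe, the match always stays inside a single field, which is what keeps it from starting at an earlier "|".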


The fourth bird
  • 154,723
  • 16
  • 55
  • 70
  • Thank you @The fourth bird, your solution is just awesome. It solved the problem and saved me a ton of time. I have another problem related to this one. The string pattern is " | | | Word1 Word2 | | ". In this case the pattern is static up to the target area "Word1 Word2". I tried "^.*?\|.*?\|.*?\|.*?([^\\|]\S+)" and got "Word1", but I need "Word2". Sometimes the target area contains only one word. – Amir Hossain Nov 17 '21 at 13:18
  • @Swopno If you want to capture the last word between the pipes `=REGEXEXTRACT(A1,"^[^|\n]*\|[^|\n]*\|[^|\n]*\|(?:\s*([^|\s]+))+")` See this [regex demo](https://regex101.com/r/cahlfU/1) – The fourth bird Nov 17 '21 at 13:31
  • 1
    Thank you. Actually, I was also looking for the option (tick mark). – Amir Hossain Nov 17 '21 at 13:43
1

With your shown samples, please try the following regex.

^.*?\s+\S+\s+\|\s+\S+\s+\|\s+([^\\|]*)\s+\|.*$

Online demo for above regex

Or, if you want to capture the value between the 2nd and 3rd occurrences of | that ends with SP, try the following regex:

^.*?\s+\S+\s+\|\s+\S+\s+\|\s+([^\\|]*SP)\s+\|.*$

Online demo for above regex

Explanation: a detailed breakdown of the first regex above.

^.*?\s+\S+\s+  ##Lazy match from the start of the value up to the 1st occurrence of 1 or more spaces, followed by 1 or more non-spaces, followed by 1 or more spaces.
\|\s+\S+\s+\|  ##Match a literal |, then spaces, then 1 or more non-spaces, then spaces, then another literal |.
\s+            ##Match 1 or more spaces.
([^\\|]*)      ##The one and only capturing group: everything up to the next | (here, the Degree SP value from the OP's samples).
\s+\|.*$       ##Match 1 or more spaces followed by | and everything up to the end of the value/line.
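
A minimal Python sketch of the SP variant, assuming Python's `re` handles this pattern like the engine behind REGEXEXTRACT; `POSITIONAL` and `extract_third_field` are illustrative names. The second input is a made-up string with fewer pipes, included to show that this approach depends on the position of the field.

```python
import re

# Skip the first two pipe-delimited fields, then capture the third one
# (ending in SP) up to the following pipe.
POSITIONAL = re.compile(r"^.*?\s+\S+\s+\|\s+\S+\s+\|\s+([^\\|]*SP)\s+\|.*$")


def extract_third_field(text):
    """Return the captured third field, or None if the layout differs."""
    match = POSITIONAL.match(text)
    return match.group(1) if match else None


# Sample string from the question: the target is the third field.
print(extract_third_field("abcd abcd | abcde | Degree SP | xyz abcd | abcd ABC"))  # Degree SP

# Hypothetical input with only two pipes: the pattern needs three, so no match.
print(extract_third_field("abcd | Degree SP | xyz"))  # None
```

Because the pattern hard-codes the number of pipes before the target, it only fits inputs with the same layout as the shown sample.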
RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
  • Thank you for your answer. But the string pattern varies depending on the situation: the number of "|" occurrences differs from case to case. Actually, I need a solution that locates the target based on the string "SP |". Hope that makes sense. – Amir Hossain Nov 17 '21 at 07:44
  • 1
    @Swopno, OK, could you please try the following regex `\b(\S+\s+SP)\s+\|` and let me know if it helps? – RavinderSingh13 Nov 17 '21 at 08:09