1

In Dynatrace, some URLs contain a word that is dynamic. I want to remove that dynamic word from each URL using a regex.

Below are the different URLs:

  • /aaa/fdsadx/drtyu/ab_cd/myword?Id=953
  • /asd/XXXXX/sadsa/two/xx_yy?Id=953
  • /asd/fdsadx/df/three/pp_qq/myword
  • /asd/fdsadx/sadsa/ab_cd
  • /SSS/fdsadx/cvnm/forth/gg_hh

Expected output

  • /asd/fdsadx/sadsa//myword?Id=953
  • /asd/fdsadx/sadsa/?Id=953
  • /asd/fdsadx/sadsa//myword
  • /asd/fdsadx/sadsa/

I've managed to come up with this regex:

(\S+?)ab_cd(.*)

But it's not working for dynamic values or for all of the URLs. How can I improve the regex to remove the dynamic value?

Nitin D
  • 49
  • 1
  • 5

2 Answers

2

You could use two capturing groups and match the underscore-separated part that follows a forward slash:

^(\S+/)[^\s_]+_[^\s_/?]+(.*)
  • ^ Start of string
  • (\S+/) Capture group 1: match 1+ non-whitespace chars followed by /
  • [^\s_]+ Match 1+ times any char except a whitespace char or _
  • _ Match an underscore literally
  • [^\s_/?]+ Match 1+ times any char except a whitespace char, _, / or ?
  • (.*) Capture group 2: match 0+ times any char except a newline

Regex demo

In the replacement, use the two capturing groups, for example $1$2.
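
As a rough sketch of how that replacement behaves outside Dynatrace, the snippet below applies the same pattern with Python's re module to the five URLs from the question. The helper name remove_dynamic_segment is only an illustrative choice, and Python writes the group references as \1\2 rather than $1$2; the replacement syntax in Dynatrace itself may differ.

```python
import re

# Pattern from the answer: group 1 is the path up to and including a slash,
# the underscore-separated segment is matched but not captured,
# and group 2 is whatever follows that segment.
PATTERN = re.compile(r'^(\S+/)[^\s_]+_[^\s_/?]+(.*)')


def remove_dynamic_segment(url: str) -> str:
    """Hypothetical helper: drop the xx_yy style segment and keep the rest."""
    return PATTERN.sub(r'\1\2', url)


# Sample URLs taken from the question.
urls = [
    '/aaa/fdsadx/drtyu/ab_cd/myword?Id=953',
    '/asd/XXXXX/sadsa/two/xx_yy?Id=953',
    '/asd/fdsadx/df/three/pp_qq/myword',
    '/asd/fdsadx/sadsa/ab_cd',
    '/SSS/fdsadx/cvnm/forth/gg_hh',
]

for url in urls:
    print(url, '->', remove_dynamic_segment(url))
```

Because \S+ is greedy, group 1 extends to the last slash that is still followed by an underscore-separated segment, so the varying prefix does not need to be spelled out. The slash after the removed segment is kept, which is why a double slash can appear in the result.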

If you want to match country codes and you know that they consist, for example, only of the chars a-zA-Z, you could make the character class more specific:

^(\S+/)[A-Za-z]+_[A-Za-z]+(.*)

Regex demo
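
To illustrate what the narrower character class changes, here is a small Python comparison of the two patterns. The URL ending in item_42 is an invented example, not one from the question, and the names GENERAL, LETTERS_ONLY and strip_segment are hypothetical.

```python
import re

# The two patterns from this answer.
GENERAL = re.compile(r'^(\S+/)[^\s_]+_[^\s_/?]+(.*)')
LETTERS_ONLY = re.compile(r'^(\S+/)[A-Za-z]+_[A-Za-z]+(.*)')


def strip_segment(pattern: re.Pattern, url: str) -> str:
    """Hypothetical helper: remove the matched segment, keep groups 1 and 2."""
    return pattern.sub(r'\1\2', url)


from_question = '/asd/fdsadx/sadsa/ab_cd'
invented = '/asd/fdsadx/sadsa/item_42'  # assumption: segment containing digits

for url in (from_question, invented):
    print(url)
    print('  general     ->', strip_segment(GENERAL, url))
    print('  letters only ->', strip_segment(LETTERS_ONLY, url))
```

Both patterns remove ab_cd, but only the general one removes item_42. The letters-only pattern finds no match there and leaves the URL unchanged, which is the desired outcome if such segments are not dynamic values.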

The fourth bird
  • 154,723
  • 16
  • 55
  • 70
1

It seems that the first portion is fixed, and you're trimming everything after a '/' or '?'. Given that, perhaps you want something like:

s/(\/asd\/fdsadx\/sadsa\/)[^/?]+(.*)/\1\2/

This will capture the head in \1, skip a run of characters that are neither '/' nor '?', and capture the tail in \2.
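
For readers without sed at hand, a rough Python equivalent of that substitution might look like the sketch below. The helper name strip_after_fixed_head is hypothetical, and the first three inputs are assumed URLs built on the fixed /asd/fdsadx/sadsa/ prefix; only the last one is taken verbatim from the question.

```python
import re

# Same idea as the sed command: a fixed head in group 1, a run of characters
# that are neither '/' nor '?', and the tail in group 2.
FIXED_HEAD = re.compile(r'(/asd/fdsadx/sadsa/)[^/?]+(.*)')


def strip_after_fixed_head(url: str) -> str:
    """Hypothetical helper mirroring s/.../\\1\\2/ (first match only)."""
    return FIXED_HEAD.sub(r'\1\2', url, count=1)


samples = [
    '/asd/fdsadx/sadsa/ab_cd/myword?Id=953',  # assumed input
    '/asd/fdsadx/sadsa/xx_yy?Id=953',         # assumed input
    '/asd/fdsadx/sadsa/ab_cd',                # from the question
    '/aaa/fdsadx/drtyu/ab_cd/myword?Id=953',  # from the question, other prefix
]

for url in samples:
    print(url, '->', strip_after_fixed_head(url))
```

The last sample comes back unchanged, since the pattern only matches when the URL contains the literal /asd/fdsadx/sadsa/ prefix.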

Gerb
  • 486
  • 2
  • 6
  • No, the first part also varies, e.g. /aaa/fdsadx/drtyu/ab_cd/myword?Id=953 /asd/XXXXX/sadsa/two/xx_yy?Id=953 /asd/fdsadx/df/three/pp_qq/myword /SSS/fdsadx/cvnm/forth/ab_cd – Nitin D Sep 30 '19 at 14:36