
I am trying to match the gpgsig block using the regex below, but I ran into the error shown further down.

Is there any guidance on how to fix it?

import re

if __name__ == '__main__':
    log = '''
tree e76fa5ccd76492d843b6a4a06038d1c3b5aef6f8
parent 0d533a3a5fd51fd8c2x932832ef9ea91d0756c18
author firstname lastname <userid@company.com> 1676061999 -0800
committer firstname lastname <userid@company.com> 1676061999 -0800
gpgsig -----BEGIN SIGNED MESSAGE-----
 MIAGCSqGSIb3DQEHAqCAMIACAQExDzANBglghkgBZQMEAgEFADCABgkqhkiG9w0B
 BwEAAKCCAuswggLnMIICjKADAgECAhANVjmYTunVjjNs9EhuJ4YXMAoGCCqGSM49
 BAMCME0xKTAnBgNVBAMMIEFmorplIENvcnBvcmF0ZSBTaWduaW5nIEVDQyBDQSAx
 MRMwEQYDVQQKDApBcHBsZSBJbmMuMQswCQYDVQQGEwJVUzAeFw0yMzAyMDkxOTU2
 NTlaFw0yMzAzMDIyMDA2NTlaMDIxEzARBgNVBAoMCkFmorplIEluYy4xGzAZBgNV
 BAMMEmduYWtrYWxhQGFmorplLmNvbTBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IA
 BGwmvh7HYXCyerdERaLr+OOJ3AQxYNSfUorWkROO2xv/ra8yYGL/aBCYJSQUoYRY
 kY4GE90s8NAUwmQmsthdbFSjggFnMIIBYzAMBgNVHRMBAf8EAjAAMB8GA1UdIwQY
 MBaAFEJi3AGoy1MCpVzt8IjG9uFJdhE9MHMGCCsGAQUFBwEBBGcwZTAvBggrBgEF
 BQcwAoYjaHR0cDovL2NlcnRzLmFmorplLmNvbS9hY3NlY2NhMS5kZXIwMgYIKwYB
 BQUHMAGGJmh0dHA6Ly9vY3NwLmFmorplLmNvbS9vY3NwMDMtYWNzZWNjMTA0MB0G
 A1UdEQQWMBSBEmduYWtrYWxhQGFmorplLmNvbTAUBgNVHSUEDTALBgkqhkiG92Nk
 BBQwMgYDVR0fBCswKTAnoCWgI4YhaHR0cDovL2NybC5hcHBsZS5jb20vYWNzZWNj
 YTEuY3JsMB0GA1UdDgQWBBR1dRRNvQ/7RwRTorG97HmKR4xoJjAOBgNVHQ8BAf8E
 BAMCB4AwJQYDVR0gBB4wHDAMBgoqhkiG92NkBRQBMAwGCiqGSIb3Y2QFFAIwCgYI
 KoZIzj0EAwIDSQAwRgIhAPQ4IiaCG6V5A7u0lwbhJxyXHf9jN2IoqRLj7BlFo4Uv
 AiEAtJAekfgFoiE3h8ZZDgvhwRiwPJseo8GDfM0tb5DP0h8xggE3MIIBMwIBATBh
 ME0xKTAnBgNVBAMMIEFmorplIENvcnBvcmF0ZSBTaWduaW5nIEVDQyBDQSAxMRMw
 EQYDVQQKDApBcHBsZSBJbmMuMQswCQYDVQQGEwJVUwIQDVY5mE7p1Y4zbPRIbieG
 FzANBglghkgBZQMEAgEFAKBpMBgGCSqGSIb3DQEJAzELBgkqhkiG9w0BBwEwHAYJ
 KoZIhvcNAQkFMQ8XDTIzMDIxMDIwNDY1MFowLwYJKoZIhvcNAQkEMSIEIP8j8iYG
 Ggpc74AeVdxLkIArVBLw3+vw6/FVmGtNig+uMAkGByqGSM49AgEERjBEAiB0dBI3
 9c1b/bsStaT3blWb19ehQDt8J/NNov/TzSgEzAIgWvpSs/DZI7wmlHtIJ8HpmIp4
 +oNOu4kJJlhtUy9ZImUAAAAAAAA=
 -----END SIGNED MESSAGE-----

'''

pattern = "gpgsig -----BEGIN SIGNED MESSAGE------{3,}$(?s).*?^-{3,} -----END SIGNED MESSAGE-----"

if re.search(pattern,log):
    print ("Found a match")

Here is the error:

/Users/Documents/pythonscripts/test.py:40: DeprecationWarning: Flags not at the start of the expression 'gpgsig -----BEGIN SI' (truncated)
  if re.search(pattern,log):
  • Does this answer your question? [Regex expressions - Deprecation warning](https://stackoverflow.com/questions/72284064/regex-expressions-deprecation-warning) – Maximilian Ballard Feb 11 '23 at 02:51
  • It's not clear what you expect as an answer here. If your question is about the _warning_ message, then that's explained in [the proposed related post](https://stackoverflow.com/questions/72284064/regex-expressions-deprecation-warning). It's a _deprecation warning_, which shouldn't be directly related to the matching of the pattern. Or, is your question about _not finding a match_ given your regex pattern? Then that's a different thing, indicating your regex pattern is wrong. – Gino Mempin Feb 11 '23 at 03:01

1 Answer


As noted in the comments, the DeprecationWarning is a warning rather than an error, and under normal circumstances it should not stop code execution (so I'll address it last). Assuming the goal is to match the log variable as provided, the pattern has two problems: how it matches the dashes, and how it uses the start/end anchors (^ and $).

The pattern requires more dashes than log contains, because the {3,} quantifier applies only to the single dash right before it. MESSAGE------{3,} therefore means five literal dashes followed by at least three more, so gpgsig -----BEGIN SIGNED MESSAGE------{3,} only matches text with 8 or more dashes after MESSAGE, whereas log has only 5:

>>> pattern = "gpgsig -----BEGIN SIGNED MESSAGE------{3,}"
>>> re.search(pattern, "gpgsig -----BEGIN SIGNED MESSAGE-------")
>>> re.search(pattern, "gpgsig -----BEGIN SIGNED MESSAGE--------")
<re.Match object; span=(0, 40), match='gpgsig -----BEGIN SIGNED MESSAGE--------'>
>>> re.search(pattern, "gpgsig -----BEGIN SIGNED MESSAGE-----------------")
<re.Match object; span=(0, 49), match='gpgsig -----BEGIN SIGNED MESSAGE----------------->

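Additionally, for ^ and $ to match at line boundaries rather than only at the start and end of the whole string, the MULTILINE flag must be provided. Note also that the closing marker in log is indented by one space and has a single run of dashes before END, so the original ^-{3,} -----END cannot match it.
Adjusting the code as follows should match the contents of log: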

pattern = "gpgsig -{3,}BEGIN SIGNED MESSAGE-{3,}$(?s).*?^ -{3,}END SIGNED MESSAGE-{3,}"

if re.search(pattern, log, flags=re.MULTILINE):
    print("Found a match")

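Execution result (the warning is still emitted, but a match is now found). This output comes from a Python version that only warns; Python 3.11 and later raise re.error for a misplaced (?s), so there one of the fixes further down is required: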

$ python /tmp/test.py
/tmp/test.py:40: DeprecationWarning: Flags not at the start of the expression 'gpgsig -{3,}BEGIN SI' (truncated)
  if re.search(pattern, log, flags=re.MULTILINE):
Found a match
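
The effect of re.MULTILINE on ^ and $ can be seen in isolation with a small sketch like the one below. The sample text and the variable names are made up for illustration and are not part of the original script.

```python
import re

# Hypothetical two-line sample, unrelated to the commit log above.
text = "first line\nsecond line\n"

# Without MULTILINE, ^ anchors only at the very start of the string and
# $ only at the very end (or just before a final trailing newline).
starts_default = re.findall(r"^\w+", text)
ends_default = re.findall(r"line$", text)

# With MULTILINE, both anchors also apply at every line boundary.
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)
ends_multiline = re.findall(r"line$", text, flags=re.MULTILINE)

print(starts_default, starts_multiline)  # ['first'] ['first', 'second']
print(ends_default, ends_multiline)      # ['line'] ['line', 'line']
```

This is why the BEGIN line followed by $ can only match in the middle of log once the flag is passed.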

As for the warning itself, it is triggered by the inline modifier (?s). Python applies such a global inline flag to the entire pattern no matter where it is written, so placing it mid-pattern is misleading and has been deprecated. I can think of three ways to avoid the warning, but each one should be checked against your use case (if it fits at all):

  1. Limit the inline modifier to the specific sub-pattern that needs it (keeping flags=re.MULTILINE in the re.search call):
pattern = "gpgsig -{3,}BEGIN SIGNED MESSAGE-{3,}$(?s:.*?)^ -{3,}END SIGNED MESSAGE-{3,}"
  2. Move the inline modifier to the start of the pattern (again keeping flags=re.MULTILINE):
pattern = "(?s)gpgsig -{3,}BEGIN SIGNED MESSAGE-{3,}$.*?^ -{3,}END SIGNED MESSAGE-{3,}"
  3. Use the appropriate re flag (re.S or re.DOTALL) instead of the inline modifier:
pattern = "gpgsig -{3,}BEGIN SIGNED MESSAGE-{3,}$.*?^ -{3,}END SIGNED MESSAGE-{3,}"
if re.search(pattern, log, flags=re.MULTILINE|re.DOTALL):
    print("Found a match")
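
If the markers must have exactly five dashes on each side, one option is to spell the dashes out literally and anchor each marker line with ^ and $. The following is only a sketch of that idea built on the flag-based variant: the shortened log, the placeholder payload lines, the SIGNATURE name and the body group are all assumptions made up for illustration.

```python
import re

# Shortened, made-up commit object; the payload lines are placeholders.
log = """tree e76fa5ccd76492d843b6a4a06038d1c3b5aef6f8
author firstname lastname <userid@company.com> 1676061999 -0800
gpgsig -----BEGIN SIGNED MESSAGE-----
 AAAA
 BBBB
 -----END SIGNED MESSAGE-----

commit message
"""

# Literal five-dash runs plus ^/$ on each marker line, so a marker with
# more or fewer dashes is rejected. DOTALL lets .*? cross line breaks.
SIGNATURE = re.compile(
    r"^gpgsig -----BEGIN SIGNED MESSAGE-----$"
    r"(?P<body>.*?)"
    r"^ -----END SIGNED MESSAGE-----$",
    flags=re.MULTILINE | re.DOTALL,
)

match = SIGNATURE.search(log)
if match:
    # Drop the continuation-line indentation and any empty lines.
    body_lines = [
        line.strip()
        for line in match.group("body").splitlines()
        if line.strip()
    ]
    print("Found a match:", body_lines)
```

Because the flags are passed to re.compile rather than written inline, no inline modifier appears in the pattern at all.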
  • I want an exact match of `gpgsig -----BEGIN SIGNED MESSAGE-----` and `-----END SIGNED MESSAGE-----`. How does the regex change for this? – user3508811 Feb 12 '23 at 02:02