
I have a serialized PHP string like this:

a:2:{s:4:"Test";s:29:"asdf'a\' ;"serialize here;?"!";s:5:"test2";a:1:{s:4:"Test";s:15:"serialize here!";}}

I want to write a regex that matches both of the substrings below:

s:15:"serialize

s:29:"asdf'a\' ;"serialize

How can I do that?

I tried something like s:\d+:".+?serialize

But it doesn't work: the match starts at an earlier s: than the one I want.

Barmar
  • Why not unserialize it and work with that? – Barmar Jun 17 '23 at 15:23
  • Your regexp works for me: https://ideone.com/9URK80 – Barmar Jun 17 '23 at 15:27
  • @Barmar because that's just an example; the real data is more complex and too large to unserialize and serialize again – chris tsironis Jun 17 '23 at 16:51
  • Please provide an example that shows the problem. – Barmar Jun 17 '23 at 16:52
  • Also, when I use the regex that I made, it gives me `s:4:"Test";s:29:"asdf'a\' ;"serialize` but I want `s:29:"asdf'a\' ;"serialize` – chris tsironis Jun 17 '23 at 16:53
  • This is not something that's easy to do with regexp. You don't want it to match across other `s:`, but a regexp can't tell the difference between that inside a string and outside a string. – Barmar Jun 17 '23 at 16:58
  • Why do you want to use a regex and not [unserialize](https://www.php.net/manual/en/function.unserialize.php) the data first? See this page for [Structure of a Serialized PHP string](https://stackoverflow.com/questions/14297926/structure-of-a-serialized-php-string) – The fourth bird Jun 18 '23 at 15:45
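As the comments note, a regex cannot tell an s: inside a string value from one outside it, but the format itself can: every string is written as s:N:"...", where N is the byte length of the value. Below is a minimal sketch of a length-aware scan that avoids a full unserialize. It is in Python purely for illustration, the function name find_strings_containing is hypothetical, and it assumes the input is well formed.

```python
import re

# Sample data from the question, as bytes, because PHP's s:N counts bytes.
data = rb'''a:2:{s:4:"Test";s:29:"asdf'a\' ;"serialize here;?"!";s:5:"test2";a:1:{s:4:"Test";s:15:"serialize here!";}}'''

TOKEN = re.compile(rb's:(\d+):"')

def find_strings_containing(data: bytes, needle: bytes):
    """Return (declared_length, value) for every string token whose value contains needle.

    Hypothetical helper: it trusts the declared length to jump over each value,
    so text inside a value that merely looks like s:N:" is never examined.
    """
    results = []
    pos = 0
    while True:
        m = TOKEN.search(data, pos)
        if m is None:
            break
        length = int(m.group(1))
        start = m.end()
        end = start + length
        # A genuine string token is closed by a quote and a semicolon.
        if data[end:end + 2] == b'";':
            value = data[start:end]
            if needle in value:
                results.append((length, value))
            pos = end + 2          # skip past the whole value
        else:
            pos = m.start() + 1    # not a real token here, keep scanning
    return results

found = find_strings_containing(data, b"serialize")
for length, value in found:
    print(length, value)
```

On the sample data this reports the strings declared with lengths 29 and 15. The scan only reads the input once and never builds the full structure, which may help when the data is too large to unserialize comfortably.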

2 Answers


You can use the following.

This assumes there is no colon (:) between the s:\d+: prefix and serialize.

s:\d+:[^:]+serialize

Matches

s:29:"asdf'a\' ;"serialize
s:15:"serialize
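
A quick way to check this is to run the pattern against the sample string from the question. The sketch below uses Python's re module only for illustration; the second input, with a colon inside the value, is a made-up example showing the limitation mentioned above.

```python
import re

# Sample data from the question.
data = r'''a:2:{s:4:"Test";s:29:"asdf'a\' ;"serialize here;?"!";s:5:"test2";a:1:{s:4:"Test";s:15:"serialize here!";}}'''

pattern = re.compile(r's:\d+:[^:]+serialize')

# [^:]+ cannot cross a colon, so a match cannot run past the next s:N: prefix.
matches = pattern.findall(data)
for m in matches:
    print(m)

# Hypothetical value containing a colon before "serialize": nothing matches.
with_colon = 's:20:"note: serialize here";'
colon_matches = pattern.findall(with_colon)
print(colon_matches)
```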
Reilas
  • Yes, unfortunately the colon sometimes also exists as part of the string. I don't think I can exclude any single character, only a combination of characters like `s:`, but when I try this it doesn't work as expected: `s:\d+:[^(s:)]+serialize` – chris tsironis Jun 18 '23 at 05:32
  • @christsironis, so, the colon would be within two double quotation marks? Additionally, can the text contain a double quotation mark if it is escaped? For example, _s:29:"asdf'a\' : \"example\" ;"serialize_. – Reilas Jun 19 '23 at 03:08

You can use:

\bs:\d+:"(?:(?!\bs:\d).)*?serialize

A couple of notes regarding your attempt:

  • s:\d+ would also match inside something like "likes:123". To avoid this, I added \b.
  • .+? requires at least one character, but in one of your expected outputs there are zero characters between the opening quote and serialize. Also, the lazy quantifier only minimizes where the match ends, not where it starts: the regex engine takes the leftmost possible start, which is why it matched from the first s:\d+ through serialize rather than the nearest pair.

To match the closest pair of s:\d+ and serialize, I use (?:(?!\bs:\d).)*?. It matches the shortest possible run of characters that does not contain s:\d.

Demo here.
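
For a side-by-side comparison, here is a minimal sketch in Python (chosen only for illustration) that runs the pattern from the question and the pattern from this answer against the same sample string. The variable names are arbitrary.

```python
import re

# Sample data from the question.
data = r'''a:2:{s:4:"Test";s:29:"asdf'a\' ;"serialize here;?"!";s:5:"test2";a:1:{s:4:"Test";s:15:"serialize here!";}}'''

# Pattern from the question: lazy, but it still starts at the leftmost s:N:".
naive = re.compile(r's:\d+:".+?serialize')

# Pattern from this answer: each consumed character must not begin a new s:N token.
tempered = re.compile(r'\bs:\d+:"(?:(?!\bs:\d).)*?serialize')

naive_matches = naive.findall(data)
tempered_matches = tempered.findall(data)

print(naive_matches[0])      # begins at s:4:"Test", too early
for m in tempered_matches:
    print(m)

# Effect of \b: without it, s:\d+ is found inside "likes:123".
without_boundary = re.search(r's:\d+', 'likes:123')
with_boundary = re.search(r'\bs:\d+', 'likes:123')
```

Note that this still fails if a string value itself contains text that looks like s: followed by a digit at a word boundary, since the lookahead cannot tell that apart from a real token.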

markalex