Let's say I have this string:

,"mimeType":"video/mp4;+codecs=\"avc1.42001E,+mp4a.40.2\"","bitrate":353051,"width":640,"height":320,"lastModified":"1543659519195688","contentLength":"24469560","quality":"medium","fps":24,"qualityLabel":"360p","projectionType":"RECTANGULAR","averageBitrate":35300;codecs=\"avc1.64001F,+mp4a.40.2\"","bitrate":987359,"width":1280,"height":640,"lastModified":"1543660211977003","quality":"hd720","fps":24,"qualityLabel":"720p","projectionType":"RECTANGULAR

I need to extract every width/height pair that comes right after a "bitrate" value.

Notice that the string has width, height twice. I need to get both pairs:

"bitrate":353051,"width":640,"height":320

"bitrate":987359,"width":1280,"height":640

Notice that I only need the pairs that are immediately preceded by a "bitrate" value. If there's no "bitrate" value right before a pair, I don't want it.

So the answer should be:

640,320
1280,640

I have pasted the string here:

https://regex101.com/r/VXAyvV/2

2 Answers

You can use this regex, which will only match width and height when they immediately follow a value for bitrate:

"bitrate":\d+,"width":(\d+),"height":(\d+)

The width and height will be captured in group 1 and 2 respectively.
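
As a rough illustration, here is how that pattern might be applied in Python. The `sample` string is a shortened stand-in for the one in the question (an assumption, as is the helper name `extract_sizes`), with one extra width/height pair that has no "bitrate" in front of it, to show that it is skipped.

```python
import re

# Shortened stand-in for the question's string (assumption: same
# "key":value layout). The middle width/height pair has no "bitrate"
# directly before it, so it should not be reported.
sample = (
    '"bitrate":353051,"width":640,"height":320,"quality":"medium",'
    '"fps":24,"width":100,"height":50,'
    '"bitrate":987359,"width":1280,"height":640,"quality":"hd720"'
)

# Group 1 is the width, group 2 is the height.
PATTERN = re.compile(r'"bitrate":\d+,"width":(\d+),"height":(\d+)')


def extract_sizes(text):
    """Return (width, height) pairs that directly follow a bitrate value."""
    return [(int(w), int(h)) for w, h in PATTERN.findall(text)]


for width, height in extract_sizes(sample):
    print(f"{width},{height}")
# prints:
# 640,320
# 1280,640
```

`findall` returns one tuple per match when the pattern has more than one group, which is why the two captures can be unpacked directly.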

Demo on regex101

Nick

You can use the following regular expression.

"bitrate":(?=.*?"width":(?<width>\d+))(?=.*?"height":(?<height>\d+))

Demo

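I tested it with the PCRE (PHP) regex engine, but it should work with any engine that supports positive lookaheads, which most do. The syntax for named groups varies between engines (Python, for example, writes (?P<width>\d+) instead of (?<width>\d+)).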

I used named capture groups, but they are not necessary: the width group comes first in the pattern, so the width is always group 1 and the height group 2, even if "height" precedes "width" in the string. Because each lookahead scans forward from "bitrate", there can be intervening fields between "bitrate" and "width" and between "bitrate" and "height".

Consider the following string.

"qual":"360p","bitrate":987359,"wt":90,"width":1280,"fps":24,"height":640,"last":"154"

When matched against the regex, the capture group named width will hold "1280" and the capture group named height will hold "640".
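
A minimal sketch of that behaviour in Python, assuming the named groups are rewritten in Python's (?P<name>...) spelling. The helper `first_size` and the `reordered` string are made up for illustration; `example` is the string shown above.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r'"bitrate":(?=.*?"width":(?P<width>\d+))(?=.*?"height":(?P<height>\d+))'
)


def first_size(text):
    """Return (width, height) for the first "bitrate" match, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return int(m.group("width")), int(m.group("height"))


example = ('"qual":"360p","bitrate":987359,"wt":90,"width":1280,'
           '"fps":24,"height":640,"last":"154"')
# Hypothetical variant with "height" before "width".
reordered = '"bitrate":987359,"height":640,"fps":24,"width":1280'

print(first_size(example))    # (1280, 640)
print(first_size(reordered))  # (1280, 640)
# Group 1 is still the width, because its group is written first.
print(PATTERN.search(reordered).group(1))  # 1280
```

Both lookaheads start from the position just after "bitrate":, so the order of "width" and "height" in the string does not affect which group receives which value.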

The regex engine performs the following operations.

"bitrate":       match '"bitrate":'
(?=              begin a positive lookahead
  .*?            lazily match 0+ chars other than newlines
  "width":       match '"width":'
  (?<width>\d+)  match 1+ digits and save to capture group 'width'
)                end positive lookahead
(?=              begin a positive lookahead
  .*?            lazily match 0+ chars other than newlines
  "height":      match '"height":'
  (?<height>\d+) match 1+ digits and save to capture group 'height'
)                end positive lookahead
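
The breakdown above can also be kept next to the pattern itself. This is a sketch using Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments; the named groups use Python's (?P<name>...) spelling, and `sample` is a shortened stand-in for the question's string (an assumption).

```python
import re

PATTERN = re.compile(
    r'''
    "bitrate":            # match the literal key
    (?=                   # begin a positive lookahead
      .*?                 # lazily match 0+ chars other than newlines
      "width":            # match the width key
      (?P<width>\d+)      # 1+ digits, saved to the group named width
    )                     # end positive lookahead
    (?=                   # begin a second positive lookahead
      .*?                 # lazily match 0+ chars other than newlines
      "height":           # match the height key
      (?P<height>\d+)     # 1+ digits, saved to the group named height
    )                     # end positive lookahead
    ''',
    re.VERBOSE,
)

# Shortened stand-in for the question's string. "averageBitrate" does not
# match because the pattern needs a quote right before a lowercase "b".
sample = (
    '"bitrate":353051,"width":640,"height":320,"fps":24,'
    '"averageBitrate":35300,'
    '"bitrate":987359,"width":1280,"height":640,"fps":24'
)

sizes = [(int(m["width"]), int(m["height"])) for m in PATTERN.finditer(sample)]
print(sizes)  # [(640, 320), (1280, 640)]
```

Only "bitrate": is consumed by each match; the lookaheads are zero-width, so `finditer` resumes right after the key and finds the second record normally.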
Cary Swoveland