I am trying to extract blocks of JSON data from a data stream in the following format:

    Some-Header-Name:Value
    Content-Length:Value
    Some-Other-Header:Value

    {JSON data string of variable length}

The stream contains many instances of the above pattern, and the length of the JSON data in each instance is different, as indicated by the preceding Content-Length header.

I wish to create a Regex that matches each of the content length header values and uses it to match the associated content block. I envisage something like this ...

    Content-Length:(?<LENGTH>\d+).*?\r\n\r\n(?<CONTENT>.{$<LENGTH>})

... but I'm not sure how to specify the quantifier for the CONTENT group as a dynamic value.

Note: although the headers are on separate lines and the content is separated from the headers by a blank line, there is no linefeed after the content, so it is not possible to use this to determine the end of content.

Any suggestions would be appreciated.

Thanks, Tim

kennytm
Tim Coulter

1 Answer

Regular expressions match strings, not numbers, and therefore they can't take a part of the matched string, convert it to a number, and reapply it as a quantifier within the same regex.

You'd have to do it in several steps:

  1. Match the header, extract the length value
  2. Build a new regex like @"(?<HEADER>...)(?<CONTENT>.{" + length + "})"
  3. Reapply that regex and extract the contents.
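
The three steps above can be sketched roughly as follows. This is an illustration in Python rather than .NET (so named groups are written `(?P<name>...)`), and the names `HEADER` and `extract_blocks` are made up for the example. It also assumes that Content-Length counts characters of the decoded text; if the stream reports bytes, the slicing would have to be done on bytes instead.

```python
import re

# Hypothetical header pattern: finds the next Content-Length header and
# skips any remaining headers up to the blank line (CRLF CRLF).
HEADER = re.compile(r"Content-Length:\s*(?P<length>\d+).*?\r\n\r\n", re.DOTALL)


def extract_blocks(stream: str) -> list[str]:
    """Return every content block in `stream`, sized by its Content-Length."""
    blocks = []
    pos = 0
    while True:
        # Step 1: match the header and extract the length value.
        header = HEADER.search(stream, pos)
        if header is None:
            break
        length = int(header.group("length"))

        # Step 2: build a second regex whose quantifier is that length.
        # DOTALL lets "." match newlines inside the JSON text.
        content_re = re.compile(r"(?P<content>.{%d})" % length, re.DOTALL)

        # Step 3: apply it exactly where the header block ended.
        content = content_re.match(stream, header.end())
        if content is None:
            break  # the stream ended before the full block arrived
        blocks.append(content.group("content"))
        pos = content.end()  # continue scanning after this block
    return blocks
```

Because the length is already known after step 1, the second regex is not strictly needed: taking the substring of `length` characters starting at `header.end()` would give the same result. The loop is kept in regex form here only to mirror the steps as listed.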
Tim Pietzcker
  • Thanks - I guess I was expecting too much. I can see that your approach will work (hence I have accepted your answer), but I was hoping that Regex would offer something that would extract many matches in a single operation. – Tim Coulter Oct 06 '10 at 17:52