I have this string:

{"TimePeriod": {"Start": "2017-03-01", "End": "2017-04-01"}, "Total": {"UnblendedCost": {"Amount": "2942.25119998", "Unit": "USD"}, "UsageQuantity": {"Amount": "20835", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-04-01", "End": "2017-05-01"}, "Total": {"UnblendedCost": {"Amount": "2982.62609983", "Unit": "USD"}, "UsageQuantity": {"Amount": "21049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-05-01", "End": "2017-06-01"}, "Total": {"UnblendedCost": {"Amount": "1399.04829988", "Unit": "USD"}, "UsageQuantity": {"Amount": "23010", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, "Total": {"UnblendedCost": {"Amount": "962.47549987", "Unit": "USD"}, "UsageQuantity": {"Amount": "20049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false}

I am working on a regex to split the above string into multiple records, so that each record looks like this:

{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, "Total": {"UnblendedCost": {"Amount": "962.47549987", "Unit": "USD"}, "UsageQuantity": {"Amount": "20049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false}

My current approach is

(\{\"TimePeriod\":){1}.+(false\}){1}

But this matches the entire string instead of each record. I think the solution involves a lookahead that ensures TimePeriod appears only once in each match, but I don't know how to write it. Any pointers will be appreciated.

*There are no newlines between the records in the actual string; I added them only for presentation.

Forrest

2 Answers

1

This seems to work for your needs. I only changed your regex slightly, from the greedy .+ to the lazy .+?, so that each match stops at the first false} instead of the last:

(\{\"TimePeriod\":){1}.+?(false\}){1}

The {1} quantifiers are redundant, so it can be simplified to:

(\{\"TimePeriod\":).+?(false\})

Another method uses a negative lookahead, which lets each character through only if the text false does not start there:

(\{\"TimePeriod\":)(?:(?!false).)+(false\})
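As a rough illustration of how the lazy and lookahead patterns behave, here is a minimal Python sketch. The sample string is a shortened, two-record version of the data in the question (the Total field is left out for brevity), and the helper name extract_records is made up for this example. Sumologic's own regex operators are not shown here, so treat this only as a way to check the patterns themselves.

```python
import json
import re

# Shortened sample in the same shape as the question's data
# (assumption: the "Total" field is omitted to keep the example small).
text = (
    '{"TimePeriod": {"Start": "2017-03-01", "End": "2017-04-01"}, '
    '"Groups": [], "Estimated": false},'
    '{"TimePeriod": {"Start": "2017-04-01", "End": "2017-05-01"}, '
    '"Groups": [], "Estimated": false}'
)

# Lazy quantifier: stop at the first "false}" after each "{"TimePeriod":".
LAZY = re.compile(r'\{"TimePeriod":.+?false\}')

# Negative lookahead: consume a character only if "false" does not start there.
TEMPERED = re.compile(r'\{"TimePeriod":(?:(?!false).)+false\}')


def extract_records(pattern, s):
    """Return the full text of every non-overlapping match (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(s)]


lazy_records = extract_records(LAZY, text)
tempered_records = extract_records(TEMPERED, text)

for record in lazy_records:
    # Each match is a standalone JSON object without the separating comma.
    print(json.loads(record)["TimePeriod"])
```

Both patterns rely on every record ending in false}, as in the question's data; a record with "Estimated": true would not be matched on its own by either of them.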
Thm Lee
0

You may split on the following pattern, which uses a lookahead:

,(?=\{"TimePeriod":)

The logic says to split wherever a comma is immediately followed by the text {"TimePeriod":. Because the lookahead itself consumes no characters, only the comma is removed at each split point. Note that there is no split at the very start of the text, because no comma precedes the first record.
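A minimal sketch of this split in Python, assuming the same shape of input as in the question (the sample is shortened to two records without the Total field, and the variable names are illustrative). It uses re.split, which drops the matched comma from the output.

```python
import json
import re

# Shortened sample in the same shape as the question's data
# (assumption: the "Total" field is omitted to keep the example small).
text = (
    '{"TimePeriod": {"Start": "2017-05-01", "End": "2017-06-01"}, '
    '"Groups": [], "Estimated": false},'
    '{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, '
    '"Groups": [], "Estimated": false}'
)

# Split at each comma that is immediately followed by {"TimePeriod":.
# The lookahead (?=...) only checks the following text and consumes nothing,
# so the comma is the only thing removed.
SEPARATOR = re.compile(r',(?=\{"TimePeriod":)')

parts = SEPARATOR.split(text)

for part in parts:
    # Each piece parses as a standalone JSON object.
    print(json.loads(part)["TimePeriod"]["Start"])
```

Unlike the matching approaches, this one does not depend on how each record ends, only on how the next record begins.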

Tim Biegeleisen
  • Hi Tim, is there any way to eliminate the comma "," at the end of each line with regex? I am currently working with Sumologic, so there is no way around using regex – Forrest Apr 06 '18 at 01:49
  • Tell me how you are using this regex. Are you looking to do a replacement? – Tim Biegeleisen Apr 06 '18 at 01:51
  • So in Sumologic, this string comes from a log file, and I need to use a regex to split it into multiple records in JSON format so that I can aggregate over them. You can read this for an example: https://support.sumologic.com/hc/en-us/community/posts/206795357-Is-it-possible-to-split-a-JSON-list-into-multiple-records- – Forrest Apr 06 '18 at 02:01
  • @Forrest I gave you an update which should meet your requirements. – Tim Biegeleisen Apr 06 '18 at 02:18