I downloaded a webpage that has JSON embedded in its JavaScript. I need to decode it, but the JSON is malformed: it mixes single and double quotes, and string values contain unescaped quotes, which makes the decode subroutine fail.
NOTE: The JSON is extracted as a block into a string variable. The DATA block below represents the kind of malformed JSON I receive (the problem is mostly in the parts that hold input from the website's visitors). The real JSON is quite deeply nested.
So far I could not find a better solution than the code attached below, which is still incorrect.
Is there a better way to repair the received JSON? [Maybe with (??{ code }) in a regex?]
use strict;
use warnings;
use diagnostics;
while( <DATA> ) {
    chomp;
    print "IN: $_\n";
    s/"/'/g;
    print "OUT: $_\n" if s/'(.*?)'\s*:\s*'(.*?)'(,|\s*\})/"$1": "$2"$3/g;
}
__DATA__
{ "d1": "some data here", "d2":"some "data" here", "d3": "some "data" here "year"", "d4": { "x1": "some "data" here" } }
{ "d2": "some data here", "d2":"some "data" here", "d3": "some "data" here "year"" }
{ 'd3': 'some data here', "d2":"some "data" here", "d3": "some "data" here "year"" }
{ "d4": 'some data here', "d2":"some "data" here", "d3": "some "data" here "year"", "d4": { "x1": "some "data" here" } }
{ 'd5': "some data here", "d2":"some "data" here", "d3": "some "data" here "year"" }
Output (note the mismatched quotes around the nested d4 and x1 keys in the first and fourth lines):
IN: { "d1": "some data here", "d2":"some "data" here", "d3": "some "data" here "year"", "d4": { "x1": "some "data" here" } }
OUT: { "d1": "some data here", "d2": "some 'data' here", "d3": "some 'data' here 'year'", "d4': { 'x1": "some 'data' here" } }
IN: { "d2": "some data here", "d2":"some "data" here", "d3": "some "data" here "year"" }
OUT: { "d2": "some data here", "d2": "some 'data' here", "d3": "some 'data' here 'year'" }
IN: { 'd3': 'some data here', "d2":"some "data" here", "d3": "some "data" here "year"" }
OUT: { "d3": "some data here", "d2": "some 'data' here", "d3": "some 'data' here 'year'" }
IN: { "d4": 'some data here', "d2":"some "data" here", "d3": "some "data" here "year"", "d4": { "x1": "some "data" here" } }
OUT: { "d4": "some data here", "d2": "some 'data' here", "d3": "some 'data' here 'year'", "d4': { 'x1": "some 'data' here" } }
IN: { 'd5': "some data here", "d2":"some "data" here", "d3": "some "data" here "year"" }
OUT: { "d5": "some data here", "d2": "some 'data' here", "d3": "some 'data' here 'year'" }
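One direction that might work better than a single substitution is to scan the text with \G and decide quote by quote whether it is structural. The sketch below is only a heuristic under stated assumptions: a string is closed by the same quote character that opened it, and only when that quote is followed (after optional whitespace) by one of : , } ] or the end of input; every other double quote inside a string is treated as data and escaped. The name repair_json is hypothetical, and the approach will misfire on visitor input that itself contains something like `", ` inside a value.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Hypothetical helper (an assumption, not an existing API): rewrite quasi-JSON
# so every string is double-quoted and inner double quotes are escaped.
sub repair_json {
    my ($in) = @_;
    my $out = '';
    pos($in) = 0;
    while ( pos($in) < length $in ) {
        if ( $in =~ /\G(["'])/gc ) {
            my $q   = $1;    # quote character that opened this string
            my $buf = '';
            while ( pos($in) < length $in ) {
                # Closing quote: same character, followed by : , } ] or end.
                last if $in =~ /\G\Q$q\E(?=\s*(?:[:,}\]]|\z))/gc;
                $in =~ /\G(.)/gcs or last;
                my $c = $1;
                # Escape characters that would break a JSON string.
                $buf .= ( $c eq '"' || $c eq '\\' ) ? "\\$c" : $c;
            }
            $out .= qq{"$buf"};
        }
        else {
            # Structural text between strings is copied unchanged.
            $in =~ /\G([^"']+)/gc and $out .= $1;
        }
    }
    return $out;
}

my $raw  = q({ 'd3': 'some data here', "d2":"some "data" here", "d4": { "x1": "some "data" here "year"" } });
my $data = decode_json( repair_json($raw) );
print $data->{d4}{x1}, "\n";
```

Because the scanner never looks at nesting, the depth of the structure does not matter; only the characters right after each quote do. Backslashes are escaped as well, so this assumes the input contains no valid escape sequences that should be preserved.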