
We have some bad JSON we're trying to parse. Unfortunately it isn't valid JSON, as the payload contains unquoted NaNs.

We are switching from the long-deprecated request library to axios. This seems to have doubled the memory usage of our current way of fixing this payload, and our environment is memory constrained. The file is 19MB and our constraint is 50MB. I presume the regexes/parse are making yet another copy of the JSON in memory, on top of the parsing that axios is doing.

We are using a wrapper around axios, so interfacing directly with it is limited. I'd have to reimplement parts of the wrapper to get at it, and it's an internal library maintained outside our team.

I know the keys we're trying to keep, so just discarding the rest of the structure instead of dealing with the NaN is actually preferable.

The structure we want looks like DataUsages[]:

export interface DataUsages {
    dataUsageId: string;
    dataUsageName: string;
}

The structure we're getting has additional items in the objects in the array. We don't care about "dataUsageDownstreamUsages" at all, and that key can be (and is being) discarded:

[
  {
    "dataUsageId": "42",
    "dataUsageName": "myname",
    "dataUsageDownstreamUsages": [NaN]
  }
]
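That bare token is precisely what JSON.parse rejects; a minimal repro (the exact error text varies by Node version):

    try {
      JSON.parse('[NaN]')
    } catch (e) {
      // e.g. "Unexpected token N in JSON at position 1"
      console.log((e as SyntaxError).message)
    }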

current

This is our current method:

      // Replace bare NaN tokens (and stray NANs) before parsing,
      // since JSON.parse rejects them outright.
      const reg1 = /\[NaN]|NaN/gm
      const parsed: DataUsages[] = JSON.parse(
        requireNonNullish(response.body, 'body').replace(reg1, '""').replace(/NAN/gm, ''),
      )
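One aside before the streaming approach: each .replace() on the 19MB body allocates a fresh full-size string, so chaining two of them produces two intermediates. Folding both passes into one alternation with a replacer function drops one of those copies (a sketch, performing the same substitutions as above):

    // One pass instead of two: saves one full-size intermediate string.
    const reg = /\[NaN]|NaN|NAN/gm
    const parsed: DataUsages[] = JSON.parse(
      requireNonNullish(response.body, 'body').replace(reg, m => (m === 'NAN' ? '' : '""')),
    )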

stream-json

Right now I'm looking at using stream-json for this.

If I use parser it bails, presumably when it reaches a NaN. So I'm looking at the disassembler, but I don't understand how to do it with that.

  // A no-op read() implementation; we push the whole body manually.
  const read = new Readable({ read() {} })
  read.push(requireNonNullish(response.body, 'body'))
  read.push(null)

  const pipeline = chain([read, disassembler(), pick({ filter: 'data' }), data => this.log.trace('data', data)])
  pipeline.on('data', data => this.log.trace('filter', data))

Obviously this code is not complete.
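For the sake of discussion, the shape I'm imagining is roughly the following (an untested sketch: it assumes the payload is the bare top-level array shown above, that "NaN" never appears inside quoted string values, and that stream-json/stream-chain typings are installed; NaNScrubber is a name I made up):

    import { Readable, Transform, TransformCallback } from 'stream'
    const { chain } = require('stream-chain')
    const { parser } = require('stream-json')
    const { streamArray } = require('stream-json/streamers/StreamArray')

    // Rewrites bare NaN tokens to null before the strict parser sees them.
    // Carries the last two characters of each chunk forward so a token
    // split across a chunk boundary is still caught.
    class NaNScrubber extends Transform {
      private tail = ''
      _transform(chunk: Buffer | string, _enc: string, done: TransformCallback) {
        const text = (this.tail + String(chunk)).replace(/NaN/g, 'null')
        this.tail = text.slice(-2)
        done(null, text.slice(0, -2))
      }
      _flush(done: TransformCallback) {
        done(null, this.tail)
      }
    }

    const usages: DataUsages[] = []
    const pipeline = chain([
      Readable.from([requireNonNullish(response.body, 'body')]),
      new NaNScrubber(),
      parser(),      // tokenizes the now-valid JSON incrementally
      streamArray(), // emits { key, value } per element of the top-level array
    ])
    pipeline.on('data', ({ value }) => {
      // keep only the two keys we care about
      usages.push({ dataUsageId: value.dataUsageId, dataUsageName: value.dataUsageName })
    })
    pipeline.on('end', () => this.log.trace('usages', usages.length))

If the wrapper hands over the whole body as one string, the scrubber's replace still makes one full-size pass over it, but downstream of that only one array element is materialized at a time rather than a full parsed tree.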

Other libraries are acceptable. Please provide a complete example.

xenoterracide

1 Answer


OK, first of all let's create fake data:

const fs = require('fs');

// Repeat the mock entry until the payload is ~19MB on disk.
// Note: JSON.stringify serializes NaN as null.
function generateFakeData() {
    const payload = { res: [] }
    const mock = {
        "dataUsageId": "42",
        "dataUsageName": "myname",
        "dataUsageDownstreamUsages": [NaN]
    }
    let totalFileSize = 0;
    const TARGET_SIZE_MB = 19;
    const ENTRY_SIZE_BYTES = JSON.stringify(mock).length;
    const ENTRY_SIZE_MB = ENTRY_SIZE_BYTES / 1024 / 1024;

    while (totalFileSize < TARGET_SIZE_MB) {
        totalFileSize += ENTRY_SIZE_MB;
        payload.res.push(mock);
    }
    fs.appendFileSync('fake.json', JSON.stringify(payload));
}

Then let's verify the file is the right size:

> ls -ltrh | grep fake
-rw-r--r--   1 naor.tedgi  staff    19M Jun 15 09:34 fake.json

Let's run the node app under the heap constraint and mutate the original response:

node --max-old-space-size=50 index.js

index.js

const fs = require('fs');

// Parse the whole file, then drop the unwanted key from every entry in place.
const data = JSON.parse(fs.readFileSync('fake.json', { encoding: 'utf8', flag: 'r' }));
data.res.forEach(entry => {
    delete entry.dataUsageDownstreamUsages
})
console.log(data)
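To confirm the script actually stays inside the budget, process.memoryUsage() can report the heap figure at the end:

    // Rough sanity check against the 50MB old-space budget.
    const { heapUsed } = process.memoryUsage();
    console.log(`heap used: ${(heapUsed / 1024 / 1024).toFixed(1)} MB`);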
Naor Tedgi