3

I have a server request which may return a huge JSON list (~100K records, ~50 MB) of points that I have to draw on a canvas using D3.js. I'd like to draw them as they arrive, in order to favor interactivity and spare memory.

So I enabled chunked transfer encoding on the server side and tried this on the client side:

d3.json('?json=qDefects&operationid=' + opid) // my request 
  .on("load", function (json) {
    draw(json); // this works, but only after a long delay that I'd avoid...
  })
  .on("progress", function (json) {
    draw(json); // but this fails : json is not yet available here
  })
  .get();

Is it possible to handle the JSON in chunks as they are loaded? Would it help to structure the JSON data differently? Currently it is a single array, so I have:

[{"x":1, "y":2},{"x":2, "y":3}, // chunk 1
...
{"x":6845, "y":239426},{"x":51235, "y":234762}] // last chunk

Would it help to divide the points into smaller arrays?

Gerardo Furtado
Dr. Goulu

3 Answers

2

See the included fiddle: http://jsfiddle.net/Q5Jag/12412/

While the previous answer is correct in that you can't modify the progress event, you can simply copy its value into an outer variable. The following code lets you reprocess the string and send it to D3:

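var x = '';
d3.json("https://api.myjson.com/bins/1d7yoi")
  .on("progress", function(d) {
    // copy the partial response text received so far into the outer variable
    x = d.responseText;
    // any string processing can happen here; a prefix is added as a demo
    x = "hello" + x;
    console.log(x);
  })
  .on("load", function() {
    console.log("done");
  })
  .get();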

You can assign responseText to the variable x and operate on x however you wish.

Eric Yang
  • You are just associating a global with the returned value of an asynchronous function, and that's definitely an anti-pattern. Besides that, you're not manipulating the data: you are just making the same change to the string (not to any object) several times a second. – Gerardo Furtado Jul 10 '18 at 23:13
  • That example was to illustrate that you can operate on the value that is emitted by the progress handler. I could have easily taken a partial, invalid JSON string, "repaired" it, and sent it off to D3 for plotting. As for whether it's a global variable, I can also wrap the thing in a function so it's not polluting the global scope. – Eric Yang Jul 11 '18 at 01:22
  • For the OP's use case, you don't need to "manipulate" the response object. What you want is to take a partial response and send it off for plotting, which my answer does do. – Eric Yang Jul 11 '18 at 01:23
  • No, it doesn't. First, the `d3.json` callback has to wait for the XHR to be completed. Second, that's a string, which is loaded several times a second: you'll have to parse the (same) growing string again and again and send it to *another* function, not to the `d3.json` callback. In the end, it will take way more time. Just set up some D3 code trying your solution and you'll see for yourself. – Gerardo Furtado Jul 11 '18 at 01:29
  • While everything you say is correct, I think it misses the point. It's the difference between, say, total render time and time to first visible point. I was targeting the shortest time in which the first data point can be rendered. Yes, you can't rely on the callback, but I see nothing wrong with calling another rendering function. As for parsing the same growing string over and over again, I can just as easily maintain a variable for the length of the previous string, truncate appropriately, fix the string so it's proper JSON, and go on my merry way. – Eric Yang Jul 11 '18 at 01:43
0

tl;dr: you cannot manipulate the JSON using the progress event.


First of all, you are probably using d3.request (D3 v3 and v4), not d3.fetch (D3 v5). That's an important difference because in both microlibraries the method has the same name, which is d3.json. However, d3.json wraps an XMLHttpRequest in the former, while it returns a Promise in the latter.
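
As a rough illustration of that difference, the sketch below dispatches on what d3.json returns. `loadJson` is a hypothetical helper, not part of D3. It assumes that a d3.request style call `d3.json(url)` (no callback) returns a request object with `on` and `get`, and that a d3.fetch style call returns a Promise:

```javascript
// Hypothetical helper (not part of D3): call d3.json in whichever style
// the loaded D3 version supports, and pass the parsed data to onData.
function loadJson(d3, url, onData) {
  const result = d3.json(url);
  if (typeof result.then === "function") {
    // d3.fetch style (D3 v5): d3.json returns a Promise of the parsed data
    return result.then(onData);
  }
  // d3.request style (D3 v3/v4): nothing is sent until .get() is called
  return result.on("load", onData).get();
}
```

In both styles, `onData` only runs once the whole response has been downloaded and parsed.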

Second, and most important, this seems to be (unfortunately) an XY problem. You said "I'd like to draw them as they arrive in order to favor interactivity and spare memory", but the problem is that you can't: even if you could manipulate the data while it arrives (and you can't, see below), D3 will only start drawing anything after the XHR (or the Promise) finishes downloading the data. That means that, with 50 MB of data, the user will stare at a blank page for several seconds. So the best advice here is to rethink the size of the data file and the whole data visualization.

Back to the question:

The progress event is used just to monitor the progress. According to the W3C:

This specification defines an event interface — ProgressEvent — that can be used for measuring progress. (emphasis mine)

We can check this in the following demo (I'm using an array with the objects you shared in your question, copy/pasted several times). We can use srcElement.response to see the loaded JSON, but we cannot change it:

d3.json("https://api.myjson.com/bins/1d7yoi")
  .on("progress", function(d) {
    console.log(d.srcElement.response)
  })
  .on("load", function() {
    console.log("done")
  })
  .get()
<script src="https://d3js.org/d3.v4.min.js"></script>

For instance, you can see that in this silly attempt to change anything in the string, nothing is changed:

d3.json("https://api.myjson.com/bins/1d7yoi")
  .on("progress", function(d) {
    d.srcElement.response[0] = "foo";
    console.log("First character is: " + d.srcElement.response[0])
  })
  .on("load", function(data) {
    console.log("JSON:" + JSON.stringify(data))
  })
  .get()
<script src="https://d3js.org/d3.v4.min.js"></script>
Gerardo Furtado
0

Thanks to the previous answers, I ended up with this:

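function progressLoad(f) {
  let start = 0; // offset of the first character not yet parsed
  return function (event) {
    // only look at the text received since the last call
    let str = event.responseText.slice(start);
    let i = str.indexOf("{");     // start of the first complete record
    let j = str.lastIndexOf("}"); // end of the last complete record
    if (i < 0 || j < i) return;   // no complete record yet
    // slice's end index is exclusive, so j + 1 keeps the closing brace
    let data = JSON.parse("[" + str.slice(i, j + 1) + "]");
    f(data);
    start = start + j + 1;
  }
}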

d3.json('?json=qDefects&operationid=' + opid)
  .on("progress", progressLoad(draw))
  .get(); // without a callback, the request is only sent when .get() is called

It works pretty well in my (simple) situation, where I have no nested {}. However, I also made sure my server sends one chunk per record of my request, and it looks like responseText grows by these chunks too, so I always have matching {} in str.
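
If records could contain nested objects, or braces inside string values, looking for the last "}" would no longer be enough. A minimal sketch of one alternative, assuming the response is a single top-level array of objects: scan the incoming text once, tracking brace depth and whether the scanner is inside a string, and emit each top-level object as soon as it closes. `makeRecordScanner` and its `push` method are hypothetical names, not part of D3:

```javascript
// Hypothetical helper: incrementally extract the top-level objects of a JSON
// array whose text arrives as a sequence of chunks.
function makeRecordScanner(onRecord) {
  let depth = 0;        // current {} nesting depth
  let inString = false; // true while inside a JSON string literal
  let escaped = false;  // true right after a backslash inside a string
  let current = "";     // text of the record being assembled

  return {
    push: function (chunk) {
      for (const ch of chunk) {
        if (depth > 0) current += ch; // collect everything inside a record
        if (inString) {
          if (escaped) escaped = false;
          else if (ch === "\\") escaped = true;
          else if (ch === '"') inString = false;
        } else if (ch === '"') {
          inString = true;
        } else if (ch === "{") {
          if (depth === 0) current = ch; // a new top-level record starts here
          depth++;
        } else if (ch === "}" && depth > 0) {
          depth--;
          if (depth === 0) {
            onRecord(JSON.parse(current)); // the record is complete
            current = "";
          }
        }
      }
    }
  };
}
```

In a progress handler, only the newly received part of responseText would be passed to `push`, for example `responseText.slice(seen)` with `seen` updated after each call.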

Of course this still builds a very long, useless responseText, and possibly a useless final JSON parse (even though I have no "load" listener?), but I can cope with this for now.
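
One possible way around the ever-growing responseText is a different technique from d3.json: read the body through the Fetch API's ReadableStream and keep only the unparsed tail in memory. This is a sketch under assumptions: the records are flat objects without nested braces or braces in strings (as in my case), the browser supports `fetch`, `response.body.getReader()` and `TextDecoder`, and `takeCompleteRecords` and `streamPoints` are hypothetical names:

```javascript
// Split a text buffer into the parsed complete records and the unparsed rest.
// Assumes flat records, so the last "}" always ends a complete record.
function takeCompleteRecords(buffer) {
  const i = buffer.indexOf("{");
  const j = buffer.lastIndexOf("}");
  if (i < 0 || j < i) return { records: [], rest: buffer };
  return {
    records: JSON.parse("[" + buffer.slice(i, j + 1) + "]"),
    rest: buffer.slice(j + 1)
  };
}

// Read the response as a stream and hand each batch of records to onRecords.
async function streamPoints(url, onRecords) {
  const response = await fetch(url);
  const reader = response.body.getReader();
  const decoder = new TextDecoder(); // UTF-8 by default
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact
    buffer += decoder.decode(value, { stream: true });
    const { records, rest } = takeCompleteRecords(buffer);
    buffer = rest; // only the incomplete tail is kept
    if (records.length > 0) onRecords(records);
  }
}
```

With this, something like `streamPoints('?json=qDefects&operationid=' + opid, draw)` would call draw once per batch, and no final parse of the whole document takes place.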

Dr. Goulu