
I have a C++ application with an embedded V8 engine, and I want to use V8 to transform data flexibly using JavaScript. The amount of data is potentially large and comes from different file formats, so it is processed one record at a time. How can I make the data available to V8 one record at a time?

I'm considering two options. The first is to expose the C++ record stream to JavaScript as objects via accessors, but I don't know how to return an ArrayBuffer from a C++ object.

The second is to create a new ArrayBuffer for each record using the V8 API and bind it to the same global variable each time, so that the scripts can access it.

What would be the best / most performant way to stream the data in / out?

Tobias Langner

1 Answer


It really depends; there are many options.

You could have a single, long-lived (Shared)ArrayBuffer shared between JavaScript and your embedding, plus some notification mechanism. That way you could even get concurrency: the JavaScript code could run in one thread, the rest of the embedder in another, and you can use Atomics to signal "please look at array indices x through y now", "okay, results in x through y are ready", etc. Under certain assumptions at least, that may well turn out to be the highest-performance approach.
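To make the signalling idea concrete, here is a minimal sketch in plain C++ with no V8 involved. A second thread stands in for the JavaScript side, and a `std::atomic` control word stands in for a slot in the SharedArrayBuffer that JavaScript would touch through `Atomics`. The names `SharedBlock`, `State`, `worker`, and `process_all`, the fixed capacity, and the "double every value" transform are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical protocol states stored in the shared control word.
enum State : int32_t { kIdle = 0, kInputReady = 1, kOutputReady = 2, kDone = 3 };

constexpr std::size_t kCapacity = 16;  // assumed fixed size of the shared data area

// Stand-in for the shared buffer: one control word plus a data area.
struct SharedBlock {
    std::atomic<int32_t> state{kIdle};
    std::size_t length = 0;          // number of valid entries in data
    int32_t data[kCapacity] = {};
};

// Stand-in for the JavaScript side: wait for input, transform in place, signal back.
void worker(SharedBlock& block) {
    for (;;) {
        int32_t s;
        while ((s = block.state.load(std::memory_order_acquire)) != kInputReady &&
               s != kDone) {
            std::this_thread::yield();  // simple polling; a real setup might block instead
        }
        if (s == kDone) return;
        for (std::size_t i = 0; i < block.length; ++i) block.data[i] *= 2;
        block.state.store(kOutputReady, std::memory_order_release);
    }
}

// Embedder side: publish one record at a time and collect the results.
std::vector<int32_t> process_all(const std::vector<std::vector<int32_t>>& records) {
    SharedBlock block;
    std::thread js_side(worker, std::ref(block));
    std::vector<int32_t> out;
    for (const auto& rec : records) {
        assert(rec.size() <= kCapacity);  // precondition of this sketch
        block.length = rec.size();
        for (std::size_t i = 0; i < rec.size(); ++i) block.data[i] = rec[i];
        block.state.store(kInputReady, std::memory_order_release);   // "look at the data now"
        while (block.state.load(std::memory_order_acquire) != kOutputReady) {
            std::this_thread::yield();                               // wait for "results ready"
        }
        out.insert(out.end(), block.data, block.data + block.length);
    }
    block.state.store(kDone, std::memory_order_release);
    js_side.join();
    return out;
}
```

The release/acquire pairs on `state` are what make the plain writes to `data` visible to the other side; the embedder only touches `data` while it owns the block, and the worker only between `kInputReady` and `kOutputReady`.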

You could also factor your JavaScript code such that you simply have a function that's called once for each record and returns the result. That would probably be the simplest approach, and might well be fast enough.

jmrk
  • The second approach (called once per record) is the one I'm aiming for. The question is: how do I get the records into V8 one after another? – Tobias Langner Apr 29 '20 at 12:52
  • When the JavaScript code defines a `function foo(x, y, z) {...}`, you can retrieve that function from your embedding code (read the global object's property named `foo`) and then call it with the appropriate arguments. You could use a global variable to write into from the outside, but since you have to call the function anyway, passing the data as an argument to the call seems like the cleanest design. The "process" sample does something similar: https://chromium.googlesource.com/v8/v8/+/master/samples/process.cc – jmrk Apr 29 '20 at 17:44