24

I am writing priority queues and octrees in the asm.js subset of Javascript in order to squeeze the last possible performance out of them.

However, how do you store references to Javascript objects in the asm.js function's heap buffer?

Right now, my structs in the heap have to have an integer ID for the Javascript object they are referencing, and I need a classic Javascript object to act as a dict between these ints and the Javascript objects.

For example, I have an asm.js octree which exposes an add function like add(x1,y1,z1,x2,y2,z2,object_id), where object_id is an integer. The find(x1,y1,z1,x2,y2,z2) function returns a list of all the object_ids that are within the bounds. This means I have to maintain a dictionary mapping object_ids to objects in Javascript, so that I can determine the actual objects that are in that box.

This feels wrong. The idea of casting an int to a string to do a lookup in the Javascript world is just wrong. One key point of writing inner-loop data-structures in asm.js is to avoid creating garbage.

(I am targeting Chrome as much as Firefox; I am hoping that asm.js strict code runs faster on both. Yes I will be profiling.)

No matter how many properties you can materialise into the asm.js heap - an object's position and dimensions, for example - you usually need to associate some Javascript objects with the item too; strings and webGL objects and DOM objects and so on.

Is there a better way for the asm.js heap to contain pointers to Javascript objects? And if using integer ID mappings, is it better to use arrays or objects-as-dictionaries, for example?

Esoteric Screen Name
Will
  • 1
    Would you post some code please? My brain works better when it's given some tangible food for thought. – Aadit M Shah Jul 13 '13 at 01:36
  • @AaditMShah code added – Will Jul 14 '13 at 09:14
  • 1
    Awesome, but I'll need a little more than that. Perhaps you could link me to a gist or a fiddle showing your complete code? I have a feeling you can solve this problem by emulating [Self-like](http://selflanguage.org/ "Welcome to Self — Self - the power of simplicity") objects (i.e. objects with message passing for getting/setting values). However I need to know the structure of your code to suggest how to implement it, including how you use the Foreign Function Interface (FFI) to interact with JavaScript proper. – Aadit M Shah Jul 14 '13 at 12:14
  • downvote: Your question is interesting, but the explanation is difficult to understand and a bit of code would have been useful. – Adrian Maire Jul 18 '13 at 12:49

4 Answers

15

After reading the asm.js specs several times over and experimenting with it in Firefox I agree with bks:

an asm.js program can only interact indirectly with external data via numeric handles

However this doesn't pose a major problem. Since asm.js is a subset of JavaScript you will not be able to use a lot of JavaScript constructs in asm.js, including:

  1. JavaScript Objects
  2. Dynamic Arrays
  3. Higher Order Functions

Nevertheless, asm.js does provide a way to call JavaScript functions using the Foreign Function Interface (FFI). This is a very powerful mechanism as it allows you to callback JavaScript from asm.js (allowing you to create procedures partially written in asm.js and partially written in JavaScript).
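As a rough illustration of the FFI route, here is a minimal sketch of a module that stores integer handles in its heap and calls back into JavaScript with them. The names `HandleDemo`, `notify`, `put` and `visit` are hypothetical and do not come from the question or the spec. The module is written to follow the asm.js rules as I understand them, but it has not been run through a validator; it executes as ordinary JavaScript either way.

```javascript
// Hypothetical module: all names here are illustrative only.
function HandleDemo(stdlib, foreign, heap) {
    "use asm";
    var notify = foreign.notify;            // imported JavaScript function (FFI)
    var ids = new stdlib.Int32Array(heap);  // integer handles stored in the heap

    // Store the handle `id` in 32-bit slot `slot`.
    function put(slot, id) {
        slot = slot | 0;
        id = id | 0;
        ids[(slot << 2) >> 2] = id;
    }

    // Call back into JavaScript with the handle stored in slot `slot`.
    function visit(slot) {
        slot = slot | 0;
        notify(ids[(slot << 2) >> 2] | 0);
    }

    return { put: put, visit: visit };
}

// JavaScript side: the objects never enter the heap, only their indexes do.
var objects = [{ name: "unit A" }, { name: "unit B" }];
var seen = [];
var demo = HandleDemo(
    { Int32Array: Int32Array },
    { notify: function (id) { seen.push(objects[id].name); } },
    new ArrayBuffer(0x10000)
);
demo.put(0, 1);   // slot 0 now refers to objects[1]
demo.visit(0);    // the callback receives handle 1
```

The callback still receives a numeric handle, so this does not remove the mapping; it only shows where the boundary between the two worlds sits.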

It's important to distinguish which parts of your code can be converted to asm.js and will benefit from using asm.js. For example, asm.js is great for graphics processing since it requires a lot of calculations. However, it's not suitable for string manipulation. Plain JavaScript would be better for that purpose.

Getting back to the topic, the problem you're facing is that you need to reference JavaScript objects from within asm.js code. Since the only way to do this is to use numeric handles (which you don't want), there's only one other solution I see:

Instead of referencing JavaScript objects from within asm.js, reference asm.js structures from within JavaScript.

There are many reasons why this method is better:

  1. Since JavaScript is a superset of asm.js you can already use asm.js structures in JavaScript as is.
  2. Since JavaScript is more powerful than asm.js it's easier to make asm.js structures behave like JavaScript objects.
  3. By importing asm.js structures into JavaScript your asm.js code becomes simpler, more cohesive and less tightly coupled.

Enough of talk, let's see an example. Let's take Dijkstra's shortest path algorithm. Luckily I already have a working demo (I had to implement Dijkstra's algorithm for a college assignment):

http://jsfiddle.net/3fsMn/

The code linked to above is fully implemented in plain old JavaScript. Let's take some parts of this code and convert it to asm.js (keeping in mind that the data structures will be implemented in asm.js and then exported to JavaScript).

To start with something concrete, this is the way I'm creating a graph in JavaScript:

var graph = new Graph(6)
    .addEdge(0, 1, 7)
    .addEdge(0, 2, 9)
    .addEdge(0, 3, 14)
    .addEdge(1, 2, 10)
    .addEdge(1, 4, 15)
    .addEdge(2, 3, 2)
    .addEdge(2, 4, 11)
    .addEdge(3, 5, 9)
    .addEdge(4, 5, 6);

We want to keep the same interface. Hence the first thing to modify is the Graph constructor. This is how it's currently implemented:

function Graph(v) {
    this.v = --v;

    var vertices = new Array(v);

    for (var i = 0, e; e = v - i; i++) {
        var edges = new Array(e);
        for (var j = 0; j < e; j++)
            edges[j] = Infinity;
        vertices[i] = edges;
    }

    this.vertices = vertices;
}

I won't bother explaining all the code in depth, but a general understanding is required:

  1. The first thing to note is that for a graph consisting of 4 vertices I only create an array of 3 vertices. The last vertex is not required.
  2. Next, for each vertex I create a new array (representing the edges) between two vertices. For a graph with 4 vertices:
    1. The first vertex has 3 edges.
    2. The second vertex has 2 new edges.
    3. The third vertex has 1 new edge.
    4. The fourth vertex has 0 new edges (which is the reason we only need an array of 3 vertices).

In general an undirected graph of n vertices has at most n * (n - 1) / 2 edges, one for each pair of distinct vertices. So we can represent the graph in a tabular format as follows (the table below is for the graph in the demo above):

+-----+-----+-----+-----+-----+-----+
|     |  f  |  e  |  d  |  c  |  b  |
+-----+-----+-----+-----+-----+-----+
|  a  |     |     |  14 |  9  |  7  |
+-----+-----+-----+-----+-----+-----+
|  b  |     |  15 |     |  10 |
+-----+-----+-----+-----+-----+
|  c  |     |  11 |  2  |
+-----+-----+-----+-----+
|  d  |  9  |     |
+-----+-----+-----+
|  e  |  6  |
+-----+-----+
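The triangular table above can be flattened into a single typed array. The sketch below shows one possible mapping from a pair of vertices to a flat index. The helper `edgeIndex` is hypothetical, and the column order (highest vertex first, as in the table) is an assumption read off the table rather than taken from the linked demo.

```javascript
// Hypothetical helper (not from the original demo): flat index of the
// edge between vertices i and j in a graph of n vertices.
function edgeIndex(n, i, j) {
    if (i === j) throw new RangeError("no self-edges are stored");
    if (i > j) { var t = i; i = j; j = t; }     // each edge is stored once, with i < j
    var v = n - 1;                              // row 0 has v cells, row 1 has v - 1, ...
    var rowOffset = i * v - (i * (i - 1)) / 2;  // cells used by rows 0 .. i-1
    return rowOffset + (v - j);                 // columns run from vertex n-1 down to i+1
}

var n = 6;
var weights = new Float64Array(n * (n - 1) / 2).fill(Infinity);
weights[edgeIndex(n, 0, 1)] = 7;   // edge a-b from the table
weights[edgeIndex(n, 4, 5)] = 6;   // edge e-f from the table
```

With this layout the whole table occupies n * (n - 1) / 2 doubles, which is why a flat heap can hold it.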

This is the data structure we need to implement in the asm.js module. Now that we know what it looks like let's get down to implementing it:

var Graph = (function (constant) {
    function Graph(stdlib, foreign, heap) { /* asm.js module implementation */ }

    return function (v) {
        this.v = --v;
        var heap = new ArrayBuffer(4096);
        var doubleArray = this.doubleArray = new Float64Array(heap);
        var graph = this.graph = Graph(window, {}, heap);
        graph.init(v);

        var vertices = { length: v };

        for (var i = 0, index = 0, e; e = v - i; i++) {
            var edges = { length: e };

            for (var j = 0; j < e; j++) Object.defineProperty(edges, j, {
                get: element(index++)
            });

            Object.defineProperty(vertices, i, {
                get: constant(edges)
            });
        }

        this.vertices = vertices;

        function element(i) {
            return function () {
                return doubleArray[i];
            };
        }
    };
}(constant));

As you can see our Graph constructor has become a lot more complicated. In addition to v and vertices we have two new public properties, doubleArray and graph, which are required to expose the data structure and the data operations from the asm.js module respectively.

The vertices property in particular is now implemented as an object instead of an array, and it uses getters to expose the asm.js data structure. This is how we reference asm.js data structures from within JavaScript.

The heap is simply an ArrayBuffer and it can be operated on by either asm.js code or plain old JavaScript. This allows you to share data structures between asm.js code and JavaScript. On the JavaScript side you can wrap up this data structure in an object and use getters and setters to dynamically update the heap. In my humble opinion this is better than using numeric handles.
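To make the getter and setter idea concrete, here is a minimal sketch of a JavaScript object whose numeric fields live in the shared heap, while a non-numeric reference stays on the JavaScript side. The record layout (three consecutive doubles per item) and the names `makeItemView` and `payload` are assumptions made up for this example.

```javascript
// Hypothetical record layout: three consecutive doubles (x, y, z) per item.
var heap = new ArrayBuffer(0x10000);
var doubles = new Float64Array(heap);

function makeItemView(slot, payload) {
    var base = slot * 3;               // index of this item's first double
    var view = { payload: payload };   // ordinary JS reference stays in JS
    ["x", "y", "z"].forEach(function (name, k) {
        Object.defineProperty(view, name, {
            get: function () { return doubles[base + k]; },
            set: function (value) { doubles[base + k] = value; },
            enumerable: true
        });
    });
    return view;
}

var unit = makeItemView(2, { label: "GameUnit #2" });
unit.x = 1.5;   // writes doubles[6]
unit.z = -4;    // writes doubles[8]
```

Note that each view allocates closures, so this trades some garbage for convenience; whether that is acceptable in an inner loop would need profiling.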

Conclusion: Since I've already answered your question and demonstrated how to import asm.js data structures into JavaScript I would conclude that this answer is complete. Nevertheless I would like to leave behind a working demo as a proof of concept. However this answer is already getting too big. I'll write a blog post on this topic and post a link to it here as soon as possible.

JSFiddle for Dijkstra's shortest path algorithm implemented in asm.js coming up soon.

Aadit M Shah
  • This is a very nice answer. But you can't always invert who stores what. None of the example data structures given in the question seem amenable to being so inverted, it seems? – Will Jul 20 '13 at 06:51
  • @Will I'm still learning asm.js so I can't really comment on whether every data structure can be implemented in this way, but I yet have to see a data structure which can't be inverted. Hence there's still a possibility that the data structures mentioned in the question may be invertible. However I would need to see the actual code to come to a definitive conclusion. – Aadit M Shah Jul 20 '13 at 07:24
  • Imagine that the objects in your octree have VBOs. How would you store that object reference in your asm.js heap? – Will Jul 20 '13 at 18:55
  • @Will I have to brush up on my octrees and VBOs. Post some code perhaps? That would go a long way in helping me understand your problem. – Aadit M Shah Jul 21 '13 at 02:59
  • The problem is how to most efficiently reference Javascript objects from inside asm.js functions. – Will Jul 21 '13 at 21:44
  • @AaditMShah Are you gonna post that link? – trusktr Jan 06 '16 at 08:14
  • 1
    @trusktr It's been a while since I wrote a blog post. I'll start writing again and I'll start with this proof of concept and post a link to it. Give me one week. – Aadit M Shah Jan 06 '16 at 14:40
  • @trusktr Unfortunately, not yet. I've been at [POPL 2016](http://conf.researchr.org/home/POPL-2016) all week and I didn't get much time to write anything. – Aadit M Shah Jan 22 '16 at 17:50
  • Would be sweeeeeet to see that demo!! :D – trusktr May 10 '17 at 23:33
9

As I read the asm.js spec at http://asmjs.org/spec/latest/ and the FAQ at http://asmjs.org/faq.html, the short answer is that you can't store JS object references in the asmjs heap. Quoting from the FAQ:

Q. Can asm.js serve as a VM for managed languages, like the JVM or CLR?

A. Right now, asm.js has no direct access to garbage-collected data; an asm.js program can only interact indirectly with external data via numeric handles. In future versions we intend to introduce garbage collection and structured data based on the ES6 structured binary data API, which will make asm.js an even better target for managed languages.

So your current method of storing an external id-to-object map seems to be the current recommended way to solve your problem, as long as you care about the object instances rather than just their contents. Otherwise, I think the idea is that you dematerialize the stored objects: store the complete contents of each object at its slot in the priority queue, and turn it back into a true JS object only when it is fetched. But that only works if your objects are safe to recreate on demand.
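The dematerializing approach might look like the following sketch. The two-field layout and the names `dematerialize`, `rematerialize`, `priority` and `hitPoints` are hypothetical, chosen only to illustrate the idea.

```javascript
// Hypothetical layout: each slot holds two doubles, [priority, hitPoints].
var FIELDS = 2;
var store = new Float64Array(new ArrayBuffer(0x10000));

// Copy the object's contents into the heap slot.
function dematerialize(slot, item) {
    store[slot * FIELDS] = item.priority;
    store[slot * FIELDS + 1] = item.hitPoints;
}

// Build a fresh object from the slot: identity is lost, only contents survive.
function rematerialize(slot) {
    return {
        priority: store[slot * FIELDS],
        hitPoints: store[slot * FIELDS + 1]
    };
}

var original = { priority: 3, hitPoints: 80 };
dematerialize(0, original);
var copy = rematerialize(0);   // same contents, different object
```

Because every fetch allocates a new object, this only suits items that are plain numeric data and cheap to recreate.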

Community
bks
  • So is using integer indices as object properties the best way to do the mapping? How does the performance of arrays compare? – Will Jul 14 '13 at 19:58
  • @Will I think Arrays also use strings as object properties. For example, `array[0]` is the same as `array["0"]`, and it converts the integer into a string to access the array item. It's exactly like an Object (Array extends from Object). I don't think using an Array can get you around that unless the JS engine implementation optimizes for that, but that isn't something that is spec (that I know of). – trusktr May 10 '17 at 23:29
4

This feels wrong. The idea of casting an int to a string to do a lookup in the Javascript world is just wrong. One key point of writing inner-loop data-structures in asm.js is to avoid creating garbage.

There is no need to cast an int to a string here. You should have a JS array that maps indexes to JS objects; indexing it with an integer should then be optimized by JS engines into a direct use of that integer. They will know when the lookup table is an array, and when the values flowing in are integers.

This is how emscripten (in both asm.js output mode and non-asm.js output mode) handles things like function pointers. You have an integer ID, and there is a JS array mapping those IDs to the relevant objects. For example,

var FUNCTION_TABLE = [function zero() {}, function one() {}];

later called with

FUNCTION_TABLE[i]();

It is important to keep the array properly optimized, which basically means starting its values at 0 and not having holes. Otherwise, it can be implemented as a dictionary instead of a fast flat list.
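One way to keep such an array dense while still allowing removals is to recycle freed indexes. The sketch below is a hypothetical helper, not part of emscripten or asm.js; `HandleTable` and its methods are names invented for this example.

```javascript
// Hypothetical helper: maps integer handles to JS objects.
function HandleTable() {
    this.objects = [];   // dense array: handle -> object
    this.free = [];      // handles that were released and can be reused
}

// Store an object and return its integer handle.
HandleTable.prototype.add = function (object) {
    var id = this.free.length > 0 ? this.free.pop() : this.objects.length;
    this.objects[id] = object;
    return id;
};

// Look up the object for a handle.
HandleTable.prototype.get = function (id) {
    return this.objects[id];
};

// Release a handle so that a later add() can reuse it.
HandleTable.prototype.remove = function (id) {
    this.objects[id] = null;   // overwrite instead of `delete`, so no hole appears
    this.free.push(id);
};

var table = new HandleTable();
var a = table.add({ name: "a" });   // handle 0
var b = table.add({ name: "b" });   // handle 1
table.remove(a);
var c = table.add({ name: "c" });   // reuses handle 0
```

The handles returned by `add` are what the asm.js side would store, for example as the object_id passed to the octree. Whether this is actually faster than an object-as-dictionary should be confirmed by profiling in the target engines.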

Alon Zakai
  • You said "indexing it with an integer should be optimized in JS engines", but I don't think that is guaranteed in all engines. Do you know which engines actually make that optimization? – trusktr May 10 '17 at 23:34
0

I probably haven't fully understood your question, but it may be possible to use the priority queue from the C++ standard library and compile it with emscripten to create an asm.js JavaScript module.

For example, the following code:

#include <queue>
#include <iostream>

class MyClass {
    private:
        int priority;
        int someData;
    public:
        MyClass():priority(0), someData(0){}
        MyClass(int priority, int data):priority(priority), someData(data){}
        int getPriority() const { return this->priority;}
        int getData() const { return this->someData;}
        void setData(int data){ this->someData = data;}
        inline bool operator<(const MyClass & other) const{
            return this->getPriority() < other.getPriority();
        }
};

int main(){

    std::priority_queue<MyClass> q;
    q.push(MyClass(50, 500));
    q.push(MyClass(25, 250));
    q.push(MyClass(75, 750));
    q.push(MyClass(10, 100));

    std::cout << "Popping elements: " << std::endl;
    while(!q.empty()){
        std::cout << q.top().getData() << std::endl;
        q.pop();
    }
    std::cout << "Queue empty" << std::endl;

    return 0;
}

Compiled like:

emcc queue.cpp -s ASM_JS=1 -O2 -o queue.js

It can then be executed with nodejs, producing the following output:

$ nodejs queue.js 
Popping elements: 
750
500
250
100
Queue empty

It can also be compiled to an HTML file and loaded in the browser:

$ emcc queue.cpp -s ASM_JS=1 -O2 -o queue.html

I don't know if this is an option for you, but writing asm.js code manually is quite complex.

Javier Mr
  • How would this contain Javascript objects - rather than ints - in the priority queue? – Will Jul 12 '13 at 21:55
  • Rather than designing JS objects in pure JavaScript, design them as C++ classes (it also works with C structures). In the code above the priority queue orders instances of the class `MyClass`, not just ints. The ordering function is the C++ default, `less`, which is why I have overloaded the < operator with the function `operator<`. The data in the current implementation contains just two ints, but it could be as complex as a C++ class. – Javier Mr Jul 12 '13 at 23:19
  • I'm not trying to port a C++ game. I'm trying to speed up a Javascript game by putting the hottest datastructures into asm.js. But ultimately I still want my octree to contain things that exist in the Javascript world e.g. GameUnit types. – Will Jul 13 '13 at 13:51
  • Hi, in this two links you can find more information about how that can be done, and as I understand the main purpose for asmjs is exactly that. Here is an example on how to [interact with asm code from regular javascript](https://github.com/kripken/emscripten/wiki/Interacting-with-code) and [a simpler example](http://kripken.github.io/mloc_emscripten_talk/qcon.html#/20). But, as seen on the second link, asmjs is intended as a compile target, not to be hand written. Any way, I don't think you can store a regular JS object in the asmjs heap, because asmjs has to know the type of the fields. – Javier Mr Jul 13 '13 at 15:35
  • @Will please provide some code so that I may better understand what you're trying to do. It's really not so difficult. Simply copy and paste. BTW you are allowed to edit SO questions. – Aadit M Shah Jul 14 '13 at 02:01
  • dont know why you got downvoted. The purpose of ASM.JS is not to be written directly in Javascript. It's to get C/C++ compiled into efficient javascript code. Hand coding ASM.JS is like writing pure ASM. – mpm Jul 14 '13 at 17:18
  • @mpm Exactly! Will: are the objects that you want to store mutable or not? I mean, do they always have the same attributes (fields of the JS object map), or can they have an unknown number/names of fields? If they are fixed, those objects could be implemented as C++ classes, with no need for 'references'. – Javier Mr Jul 17 '13 at 09:00
  • I have no trouble writing asm.js by hand. The question is solely relating to how to store references to javascript objects in the asm.js heap. No matter how much of the attributes of the objects you can store in the asm.js heap rather than in javascript objects, there are always things you can't e.g. strings and webgl and dom references. It seems that the integer ID mapping is the only known way forward here - see other answers - and now the question is how to most efficiently do that mapping - dictionaries or arrays? – Will Jul 19 '13 at 11:55
  • I think another interesting point is how the memory should be managed. If an integer-indexed array is used, then when deleting an object you should record where the 'hole' in the array is so it can be used by the next insertion; otherwise the array would grow forever. I **haven't done** any performance testing, but I would go with the array for performance and with the map for easy management. – Javier Mr Jul 19 '13 at 14:12