
I'm using the Gremlin JavaScript client (3.4.2) to query a JanusGraph instance through Gremlin Server. After the server has been running for a while during development, some of the requests that query the graph start to stay pending for a long time (as long as the timeout configured on Gremlin Server).

Looking at the server console I can see this message:

 Pausing response writing as writeBufferHighWaterMark exceeded on RequestMessage{, requestId=9697c61a-34de-4764-a8c4-72d7f7a154ac, op='bytecode', processor='traversal', args={gremlin=[[], [V(), has(User, identityid, NQ05cGsB3uhLm-BIAGPq), outE(worked), choose([[], [values(dateto)]], [[], [project(vertexId, role, businessId, business, relationId, dateinsert, datefrom, dateto), by([[], [inV(), hasLabel(Job), id()]]), by([[], [inV(), hasLabel(Job), values(role)]]), by([[], [inV(), hasLabel(Job), outE(), hasLabel(at), inV(), id()]]), by([[], [inV(), hasLabel(Job), outE(), hasLabel(at), inV(), values(name)]]), by(id), by(dateinsert), by(datefrom), by(dateto)]], [[], [project(vertexId, role, businessId, business, relationId, dateinsert, datefrom), by([[], [inV(), hasLabel(Job), id()]]), by([[], [inV(), hasLabel(Job), values(role)]]), by([[], [inV(), hasLabel(Job), outE(), hasLabel(at), inV(), id()]]), by([[], [inV(), hasLabel(Job), outE(), hasLabel(at), inV(), values(name)]]), by(id), by(dateinsert), by(datefrom)]])]], aliases={g=refeenet}}} - writing will continue once client has caught up

Then, after a while, the timeout WARNING appears.

I'm not sure why this happens, and I didn't find anything useful while searching for a solution.

I'm using TypeScript.

This is the class that manages the traversal source:

import gremlin from "gremlin";

const GREMLIN_URL = "ws://localhost:8182/gremlin";
const GRAPH_NAME = "maingraph";

const { Graph } = gremlin.structure;
const { DriverRemoteConnection } = gremlin.driver;

export class GremlinApi {

    public static g: gremlin.process.GraphTraversalSource;
    public static connection: gremlin.driver.DriverRemoteConnection;

    public static getTraversalSource() {
        if (!GremlinApi.g) {
            const graph = new Graph();
            GremlinApi.connection = new DriverRemoteConnection(GREMLIN_URL, { traversalSource: GRAPH_NAME });
            GremlinApi.g = graph.traversal().withRemote(GremlinApi.connection);
        }
        return GremlinApi.g;
    }

    public static closeTraversal() {
        if (GremlinApi.connection && GremlinApi.connection.close) {
            GremlinApi.connection.close();
            GremlinApi.connection = null;
            GremlinApi.g = null;
        }
    }

}

And this is an example of how I usually use the traversal:


import { GremlinApi } from "../db/gremlinApi/gremlinApi";

// Get the traversal
const g = GremlinApi.getTraversalSource();
// Do something with the traversal
GremlinApi.closeTraversal();

This usually happens when 3 or 4 queries are issued very close to each other. Some of them time out.

Any idea what could be causing this problem?

Pistacchio
  • I have now tried spacing the queries 200 ms apart, and it works. Could it be that I'm using the traversal in the wrong way? Should I drop the static traversal source and open a new connection for each query? – Pistacchio Jun 19 '19 at 17:52

1 Answer


That message generally means that the client isn't keeping up with what the server is trying to write, so the server pauses its response until the client catches up with what is in the buffer. I suppose a slow client could be causing the trouble, or perhaps network issues, but in any case the point of pausing is to keep the server from running out of memory, which could happen if it kept buffering results. Given that context, you can see why spacing your queries helps, as you alluded to in the comment on your question.
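One way to space queries without a fixed delay is to wait for each one to finish before sending the next. The following is only a minimal sketch of that idea: `SerialQueue` is a hypothetical helper, not part of the gremlin package, and whether serializing requests actually avoids the pause in your setup is an assumption you would need to verify.

```typescript
// Hypothetical helper (not part of the gremlin package): runs async tasks
// strictly one after another, in the order they were submitted.
class SerialQueue {
    // Settles when the most recently queued task has finished.
    private tail: Promise<unknown> = Promise.resolve();

    public run<T>(task: () => Promise<T>): Promise<T> {
        // Start this task only after the previous one has settled.
        const result = this.tail.then(() => task());
        // Swallow the rejection on the internal chain so that one failed
        // task does not block the tasks queued after it. The caller still
        // sees the rejection through the returned promise.
        this.tail = result.catch(() => undefined);
        return result;
    }
}
```

With the traversal source from the question, each query would be wrapped in a function, for example `queue.run(() => g.V().hasLabel("User").toList())`, so that the request is not sent until the previous response has been fully read.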

You can fine-tune the "watermarks" in the Gremlin Server YAML file: writeBufferHighWaterMark and writeBufferLowWaterMark. You could also look at the traversals you're sending. You don't specify what those are, but you should see whether you can slim down or eliminate the results being returned. I typically see users hit this issue on data loads, and many times it can be eliminated by simply ignoring the return result: call iterate() at the end of the traversal rather than next(), toList(), etc., which require the return of an actual result that isn't being used on the client side anyway.
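To make the difference between the terminal steps concrete, here is a minimal sketch. The `TerminalSteps` interface and the two helper functions are hypothetical names introduced for illustration; the interface only assumes that a traversal exposes `iterate()` and `toList()` returning promises, as the JavaScript driver's traversals do.

```typescript
// Assumed shape of the two terminal steps discussed above. Declared locally
// so the sketch does not depend on the driver's own type definitions.
interface TerminalSteps<T> {
    iterate(): Promise<unknown>;
    toList(): Promise<T[]>;
}

// For traversals whose result the client never reads (for example data
// loads): iterate() runs the traversal without returning its results.
async function runAndDiscard<T>(traversal: TerminalSteps<T>): Promise<void> {
    await traversal.iterate();
}

// For traversals whose result is actually needed on the client:
// toList() asks the server to send every result back.
async function runAndCollect<T>(traversal: TerminalSteps<T>): Promise<T[]> {
    return traversal.toList();
}
```

A write such as `g.addV("Job").property("role", "dev")` would go through `runAndDiscard`, while a query like the `project(...)` traversal in the question needs its results and would go through `runAndCollect`.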

stephen mallette