
We are using Titan 1.0.0 with Cassandra version 3.9.0-1, from datastax-ddc on CentOs-7 system. We are seeing some strange issues, like:

  • Multiple edges with the same edge id; a few properties on these edges differ in value.

    g.V().has('msid', 6171699).outE('prio_child').has('hostid_e', 153).as('e').inV().has('msid', 58713376).select('e')
    ==>e[54ekdatm-1lezwb4-45cl-195s9km8][3471761488-prio_child->98305011872]
    ==>e[54ekdatm-1lezwb4-45cl-195s9km8][3471761488-prio_child->98305011872]

  • Getting more results after applying more restrictions:

    g.V().has('msid', 6171699).outE('prio_child').count()
    ==>60

    g.V().has('msid', 6171699).outE('prio_child').has('hostid_e', 153).count()
    ==>66

I have even tried setting ConsistencyModifier.LOCK, as suggested in the Titan documentation's section on eventually consistent backends, but it has not helped; I am still getting arbitrary results.

Misha Brukman
Pankaj Yadav

2 Answers


Titan 1.0.0 is not compatible with Cassandra 3.x; see the version compatibility matrix: http://s3.thinkaurelius.com/docs/titan/1.0.0/version-compat.html

Titan is also no longer maintained. JanusGraph (http://janusgraph.org/) has picked up where Titan left off and is actively updated and maintained.

pantalohnes

I was able to reproduce and fix it by following the Data Consistency documentation. I was missing the following command after setting the ConsistencyModifier:

mgmt.commit()

Following is the piece of code which reproduced the problem with both versions of Cassandra, i.e. Cassandra 2.1.x and Cassandra 3.9.x:


    TitanGraph graph = TitanFactory.open("/opt/cmsgraph/config/edgepoc.conf");
    try {
        int parent = -2128958273;
        int child = 58541705;
        int hostid = 83;
        int numThreads = 100;
        // All threads update properties of the same edge concurrently.
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            threads[i] = new Thread(new EdgeUpdator(graph, parent, child, hostid));
        }
        for (int i = 0; i < numThreads; i++) {
            threads[i].start();
        }
        for (int i = 0; i < numThreads; i++) {
            threads[i].join();
        }
    } finally {
        graph.close();
    }

    private static class EdgeUpdator implements Runnable {
        private final TitanGraph graph;
        private final int parent;
        private final int child;
        private final int hostid;

        public EdgeUpdator(TitanGraph graph, int parent, int child, int hostid) {
            this.graph = graph;
            this.parent = parent;
            this.child = child;
            this.hostid = hostid;
        }

        public void run() {
            // Each thread opens its own transaction and updates the same edge.
            TitanTransaction trxn = graph.newTransaction();
            GraphTraversalSource g = trxn.traversal();
            Edge edge = (Edge) g.V().has("msid", parent).outE("prio_child").has("hostid_e", hostid)
                    .as("e").inV().has("msid", child).select("e").next();
            Random random = new Random(System.nanoTime());
            edge.property("updatedAt_e", random.nextLong());
            edge.property("plrank", random.nextInt());
            trxn.commit();
        }
    }
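The reproducer above is, at bottom, a concurrent read-modify-write against a single edge with no coordination, which is exactly the setting where lost or duplicated updates appear on an eventually consistent backend. The same hazard can be shown in plain Java (a hypothetical demo, nothing Titan-specific):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RaceDemo {
    static int unsafeCounter = 0;                                 // plain read-modify-write, no coordination
    static final AtomicInteger safeCounter = new AtomicInteger(); // atomic update

    public static void main(String[] args) throws InterruptedException {
        int numThreads = 100;
        int increments = 10_000;
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < increments; j++) {
                    unsafeCounter++;               // lost updates possible
                    safeCounter.incrementAndGet(); // always correct
                }
            });
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        System.out.println("safe   = " + safeCounter.get()); // always 1,000,000
        System.out.println("unsafe = " + unsafeCounter);     // typically less: updates were lost
    }
}
```

The unsynchronized counter loses updates for the same reason unsynchronized edge writes can diverge: two writers read the same old state and both write back on top of it.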

Before executing the above code, I see:


    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).as('e').inV().has('msid', 58541705).select('e')
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).as('e').inV().has('msid', 58541705).select('e').count()
    ==>1
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).count()
    ==>104

After executing the code, I see:


    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).as('e').inV().has('msid', 58541705).select('e')
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    ==>e[239suvpz-17ofqw-41ed-9eutzq8][73363640-prio_child->20489355296]
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).as('e').inV().has('msid', 58541705).select('e').count()
    ==>10
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).as('e').inV().has('msid', 58541705).select('e').dedup().count()
    ==>1
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').has('hostid_e', 83).count()
    ==>113
    gremlin> g.V().has('msid', -2128958273).outE('prio_child').count()
    ==>104

After applying ConsistencyModifier.LOCK to the "prio_child" edge, I observed that 9 of the 10 threads failed with the following exception, and it no longer produced any multiple edges with the same edge id:

Exception in thread "Thread-8" org.apache.tinkerpop.gremlin.process.traversal.util.FastNoSuchElementException
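Such failures are expected under contention: with locking enabled, conflicting writers fail instead of silently corrupting data, and the usual remedy is to retry the failed transaction. A minimal sketch of that pattern in plain Java (a generic helper of my own, not a Titan API; the transaction body from EdgeUpdator.run() would go inside the Callable):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryDemo {
    // Retry a unit of work up to maxAttempts times, backing off between tries.
    static <T> T withRetry(int maxAttempts, Callable<T> work) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                last = e; // e.g. a lock-contention failure; wait and try again
                Thread.sleep(10L * attempt);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger attempts = new AtomicInteger();
        // Simulated transaction that fails twice before succeeding.
        int result = withRetry(5, () -> {
            if (attempts.incrementAndGet() < 3) {
                throw new IllegalStateException("simulated lock contention");
            }
            return attempts.get();
        });
        System.out.println(result); // prints 3: succeeded on the third attempt
    }
}
```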

Following are the exact changes I made:


    mgmt = graph.openManagement()
    prio_child=mgmt.getRelationType('prio_child')
    mgmt.setConsistency(prio_child, ConsistencyModifier.LOCK)
    mgmt.commit()

Pankaj Yadav
  • Even this has not helped and the problem still exists; **ConsistencyModifier.LOCK** is **NOT** working as expected. After digging into the issue further, I found that it is a case of vertex-centric index corruption. After dropping all the custom vertex-centric indices (Titan also creates a default vertex-centric index for every edge property automatically, and those work fine), the problem was resolved. – Pankaj Yadav Jun 02 '17 at 06:05
  • Opened an [issue](https://github.com/JanusGraph/janusgraph/issues/301) with JanusGraph. Basically the vertex-centric indices are getting corrupted here; if the index doesn't come into the picture, correct results are returned. – Pankaj Yadav Jun 07 '17 at 13:18