3

We are using Titan Distributed Graph Database in one of our services. We have a working configuration with CassandraDB as a storage backend.

Now we are working on running the service with AWS DynamoDB storage backend. If we run it in local mode using https://github.com/awslabs/dynamodb-titan-storage-backend everything works fine. However, after moving to AWS DynamoDB in the cloud, we observe some indeterministic issues.

We have our unit tests which run on the empty db, generate tests data, perform some CRUD's, and then clean everything up. The tests pass on Cassandra. They pass on local DynamoDB as well. Unfortunately they tend to fail on local DynamoDB and on AWS DynamoDB in the cloud. Usually it takes a few runs before a failure occurs. After such a failure we can find some ghost vertices in the database. It looks like index is out-of-sync with the graph. In the result, we can see different results while querying the graph using the index and not using it.

There is also an exception in logs. The most interesting one is: com.amazon.titan.diskstorage.dynamodb.BackendRuntimeException: was interrupted during parallelMutate:


    13:19:12 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:12 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:13 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    13:19:13 [ERROR] [c.h.c.c.h.HystrixDiagnosticLogger] [] HYSTRIX: command category-v1.createCategory failed (com.hybris.caas.category.command.CreateCategoryCommand@7a77b317); execution events: [TIMEOUT, FALLBACK_MISSING]
    java.util.concurrent.TimeoutException: null
        at com.netflix.hystrix.AbstractCommand$9.call(AbstractCommand.java:600) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$9.call(AbstractCommand.java:580) ~[hystrix-core-1.5.1.jar:1.5.1]
        at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:139) ~[rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71) [rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$1.run(AbstractCommand.java:955) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:41) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:37) [hystrix-core-1.5.1.jar:1.5.1]
        at com.sap.cloud.yaas.servicesdk.logging.DelegatingMDCCallable.call(DelegatingMDCCallable.java:58) [service-sdk-logging-4.7.0.jar:4.7.0]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable.run(HystrixContextRunnable.java:57) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$2.tick(AbstractCommand.java:972) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.util.HystrixTimer$1.run(HystrixTimer.java:99) [hystrix-core-1.5.1.jar:1.5.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_102]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_102]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
    13:19:13 [ERROR] [c.t.t.g.database.StandardTitanGraph] [] Could not commit transaction [9] due to storage exception in commit
    com.thinkaurelius.titan.core.TitanException: Could not execute operation due to backend exception
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:44) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:141) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:198) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:118) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.commit(StandardTitanGraph.java:718) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:1352) [titan-core-1.0.0.jar:na]
        at com.hybris.caas.category.titan.TitanTransactionManager.doCommit(TitanTransactionManager.java:126) [classes/:na]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:761) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:730) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:485) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:291) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:208) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at com.sun.proxy.$Proxy41.createCategory(Unknown Source) [na:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:81) [classes/:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:35) [classes/:na]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:293) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:288) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:521) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:497) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) [hystrix-core-1.5.1.jar:1.5.1]
        at com.sap.cloud.yaas.servicesdk.logging.DelegatingMDCCallable.call(DelegatingMDCCallable.java:58) [service-sdk-logging-4.7.0.jar:4.7.0]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) [rxjava-1.1.1.jar:1.1.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
    Caused by: com.thinkaurelius.titan.diskstorage.PermanentBackendException: Permanent exception while executing backend operation CacheMutation
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42) ~[titan-core-1.0.0.jar:na]
        ... 40 common frames omitted
    Caused by: com.amazon.titan.diskstorage.dynamodb.BackendRuntimeException: was interrupted during parallelMutate
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBDelegate.parallelMutate(DynamoDBDelegate.java:301) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager.mutateMany(DynamoDBStoreManager.java:194) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:90) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:56) ~[titan-core-1.0.0.jar:na]
        ... 41 common frames omitted
    13:19:13 [INFO ] [c.h.c.c.c.w.m.TimeoutExceptionMapper] [] Mapping TimeoutException
    java.util.concurrent.TimeoutException: null
        at com.netflix.hystrix.AbstractCommand$9.call(AbstractCommand.java:600) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$9.call(AbstractCommand.java:580) ~[hystrix-core-1.5.1.jar:1.5.1]
        at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:139) ~[rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71) ~[rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71) ~[rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$1.run(AbstractCommand.java:955) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:41) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:37) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.sap.cloud.yaas.servicesdk.logging.DelegatingMDCCallable.call(DelegatingMDCCallable.java:58) ~[service-sdk-logging-4.7.0.jar:4.7.0]
        at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable.run(HystrixContextRunnable.java:57) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$2.tick(AbstractCommand.java:972) ~[hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.util.HystrixTimer$1.run(HystrixTimer.java:99) ~[hystrix-core-1.5.1.jar:1.5.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_102]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[na:1.8.0_102]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_102]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_102]
    13:19:13 [ERROR] [c.t.t.g.database.StandardTitanGraph] [] Could not commit transaction [9] due to exception
    com.thinkaurelius.titan.core.TitanException: Could not execute operation due to backend exception
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:44) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:141) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:198) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:118) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.commit(StandardTitanGraph.java:718) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:1352) [titan-core-1.0.0.jar:na]
        at com.hybris.caas.category.titan.TitanTransactionManager.doCommit(TitanTransactionManager.java:126) [classes/:na]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:761) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:730) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:485) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:291) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:208) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at com.sun.proxy.$Proxy41.createCategory(Unknown Source) [na:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:81) [classes/:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:35) [classes/:na]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:293) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:288) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:521) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:497) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) [hystrix-core-1.5.1.jar:1.5.1]
        at com.sap.cloud.yaas.servicesdk.logging.DelegatingMDCCallable.call(DelegatingMDCCallable.java:58) [service-sdk-logging-4.7.0.jar:4.7.0]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) [rxjava-1.1.1.jar:1.1.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
    Caused by: com.thinkaurelius.titan.diskstorage.PermanentBackendException: Permanent exception while executing backend operation CacheMutation
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42) ~[titan-core-1.0.0.jar:na]
        ... 40 common frames omitted
    Caused by: com.amazon.titan.diskstorage.dynamodb.BackendRuntimeException: was interrupted during parallelMutate
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBDelegate.parallelMutate(DynamoDBDelegate.java:301) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager.mutateMany(DynamoDBStoreManager.java:194) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:90) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:56) ~[titan-core-1.0.0.jar:na]
        ... 41 common frames omitted
    13:19:13 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doRollback()
    13:19:13 [ERROR] [c.h.c.c.t.TitanTransactionManager] [] Commit exception overridden by rollback exception
    com.thinkaurelius.titan.core.TitanException: Could not commit transaction due to exception during persistence
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:1363) ~[titan-core-1.0.0.jar:na]
        at com.hybris.caas.category.titan.TitanTransactionManager.doCommit(TitanTransactionManager.java:126) ~[classes/:na]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:761) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:730) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:485) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:291) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) [spring-tx-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:208) [spring-aop-4.2.5.RELEASE.jar:4.2.5.RELEASE]
        at com.sun.proxy.$Proxy41.createCategory(Unknown Source) [na:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:81) [classes/:na]
        at com.hybris.caas.category.command.CreateCategoryCommand.run(CreateCategoryCommand.java:35) [classes/:na]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:293) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.HystrixCommand$1.call(HystrixCommand.java:288) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:162) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable$2.call(Observable.java:154) [rxjava-1.1.1.jar:1.1.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:521) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.AbstractCommand$5.call(AbstractCommand.java:497) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.Observable.unsafeSubscribe(Observable.java:8314) [rxjava-1.1.1.jar:1.1.1]
        at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94) [rxjava-1.1.1.jar:1.1.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) [hystrix-core-1.5.1.jar:1.5.1]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) [hystrix-core-1.5.1.jar:1.5.1]
        at com.sap.cloud.yaas.servicesdk.logging.DelegatingMDCCallable.call(DelegatingMDCCallable.java:58) [service-sdk-logging-4.7.0.jar:4.7.0]
        at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) [hystrix-core-1.5.1.jar:1.5.1]
        at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) [rxjava-1.1.1.jar:1.1.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
    Caused by: com.thinkaurelius.titan.core.TitanException: Could not execute operation due to backend exception
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:44) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:141) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:198) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:118) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.commit(StandardTitanGraph.java:718) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:1352) ~[titan-core-1.0.0.jar:na]
        ... 34 common frames omitted
    Caused by: com.thinkaurelius.titan.diskstorage.PermanentBackendException: Permanent exception while executing backend operation CacheMutation
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42) ~[titan-core-1.0.0.jar:na]
        ... 40 common frames omitted
    Caused by: com.amazon.titan.diskstorage.dynamodb.BackendRuntimeException: was interrupted during parallelMutate
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBDelegate.parallelMutate(DynamoDBDelegate.java:301) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager.mutateMany(DynamoDBStoreManager.java:194) ~[dynamodb-titan100-storage-backend-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:90) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:87) ~[titan-core-1.0.0.jar:na]
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:56) ~[titan-core-1.0.0.jar:na]
        ... 41 common frames omitted
    13:19:13 [INFO ] [c.h.c.c.j.CachedSchemaResolver] [] Resolve schema: resource:/api/schema/CategoryCreation.json
    13:19:13 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:13 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    13:19:14 [INFO ] [c.h.c.c.j.CachedSchemaResolver] [] Resolve schema: resource:/api/schema/AssignmentCreation.json
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:14 [INFO ] [c.h.c.c.r.impl.utils.ReferenceVertex] [] Creating vertex for com.hybris.caas.category.model.Reference@2accc739[id=NEW,type=product,url=product-service.dev.cf.hybris.com/NEW]
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    13:19:14 [INFO ] [c.h.c.c.j.CachedSchemaResolver] [] Resolve schema: resource:/api/schema/AssignmentCreation.json
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doRollback()
    13:19:14 [INFO ] [c.h.c.c.c.DeleteCategoriesCommand] [] [AUDIT [ remove_all_categories ]] [user=User[name=,roles=[hybris.category_delete_all]], tenant=womdev1s] Delete all categories.
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:14 [INFO ] [c.h.c.c.r.i.GraphCategoryRepository] [] Deleting all categories in tenant 'womdev1s'
    13:19:14 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    13:19:16 [INFO ] [c.h.c.c.j.CachedSchemaResolver] [] Resolve schema: resource:/api/schema/CategoryCreation.json
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:16 [INFO ] [c.h.c.c.r.impl.utils.CategoryVertex] [] Creating root category vertex for tenant 'womdev1s'
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doGet()
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doBegin()
    13:19:16 [INFO ] [c.h.c.c.t.TitanTransactionManager] [] TRANSACTION - doCommit()
    ......

I know we should expect some issues due to the "eventual consistency". Both CassandraDB and DynamoDB seems to be eventually consistent but in our case the behaviour is significantly different. We are also using "storage.dynamodb.force-consistent-read=true" flag.

Have anybody already tried to face such an issue?

Community
  • 1
  • 1
  • I did more tests and I found that the issue doesn't occur when: - there are no network latency in service-to-dynamodb communication (service deployed on AWS), and - there are no concurrency (only one service client at a time) Otherwise I get ghost vertices. It seems to be titan transaction concurrency problem. No solution so far. – Marcin Koch Sep 27 '16 at 05:27
  • I've updated the question. Finally I managed to catch an exception. I've reproduced the issue on local DynamoDB. – Marcin Koch Sep 30 '16 at 05:40

1 Answers1

0

Index consistency issues you experienced may be rectified by the introduction of Titan/JanusGraph based locking (practical example). Regardless of backend choice, ghost vertices are always a topic in Titan/JanusGraph, and the guidance on that front is to have OLAP jobs that scan the graph and remove the inconsistencies.

Alexander Patrikalakis
  • 5,054
  • 1
  • 30
  • 48