
I have a scenario where a client triggers a CorDapp via RPC and waits for the result:

rpcConnection.proxy
    .startFlow(::ImportAssetFlow, importDto)
    .returnValue
    .get(importTimeout /* 30000 ms */, TimeUnit.MILLISECONDS)
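
For context, a minimal sketch of the surrounding client code, assuming the standard CordaRPCClient API; the node address and credentials are placeholders, while importDto and importTimeout are the values used above:

import net.corda.client.rpc.CordaRPCClient
import net.corda.core.messaging.startFlow
import net.corda.core.utilities.NetworkHostAndPort
import java.util.concurrent.TimeUnit

// Placeholder node address and RPC credentials; only the startFlow call mirrors the snippet above.
val client = CordaRPCClient(NetworkHostAndPort("localhost", 10006))
val rpcConnection = client.start("rpcUser", "rpcPassword")

val result = rpcConnection.proxy
    .startFlow(::ImportAssetFlow, importDto)
    .returnValue
    .get(importTimeout /* 30000 ms */, TimeUnit.MILLISECONDS)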

The flow triggers correctly and returns the response, but the response is slow once the flow has been processed. At the end of the FlowLogic.call() block the result should come back to the client via an Artemis message, yet the client's Corda future only completes after about 12 seconds.

On the client side I enabled DEBUG-level logging of the RPCClientProxyHandler to check how the process works:

2020-01-08 12:12:45.982 DEBUG [,,,] 78798 --- [global-threads)] n.c.c.r.internal.RPCClientProxyHandler   : Got message from RPC server RpcReply(id=fc317c4a-3de4-4936-b4c3-768b8b727245, timestamp: 2020-01-08T10:12:44.237Z, entityType: Invocation, result=Success(FlowHandleImpl(id=[16566124-f7d2-41cf-b3a4-f86846073632], returnValue=net.corda.core.internal.concurrent.CordaFutureImpl@58f8aa01)), deduplicationIdentity=e3f6d696-dea4-45b0-95b8-f9c0fe363a9f)
2020-01-08 12:12:45.986 DEBUG [,,,] 78798 --- [global-threads)] n.c.c.r.internal.RPCClientProxyHandler   : Got message from RPC server Observation(id=b3f0b064-6d82-4900-85e6-e70b7d00926a, timestamp: 2020-01-08T10:11:26.411Z, entityType: Invocation, content=[rx.Notification@b461fac0 OnNext Added(stateMachineInfo=StateMachineInfo([16566124-f7d2-41cf-b3a4-f86846073632], ***.workflow.asset.flow.ImportAssetFlow))], deduplicationIdentity=e3f6d696-dea4-45b0-95b8-f9c0fe363a9f)
2020-01-08 12:12:45.987 DEBUG [,,,] 78798 --- [global-threads)] n.c.c.r.internal.RPCClientProxyHandler   : Got message from RPC server Observation(id=12887a04-f22c-422d-b684-c679f137d66b, timestamp: 2020-01-08T10:12:45.979Z, entityType: Invocation, content=[rx.Notification@4c59250 OnNext Starting], deduplicationIdentity=e3f6d696-dea4-45b0-95b8-f9c0fe363a9f)
2020-01-08 12:12:58.603 DEBUG [,,,] 78798 --- [global-threads)] n.c.c.r.internal.RPCClientProxyHandler   : Got message from RPC server Observation(id=b83c15ca-9047-4958-a106-65165e5abfbd, timestamp: 2020-01-08T10:12:45.975Z, entityType: Invocation, content=[rx.Notification@e03cfa2d OnNext [B@2dceac3d], deduplicationIdentity=e3f6d696-dea4-45b0-95b8-f9c0fe363a9f)
2020-01-08 12:12:58.605 DEBUG [,,,] 78798 --- [global-threads)] n.c.c.r.internal.RPCClientProxyHandler   : Got message from RPC server Observation(id=b83c15ca-9047-4958-a106-65165e5abfbd, timestamp: 2020-01-08T10:12:45.975Z, entityType: Invocation, content=[rx.Notification@15895539 OnCompleted], deduplicationIdentity=e3f6d696-dea4-45b0-95b8-f9c0fe363a9f)

There is a big gap between two events:

  • 12:12:45.987 OnNext Starting - the start of the flow which consumes 1k objects
  • 12:12:58.603 OnNext [B@2dceac3d] - the actual result of the operation. So it's ~12.5s to return the response.
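
One way to narrow this down from the client side (not shown in the original post) is to subscribe to the flow's progress updates and timestamp them yourself. This is only a sketch: it assumes ImportAssetFlow exposes a ProgressTracker, and it uses the standard startTrackedFlow extension and the progress observable of the returned FlowProgressHandle:

import net.corda.core.messaging.startTrackedFlow
import java.time.Instant
import java.util.concurrent.TimeUnit

// Subscribe to progress steps so each one gets a client-side timestamp.
val handle = rpcConnection.proxy.startTrackedFlow(::ImportAssetFlow, importDto)
handle.progress.subscribe { step -> println("${Instant.now()} progress: $step") }
val result = handle.returnValue.get(importTimeout /* 30000 ms */, TimeUnit.MILLISECONDS)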

According to JProfiler, Corda processed the flow in ~1.3 s and sent the result back.

What could be the reason for this behaviour? Slow journaling of Artemis messages?

UPD: I found that Corda has a suspend/resume mechanism (checkpoints) that stores the thread state to disk so it can later be read back and resumed: net.corda.node.services.statemachine.FlowStateMachineImpl#run triggers co.paralleluniverse.fibers.Fiber#parkAndSerialize. It seems this is one of the time consumers.

Thanks a lot in advance.

Oleksandr Abasov

2 Answers


There can be a number of causes:

  1. JProfiler might not be reporting the flow duration correctly. Other profiling tools certainly have difficulty with the parking that Fibers do, as each resume from a suspend looks like a distinct method call. I would add some logging to your flow to check how long it really takes (see the sketch after this list).
  2. If you return a really large result from your flow (e.g. a large collection of large objects), the serialization and deserialization of that over RPC can take a reasonable amount of time. So are you returning a large result?
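
As a concrete illustration of point 1, here is a minimal sketch of such timing logs; the flow body and return type are hypothetical stand-ins, only the timing pattern matters:

import co.paralleluniverse.fibers.Suspendable
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.StartableByRPC

@StartableByRPC
class ImportAssetFlow(private val importDto: String) : FlowLogic<String>() {
    @Suspendable
    override fun call(): String {
        val started = System.currentTimeMillis()
        logger.info("ImportAssetFlow started")
        val result = doImport(importDto)  // stand-in for the real import work
        logger.info("ImportAssetFlow finished in ${System.currentTimeMillis() - started} ms")
        return result
    }

    private fun doImport(dto: String): String = dto  // stub so the sketch is self-contained
}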
  • Hi Rick, 2. Nope, the result is just a meta-information about these objects, how they were processed in general. Compared to input Json msg size the input is 2MB, and the output is ~800B – Oleksandr Abasov Jan 22 '20 at 14:58
  • raised a ticket here, with the detailed logs: https://r3-cev.atlassian.net/servicedesk/customer/portal/2/SUP-1084 – Oleksandr Abasov Jan 22 '20 at 15:50

Found the root cause of the delay: it seems that all the entityManager changes are flushed at the end of the flow, not at the end of the withEntityManager block as I expected:

fun save(assetBatch: PCAssetBatch): PCAssetBatch {
    return serviceHub.withEntityManager {
        persist(assetBatch)
        logger.debug("Batch [${assetBatch.batchId} - ${assetBatch.batchName}] has been persisted")
        assetBatch
    }
}

It took a while to understand this, because the log line made me believe the data had already been persisted. The entity PCAssetBatch has a column annotated with OneToMany containing a List of PCAssetBatchItem, and it seems Hibernate persists these items one by one rather than batching them. The delay was resolved by fixing this part of the code.
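
The fix itself isn't shown above, but one way to make the persistence cost appear where the log line suggests it does is to flush explicitly inside withEntityManager; flush() is the standard JPA EntityManager call, and the rest mirrors the save function above:

fun save(assetBatch: PCAssetBatch): PCAssetBatch {
    return serviceHub.withEntityManager {
        persist(assetBatch)
        // Push the pending INSERTs (including the OneToMany items) to the database
        // now, instead of when the flow's transaction commits at the end of call(),
        // so the time they take is spent (and measurable) at this point.
        flush()
        logger.debug("Batch [${assetBatch.batchId} - ${assetBatch.batchName}] has been persisted")
        assetBatch
    }
}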

Oleksandr Abasov