
Environment details:

  • CentOS 6.7
  • MarkLogic 8.0.4
  • 3-node master cluster, used for ingestion only, via an XQuery-based REST API.
  • 3-node replicated cluster, used for fetch only, via an XQuery-based REST API.

Issue:

We are observing severe degradation in ingestion performance when the fetch and ingestion load tests are executed together. Ingestion throughput drops by more than 50% with a single user and a batch size of 50 documents. When the number of users is increased to 5 or 10, ingestion rates dip further and we see timeout errors.
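
To make the degradation figure unambiguous, here is a minimal sketch of how it can be computed from two load-test runs (ingestion only versus ingestion with fetch). The `RunResult` structure and the `degradation_percent` helper are hypothetical names used only for illustration, and the numbers in the usage section are placeholders, not our measurements.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of one ingestion load-test run (hypothetical structure)."""
    documents_ingested: int
    elapsed_seconds: float

    @property
    def throughput(self) -> float:
        """Documents ingested per second."""
        if self.elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return self.documents_ingested / self.elapsed_seconds


def degradation_percent(baseline: RunResult, combined: RunResult) -> float:
    """Percentage drop in throughput relative to the ingestion-only baseline."""
    return 100.0 * (baseline.throughput - combined.throughput) / baseline.throughput


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real measurements:
    # 100 batches of 50 documents in each run.
    ingest_only = RunResult(documents_ingested=50 * 100, elapsed_seconds=100.0)
    ingest_with_fetch = RunResult(documents_ingested=50 * 100, elapsed_seconds=250.0)
    print(f"{degradation_percent(ingest_only, ingest_with_fetch):.1f}% degradation")
```

Using the ingestion-only run as the baseline keeps the percentage comparable across the 1, 5, and 10 user tests.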

Investigation done:

  • The MarkLogic monitoring interface does not reveal any bottleneck. Most resources are underutilized during ingestion-only testing, and resource utilization dips further when fetching is triggered.
  • The non-blocking timestamp of the master cluster did not reveal any lag.
  • Changing the database replication lag time and queue size settings on the master had no impact.

This is a major concern, as we do not expect data fetch operations on the replicated cluster to cause such degradation on the master cluster. I would appreciate any help or pointers for investigating this problem further. Let me know if any further details are needed.

Thanks and regards, Gaurav

Ankit Bhardwaj
Gaurav Gupta
  • Can you provide more details on how you are doing ingest and fetch of data? Large transactions, retention on a single doc or dir, merges in the background, and bad queries with a lot of filtering are all things that could affect performance. – grtjn Apr 22 '16 at 06:54
  • Document size is less than 6 KB. Ingestion happens in an xdmp transaction. Fetch is based on a very simple cts search query. Everything runs OK in isolation. – Gaurav Gupta Apr 22 '16 at 09:59
  • It could be locking. Can you share the code that you are using? Are you querying any documents in the ingestion query? – Tyler Replogle Apr 22 '16 at 11:26
  • The ingested documents are different from the ones being queried. Also, it is possible that fetches from the replicated cluster are locking writes on the master server. – Gaurav Gupta Apr 22 '16 at 16:47
  • Your question is too vague and there are too many possibilities to make a useful suggestion. We will need more concrete examples of the data and queries you are using. – wst Apr 23 '16 at 15:47
  • @wst: Why is this question too vague? I am not looking for a solution, just for pointers on where to look for the problem. It is not possible for me to share the exact queries, as they go through a custom XQuery layer where a lot of logic is injected in between. All I want to understand is what factors can affect data insertion on the master cluster when search load is added on the replicated cluster. – Gaurav Gupta Apr 25 '16 at 06:27
