
We have a VM in Azure, size D11 (2 cores, 14 GB RAM), hosting our company's new web ERP system, which consists of JBoss and PostgreSQL installed on CentOS 7. Only 14 users log in concurrently.

From time to time we experience slow responses: users have to wait a minute, sometimes a few minutes, for a page to load. I checked the memory using free -h and there is more than 8 GB free. I checked the CPU usage in the Azure portal and it stays constantly below 10%. However, the load average increases during the delayed responses.

When the load average is below 1.0, the web application responds quickly; it becomes very unresponsive when the load average rises above 1.0. I checked iotop and noticed that the unresponsiveness normally occurs during heavy PostgreSQL UPDATE and COMMIT activity. What does 99.99% under IO> mean for Postgres: jboss wsemp 127.0.0.1(40291) COMMIT? Is this where the bottleneck is? Another 99.99% that caught my eye is Postgres: checkpointer process. Screenshot: i.imgur.com/XINJhwN.png

Ken

1 Answer


This sounds like it could easily be checkpoint activity, yes. Configure PostgreSQL to do spread checkpoints. See:

Make sure that checkpoint_segments is high enough for your workload and set checkpoint_completion_target to something like 0.8 to encourage most of the checkpoint work to be done early. This will reduce overall performance by reducing the amount of write combining that can be done, but will smooth activity out to reduce stalls.
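
As a rough sketch, the relevant lines in postgresql.conf might look something like the following. The value 32 for checkpoint_segments is only an illustrative assumption and has to be tuned against the actual write volume. log_checkpoints is an extra setting beyond the two named above, included so that checkpoint timing shows up in the server log and can be compared with the stalls. Note that checkpoint_segments only exists in PostgreSQL 9.4 and earlier; from 9.5 onward it is replaced by max_wal_size.

```conf
# postgresql.conf (PostgreSQL 9.4 or earlier)

# Number of WAL segments between automatic checkpoints (default is 3).
# 32 is an assumed example value, not a recommendation for this workload.
checkpoint_segments = 32

# Spread checkpoint writes over about 80% of the interval between
# checkpoints instead of the default 50%.
checkpoint_completion_target = 0.8

# Log each checkpoint so its timing can be matched against slow periods.
checkpoint_timeout = 5min        # default, shown for reference
log_checkpoints = on
```

These settings should take effect after a configuration reload (for example with SELECT pg_reload_conf();) without a full restart. If the log then shows checkpoints lining up with the slow page loads, that supports the checkpoint explanation; if not, the COMMIT waits are more likely down to the underlying disk's sync latency.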

Craig Ringer