If the ELK server is down for a long time (an hour or more), could that cause stability issues for the servers sending logs?
We are using logstash-logback-encoder version 7.2 on Java 17 and Amazon Linux 2, sending somewhere around 650 requests/second distributed across 3 servers (so each one sends about 230/sec through logstash-logback-encoder). Logback config:
<appender name="stash" class="net.logstash.logback.appender.LogstashUdpSocketAppender">
  <host>elk.mycorp</host>
  <port>{{ logstash_udp_port }}</port>
  <layout class="net.logstash.logback.layout.LogstashLayout"/>
</appender>
Our ELK server was down for over an hour, and during that time we had problems with our application servers. The ones on the same subnet as ELK showed massively increased response times and 700,000 open file handles. Curiously, servers on a different subnet had no problems.
The Encoder should drop messages if the ring buffer becomes full, but I wonder what happens when the remote server is offline. Is it possible that our servers failed because ELK could not be reached? If so, any suggestions on how to make our setup more robust?
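To be clear about what I mean by "drop messages if the ring buffer becomes full", here is a rough sketch of the behavior I am expecting. It uses only plain JDK classes and is not the library's actual code; `DropOnFullBuffer` and its methods are made-up names for illustration, and I am assuming (not asserting) that the appender behaves along these lines.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of "drop on full" semantics; not part of logstash-logback-encoder.
public class DropOnFullBuffer {
    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    public DropOnFullBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Never blocks the calling (application) thread: if the buffer is full,
    // the event is counted as dropped and discarded.
    public boolean offer(String event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Called by whatever drains the buffer; returns null when it is empty.
    public String poll() {
        return queue.poll();
    }

    public long droppedCount() {
        return dropped.get();
    }

    public int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        // Simulate an unreachable destination: nothing drains the buffer.
        DropOnFullBuffer buffer = new DropOnFullBuffer(4);
        for (int i = 0; i < 10; i++) {
            buffer.offer("event-" + i);
        }
        System.out.println("buffered=" + buffer.size() + " dropped=" + buffer.droppedCount());
        // prints "buffered=4 dropped=6"
    }
}
```

With semantics like these, an unreachable ELK server would cost some lost log lines but should not tie up request threads or file handles, which is why the behavior we saw surprised me.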