I have been struggling with regular usage of the avahi-browse command on all CentOS 7 servers placed in a large network with many devices reporting out avahi/mdns data.
Part of an application runs this command:
avahi-browse -ltrp .._tcp
I am able to repeat the issue on the command line if I run this command enough times via a bash script.
What seems to happen is that -t / --terminate randomly fails to terminate, so the command sits there indefinitely and the application stalls.
I put in a manual timeout of 15 seconds, and we hit that timeout occasionally no matter how many devices are on the network.
However, once we reach roughly 100-120 networked devices reporting their info via avahi/mdns, it starts to happen very frequently, and it gets worse the more devices are on the network.
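For anyone who wants to reproduce this, here is a rough sketch of the kind of loop I mean: run the command many times under a hard timeout and count how often the timeout had to kill it. The helper name count_hangs is made up for this post, and it assumes GNU coreutils timeout, which exits with status 124 when the time limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of avahi): run a command repeatedly under a
# hard timeout and print how many runs had to be killed.
# Usage: count_hangs RUNS TIMEOUT_SECS CMD [ARGS...]
count_hangs() {
    local runs=$1 limit=$2
    shift 2
    local hangs=0 i status
    for ((i = 0; i < runs; i++)); do
        # Discard the command's output; only the exit status matters here.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        # GNU timeout exits with 124 when the command ran past the limit.
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
    done
    echo "$hangs"
}

# Example call, using the command and the 15 second limit from this post:
#   count_hangs 200 15 avahi-browse -ltrp .._tcp
```

The printed number is the count of runs where --terminate never fired within the limit, which makes it easy to compare hang rates as the number of devices on the network changes.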
It probably doesn't matter too much, but here are some details about the infra:
/20 network, 10GigE, there are other devices besides the ones I'm interested in but we're not searching for them via avahi, and most don't report out anything
16 vCPU x 32GiB vMem VM
I created an issue on GitHub 4 months ago (no responses yet): https://github.com/lathiat/avahi/issues/264.
I thought I'd ask here whether anyone has ideas, knows of any system configuration that could be a factor, or has encountered this themselves. I'm not even sure whether many people are using zeroconf in a large enterprise environment.
avahi config:
[server]
use-ipv4=no
use-ipv6=yes
allow-interfaces=ens256
deny-interfaces=ens192,ens224
enable-dbus=yes
disallow-other-stacks=yes
objects-per-client-max=2048
ratelimit-interval-usec=1000000
ratelimit-burst=1000
cache-entries-max=2048
[wide-area]
enable-wide-area=no
[publish]
[reflector]
[rlimits]
rlimit-core=0
rlimit-data=4194304
rlimit-fsize=0
rlimit-nofile=768
rlimit-stack=4194304
rlimit-nproc=3