I have a zabbix server and an agent, on two different computers. The agent runs in active mode, e.g. I have this in the config file:
StartAgents=0
ServerActive=my.zabbix.server.com
Hostname=my.zabbix.agent.com
The zabbix server can be reached from the machine with the agent, e.g.:
telnet my.zabbix.server.com 10051
Trying 111.111.111.111...
Connected to my.zabbix.server.com.
Escape character is '^]'.
Connection closed by foreign host.
Moreover, the host auto registration was turned on on the server, and the agent has successfuly registered the host when I first started it. So the connection must be alive. This is what I see in the agent's log when I start it:
83074:20171128:082440.324 Starting Zabbix Agent [my.zabbix.agent.com]. Zabbix 3.4.1 (revision 71734).
83074:20171128:082440.324 **** Enabled features ****
83074:20171128:082440.324 IPv6 support: YES
83074:20171128:082440.324 TLS support: YES
83074:20171128:082440.324 **************************
83074:20171128:082440.324 using configuration file: /usr/local/etc/zabbix34/zabbix_agentd.conf
83074:20171128:082440.324 agent #0 started [main process]
83076:20171128:082440.325 agent #1 started [collector]
83077:20171128:082440.326 agent #2 started [active checks #1]
In other words, the agent could connect to the server, it even recognized its version. Nothing else happens in the agent log.
On the server, it still says that the host is unreachable!
What could be the problem?
UPDATE: on the front end, I see this message:
I'm not sure why it wants to connect to 10050? It is used for passive agents. My agent should be active.
UPDATE2: If I delete the host from the zabbix server, and restart the agent, then the following happens:
The host is auto-registered on the server again. The agent log:
14551:20171128:193954.483 Starting Zabbix Agent [my.zabbix.server.com]. Zabbix 3.4.1 (revision 71734).
14551:20171128:193954.484 **** Enabled features ****
14551:20171128:193954.484 IPv6 support: YES
14551:20171128:193954.484 TLS support: YES
14551:20171128:193954.484 **************************
14551:20171128:193954.484 using configuration file: /usr/local/etc/zabbix34/zabbix_agentd.conf
14551:20171128:193954.484 agent #0 started [main process]
14553:20171128:193954.485 agent #1 started [collector]
14554:20171128:193954.485 agent #2 started [active checks #1]
14554:20171128:193954.614 no active checks on server [my.zabbix.server.com:10051]: host [my.zabbix.agent.com] not found
where:
- my.zabbix.server.com is the server's FQDN
- my.zabbix.agent.com is the agent's FQDN, and also the HostName parameter in the agent's config.
So it seems, that the agent registers the host successfully, but for some reason, the server tries to get the information from the agent in passive mode. Despite the fact, that the agent was configured in active mode.
UPDATE 3: Although the agents are sending data, the host list still shows a problem:
Availability/ZBX has a red flag, and a message saying that "Get value from agent failed: cannot connect to [[ip_address_here]:1050]: [4] interrupted system call". I have checked every single item and every discovery for these hosts, and all of them have type="Zabbix Agent Active". So I don't understand why the server is trying to connect to them in passive mode??? This does not cause a real "problem" (e.g. something that generates an action and sends out notifications from the zabbix server), but it is very disturbing to see red flags on the screen.
Until this problem is fully solved, I won't even accept my own answer.
UPDATE 4: after changing all item types, discovery types, and the types of the item prototypes of the low level discovery rules of all templates that are connected to my hosts, and all of the templates linked form there, the ZBX red flags finally disappeared. I believe that I'm an experienced software user, but it was quite difficult to understand what is going on, and change all of the parameters to make it work.