1

I have a Zabbix server and an agent on two different computers. The agent runs in active mode; this is the relevant part of its config file:

StartAgents=0
ServerActive=my.zabbix.server.com
Hostname=my.zabbix.agent.com

The Zabbix server can be reached from the machine running the agent:

telnet my.zabbix.server.com 10051
Trying 111.111.111.111...
Connected to my.zabbix.server.com.
Escape character is '^]'.
Connection closed by foreign host.
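
The same reachability check can be scripted. Below is a minimal sketch in Python, assuming the server listens on port 10051 as in the telnet session above; the function name `can_connect` is made up for illustration. Like telnet, it only shows that the TCP port accepts connections. It says nothing about whether the server knows the host.

```python
import socket


def can_connect(host: str, port: int = 10051, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `can_connect("my.zabbix.server.com")` from the agent machine should return True in the situation described here.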

Moreover, host auto-registration was turned on on the server, and the agent successfully registered the host when I first started it. So the connection must be alive. This is what I see in the agent's log when I start it:

83074:20171128:082440.324 Starting Zabbix Agent [my.zabbix.agent.com]. Zabbix 3.4.1 (revision 71734).
83074:20171128:082440.324 **** Enabled features ****
83074:20171128:082440.324 IPv6 support:          YES
83074:20171128:082440.324 TLS support:           YES
83074:20171128:082440.324 **************************
83074:20171128:082440.324 using configuration file: /usr/local/etc/zabbix34/zabbix_agentd.conf
83074:20171128:082440.324 agent #0 started [main process]
83076:20171128:082440.325 agent #1 started [collector]
83077:20171128:082440.326 agent #2 started [active checks #1]

In other words, the agent could connect to the server, it even recognized its version. Nothing else happens in the agent log.

On the server, it still says that the host is unreachable!

What could be the problem?

UPDATE: on the front end, I see this message:

[Image: ZBX red error message]

I'm not sure why it wants to connect to port 10050, which is used for passive agents. My agent should be active.

UPDATE2: If I delete the host from the zabbix server, and restart the agent, then the following happens:

The host is auto-registered on the server again. The agent log:

14551:20171128:193954.483 Starting Zabbix Agent [my.zabbix.server.com]. Zabbix 3.4.1 (revision 71734).
14551:20171128:193954.484 **** Enabled features ****
14551:20171128:193954.484 IPv6 support:          YES
14551:20171128:193954.484 TLS support:           YES
14551:20171128:193954.484 **************************
14551:20171128:193954.484 using configuration file: /usr/local/etc/zabbix34/zabbix_agentd.conf
14551:20171128:193954.484 agent #0 started [main process]
14553:20171128:193954.485 agent #1 started [collector]
14554:20171128:193954.485 agent #2 started [active checks #1]
14554:20171128:193954.614 no active checks on server [my.zabbix.server.com:10051]: host [my.zabbix.agent.com] not found

where:

  • my.zabbix.server.com is the server's FQDN
  • my.zabbix.agent.com is the agent's FQDN, and also the HostName parameter in the agent's config.

So it seems that the agent registers the host successfully, but for some reason the server tries to get the information from the agent in passive mode, despite the fact that the agent was configured in active mode.

UPDATE 3: Although the agents are sending data, the host list still shows a problem:

[Image: screenshot of the host list]

Availability/ZBX has a red flag, and a message saying that "Get value from agent failed: cannot connect to [[ip_address_here]:1050]: [4] interrupted system call". I have checked every single item and every discovery rule for these hosts, and all of them have type="Zabbix Agent Active". So I don't understand why the server is trying to connect to them in passive mode. This does not cause a real "problem" (e.g. something that generates an action and sends out notifications from the Zabbix server), but it is very disturbing to see red flags on the screen.

Until this problem is fully solved, I won't even accept my own answer.

UPDATE 4: After changing all item types, discovery rule types, and the types of the item prototypes of the low level discovery rules in all templates that are connected to my hosts, and in all of the templates linked from there, the ZBX red flags finally disappeared. I believe that I'm an experienced software user, but it was quite difficult to understand what was going on and to change all of the parameters to make it work.

nagylzs
  • 4 years passed, and this problem still exists in Zabbix (latest main version 5 at the moment). I think that my observations are valid, but it seems that they will never change it. :-( – nagylzs Jan 05 '22 at 13:31

3 Answers

1

For active checks to work, the agent hostname must match the host name configured on the Zabbix server. The "agent hostname" is not necessarily the system hostname - it depends on the configuration parameters "Hostname" and "HostnameItem". The host name in Zabbix is not the DNS name or the IP address - it's the contents of the "Host" field in the host properties.

When the agent starts, it prints out the hostname it is sending to the server. In your example: Starting Zabbix Agent [my.zabbix.server.com] - that is, the agent identifies itself to the server as my.zabbix.server.com. Make this value match the proper host name (note that it is case sensitive) and active checks will start working. Note that if two or more agents were sending data under the same name, the host with that name could have collected incorrect values.
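
The comparison can be automated. This is a minimal sketch, assuming the startup line has the format shown in the logs in the question; the function names and the idea of passing in the frontend "Host" field as a plain string are illustrative assumptions, not part of Zabbix.

```python
import re

# Matches the startup line the agent writes, for example:
# "... Starting Zabbix Agent [my.zabbix.agent.com]. Zabbix 3.4.1 (revision 71734)."
STARTUP_RE = re.compile(r"Starting Zabbix Agent \[([^\]]+)\]")


def agent_hostname(log_text):
    """Return the hostname from the most recent startup line, or None."""
    matches = STARTUP_RE.findall(log_text)
    return matches[-1] if matches else None


def names_match(log_text, frontend_host):
    """Exact, case-sensitive comparison with the frontend "Host" field."""
    return agent_hostname(log_text) == frontend_host
```

The comparison is deliberately a plain `==`: lowercasing either side would hide exactly the kind of mismatch that breaks active checks.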

Note that the version printed in the agent log is the agent version, not the server version - the agent cannot determine the server version.

Richlv
  • I'm sorry, I wanted to hide the real host names, and placed the wrong info in the question. :-( It actually starts up like this: `84458:20171128:084959.104 Starting Zabbix Agent [my.agent.name.com]. Zabbix 3.4.1 (revision 71734).` Changed the question accordingly. – nagylzs Nov 28 '17 at 07:52
  • Oh, I see. But please check the name agent prints and that you have as a name for the host in the Zabbix frontend - they must match exactly. And they are case sensitive. Additionally, what items do you have on that host, and what is the exact value in the "Type" column for all of them? – Richlv Nov 28 '17 at 12:36
  • Double checked, the agent writes [FQDN], the host has the same name FQDN as the host name, and it is also a valid FQDN. The host was auto-registered, it is assigned to the "Zabbix Agent" template. I did not add any extra items. The template has 3 items by default: agent ping, host name check and zabbix agent version check. All of these items are enabled, and all of them have the type "Zabbix agent" – nagylzs Nov 28 '17 at 18:35
  • Updated the question, I have more details available. – nagylzs Nov 28 '17 at 18:38
  • I have figured it out by myself. I'm going to post a full answer, but I'll also upvote you. – nagylzs Nov 28 '17 at 21:12
1

Short answer: the problem was that all Items had Type="Zabbix Agent" instead of Type="Zabbix Agent Active".

Long answer: a host will be either an active agent or a passive agent. (Well, maybe if you start two agents on the same machine you could do both on a single host, but it seems pointless.) Logical, right?

So in reality, being active or passive is a property of the host, not the item. Despite this fact, the mode of data collection (i.e. passive or active) is bound to the item, not the host. I see this as a design flaw in Zabbix. It is very counterintuitive. The ONLY way I could overcome this problem is this:

  • Create a full clone of every template that you want to use. It is very important to make a full clone instead of a simple clone. For example, create a full clone of the "Linux OS" and a full clone of the "FreeBSD OS" templates, and mass update all of their item types from "Zabbix agent" to "Zabbix agent (active)". You also need to update all discovery rule types from "Zabbix agent" to "Zabbix agent (active)". Then go over the discovery rules, click on their "Item prototypes", and change the type of every item prototype from "Zabbix agent" to "Zabbix agent (active)".
  • Also make sure that there are no linked (parent) templates. If there are, you need to recursively create a full clone of those, then unlink + clear the old parent, and link the new parent. For example, if you mass update all item types in "Linux OS", it will not update the so-called "templated items" that come from the linked "Zabbix Agent" template. So you need to fully clone "Zabbix Agent" into "Zabbix Agent Active", update all item types to active, then re-link (i.e. unlink + clear the "Zabbix Agent" template from "Linux OS", then link the newly created "Zabbix Agent Active").
  • This is recursive: you need to repeat all of these steps until all items and all templated items use active mode instead of passive mode.
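
The recursion in the steps above can be sketched as a walk over the template links. This is a toy model, not Zabbix code: the `Template` class, the type labels and the function names are assumptions made up for illustration, and the templates are plain in-memory objects rather than anything read from a server.

```python
from dataclasses import dataclass, field

PASSIVE = "Zabbix agent"
ACTIVE = "Zabbix agent (active)"


@dataclass
class Template:
    name: str
    item_types: dict = field(default_factory=dict)       # item key -> type
    prototype_types: dict = field(default_factory=dict)  # prototype key -> type
    parents: list = field(default_factory=list)          # linked templates


def linked_closure(root):
    """Return root plus every template linked from it, parents first, each once."""
    ordered, seen = [], set()

    def visit(template):
        if template.name in seen:
            return
        seen.add(template.name)
        for parent in template.parents:
            visit(parent)
        ordered.append(template)

    visit(root)
    return ordered


def remaining_passive(root):
    """List (template name, key) for every item or prototype still passive."""
    found = []
    for template in linked_closure(root):
        for types in (template.item_types, template.prototype_types):
            for key, item_type in types.items():
                if item_type == PASSIVE:
                    found.append((template.name, key))
    return found
```

In this model the work is done when `remaining_passive` returns an empty list for every template assigned to the host; a passive prototype hidden in a parent template shows up in the list just like a top-level item.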

You need to clone almost all templates in the system. You cannot have a single trigger for a single item that is independent of the item type, because there is no such thing. If you want to change something in an environment where there are passive and active agents mixed, then you have to do everything twice.

Finally, when you add a host, you need to assign the active or the passive template version, depending on which mode you want to use for that particular host.

All of this because the active/passive mode cannot be a property of the host. It must be a property of the item. It is worse than that: it is also a property of discovery rules and of the item prototypes of discovery rules (and item prototypes cannot be mass updated, you have to change them one by one, by hand). Seriously, an item like "cpu.load" is absolutely unrelated to how the data was collected. I mean come on, you could change your mind and switch from active mode to passive, or back. This should not force you to delete all the old items and create new ones. But if you decide to do so, you will lose all history, because you are not just changing the items, you are replacing them. This is really, really annoying!
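
Since the frontend has no mass update for item prototypes, one option is to script the change through the Zabbix JSON-RPC API. The sketch below only builds the request bodies and sends nothing. It assumes the API type codes 0 for "Zabbix agent" and 7 for "Zabbix agent (active)" and the `itemprototype.get` / `itemprototype.update` methods; passing the token in an `auth` field is how older versions such as 3.4 work and may differ in newer releases. The helper names, the token and the IDs are placeholders.

```python
ZABBIX_AGENT = 0         # API type code for passive agent items
ZABBIX_AGENT_ACTIVE = 7  # API type code for active agent items


def rpc(method, params, auth, request_id=1):
    """Build one JSON-RPC 2.0 request body (a dict ready for json.dumps)."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "auth": auth,  # assumption: token passed in the body
        "id": request_id,
    }


def find_passive_prototypes(template_id, auth):
    """Request listing the passive item prototypes of one template."""
    params = {
        "templateids": [template_id],
        "filter": {"type": ZABBIX_AGENT},
        "output": ["itemid", "key_"],
    }
    return rpc("itemprototype.get", params, auth)


def switch_to_active(item_ids, auth):
    """One update request per prototype, changing only its type."""
    return [
        rpc("itemprototype.update",
            {"itemid": item_id, "type": ZABBIX_AGENT_ACTIVE},
            auth, request_id=n)
        for n, item_id in enumerate(item_ids, start=1)
    ]
```

Each dict would be posted as JSON to the frontend's API endpoint. Prototypes inherited from a linked template probably have to be changed on the template that defines them, which matches the recursive cloning described above.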

I hope that they will fix this in the upcoming 4.0 version.

nagylzs
  • 1
    A frequently missed thing - make sure to change all low level discovery (LLD) rules _and_ item prototypes in them. There is no mass update for prototypes, you have to change them one by one. – Richlv Nov 29 '17 at 14:22
  • What is a prototype in zabbix? I can only see templates, linked templates and item types. BTW I have changed all item types and all discovery types to active in all assigned templates and their linked templates too, but the ZBX flag is still red, and the server still wants to connect in passive mode. – nagylzs Nov 29 '17 at 18:38
  • 1
    Please see the page in the manual on LLD. Prototypes are like "mini templates" for items, triggers etc. You can access them by opening the LLD rule listing; next to each rule you'll have links to its prototypes. Item prototypes are what you want to change in this case. – Richlv Nov 29 '17 at 21:36
  • Well, this is even more annoying. Do I really have to create a copy of all discovery rule prototypes? Seriously? – nagylzs Nov 30 '17 at 13:23
  • Okay, now it works. I'm posting this as a question on the official zabbix forum too - it should not be this hard to do, and it is not clear why all of the default templates are using passive mode... – nagylzs Nov 30 '17 at 14:03
  • Note that you don't copy the prototypes individually, they were cloned along with the template. You just change their type. Having a per-host active/passive flag would be useful, but not available yet. There's not that much benefit from posting this on the Zabbix forums, as the fact is well known and there are feature requests to improve this. As for why the default templates are passive - one big reason is that the passive mode is much easier to get working. – Richlv Dec 03 '17 at 20:07
  • https://stackoverflow.com/questions/47612228/using-zabbix-monitor-a-vnf-running-on-a-private-ip-in-openstack is also very similar . – Richlv Dec 03 '17 at 20:30
  • Most of my agents are behind various NAT firewalls and they are managed by others. For me, setting up passive mode would be much more difficult and fragile. – nagylzs Dec 03 '17 at 21:46
  • Sure, I meant the general case. When specific network limitations are considered, things change - for example, proxy is more complicated to set up, but for remote locations it often becomes the easiest option. – Richlv Dec 03 '17 at 23:14
-2

Edit /etc/zabbix/zabbix_agentd.conf and put the IP address of the Zabbix server there.