# Remote commands with Zabbix actions


`monit` restarts services that fail, and ever since I installed `zabbix` I
have wanted the same functionality. It turns out this is possible, but it
takes some configuration.

Also see the 
[zabbix manual](http://www.zabbix.com/downloads/ZABBIX%20Manual%20v1.6.pdf), 
where it gets interesting from page 160 onwards.

In zabbix go to Configuration->Actions.

Add a new 'Action Operation' that runs a remote command.

1. Operation type: *remote command*
2. Remote command: `host:script`; in my (test) case
    `elektron:/home/miekg/bin/zabbix_service {TRIGGER.NAME}: {STATUS}`

For now `zabbix_service` is a shell script that simply `echo`es its
arguments to a file in `/tmp`.
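A minimal sketch of what such a test script could look like. The function name `log_args` and the `ZABBIX_LOG` override are made up for illustration; the default file name matches the `/tmp/zabbix_test` file that shows up in this post.

```shell
#!/bin/sh
# Sketch of a test version of zabbix_service: append the arguments to a file.
# ZABBIX_LOG is a hypothetical override, handy for testing by hand.
LOGFILE="${ZABBIX_LOG:-/tmp/zabbix_test}"

log_args() {
    # "$*" joins all arguments with single spaces, so an unquoted
    # "{TRIGGER.NAME}: {STATUS}" ends up as one line again.
    echo "$*" >> "$LOGFILE"
}

# Only log when called with arguments.
if [ "$#" -gt 0 ]; then
    log_args "$@"
fi
```

Because the remote command passes the macros unquoted, the script receives them as separate words; joining them with `"$*"` is the simplest way to get one line per event.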

Also be sure to set `EnableRemoteCommands=1` in `zabbix_agentd.conf` on the
host that has to run the command, and restart the zabbix agent there.

When enabled I do see something in `/tmp`:

    % ls /tmp/zabbix*
    -rw-rw-r-- 1 zabbix zabbix 144 2009-08-20 11:04 /tmp/zabbix_test
    % cat /tmp/zabbix*
    SSH server is down on elektron: ON
    Sshd is not running on elektron: ON

So this is starting to work nicely, however there are a few *issues* with
it. The script:

* runs only on the host specified (here: `elektron`);
* runs under the user `zabbix`;
* needs to parse its arguments.

# Host groups
According to the manual you can use the syntax:

    hostgroup#command

instead of 
    
    host:command

So (in my case) using `atoom#` should save me from creating a separate
action for each of my hosts.

# Running privileged commands
On page 162 it says:

>One may be interested in using sudo to give access to privileged
>commands.

So it must be done with `sudo`.
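A rough sketch of how the script could call `sudo` without a password prompt. Everything specific here is an assumption: the function name `restart_service`, the `/etc/init.d` path, the `SUDO` override and the sudoers rule in the comment are illustrations, not something taken from the manual.

```shell
#!/bin/sh
# Sketch: restart a service as the zabbix user through sudo.
# Assumes a sudoers rule (hypothetical, add it with visudo) such as:
#   zabbix ALL = NOPASSWD: /etc/init.d/ssh restart
# SUDO can be overridden, e.g. SUDO=echo for a dry run.
restart_service() {
    # -n makes sudo fail instead of prompting for a password,
    # which matters because nobody is there to type one.
    ${SUDO:-sudo -n} "/etc/init.d/$1" restart
}
```

Limiting the sudoers rule to the exact restart command keeps the extra powers of the `zabbix` user as small as possible.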

# Parsing the argument
Are there any other macros (page 87 in the manual) which can be of use?
Looking at some:

    {TRIGGER.ID}    Numeric trigger ID which triggered this action.
    {TRIGGER.KEY}   Key of first item of the trigger which caused a notification.

I've added these to my little test script, let's see what comes out of
it.

    SSH server is down on elektron: ON 13009 net.tcp.service[ssh]
    Sshd is not running on elektron: ON 13014 proc.num[sshd]

Indeed a number is added (`13009`) and a `net.tcp.service[ssh]` string.
That is somewhat easier to parse, but still...

From the looks of it, the `{TRIGGER.ID}`s differ *per host*, so you
cannot use them to check the failure of (say) the SSH daemon for *all*
hosts. The `{TRIGGER.KEY}` looks much more portable and parseable in that
respect.

> In case you are interested, the trigger ids can be found by going to
> the trigger screen of zabbix and clicking on a trigger. The URL
> then contains a `triggerid=xxxxx` string.

I finally went with the following macros:

    atoom#/home/miekg/bin/zabbix_service "{TRIGGER.KEY}:{STATUS}:{HOSTNAME}"

Which gives the following output:

    proc.num[sshd]:OFF:elektron

...and we have a string I can parse! :)
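Splitting that string could look something like the sketch below. The function name `parse_event` and the variable names are my own; it peels the fields off from the right, on the assumption that the status and the hostname never contain a `:` while an item key might.

```shell
#!/bin/sh
# Sketch: split "{TRIGGER.KEY}:{STATUS}:{HOSTNAME}" into three variables.
# Parsing starts at the right-hand side so that a ':' inside the
# item key does not shift the fields.
parse_event() {
    host=${1##*:}        # everything after the last ':'
    rest=${1%:*}         # drop ":HOSTNAME"
    status=${rest##*:}   # everything after the (new) last ':'
    key=${rest%:*}       # what is left is the item key
}

parse_event 'proc.num[sshd]:OFF:elektron'
echo "$key / $status / $host"
# prints: proc.num[sshd] / OFF / elektron
```

From here a `case "$key" in ... esac` can decide which service the event is about.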

# Todo

I've left the following items on my todo list:

* Configure `sudo` to give zabbix more powers - so that it is allowed to restart services;
* Write a proper script, which *can* restart a service;
* Flap detection;
* Failure detection, stop restarting after `n` tries.

