Configuring an initd Service for node_exporter

I recently wrote an article showing how to configure Prometheus and Grafana for easy metrics collection. In that article, I assumed that the system which should be monitored would use the systemd approach for defining services.

I now had to set up the node_exporter utility on a system which uses the initd approach. Thus, I provide some simple instructions here on how to accomplish that.


  • Download the node_exporter archive (this guide uses version 0.15.2; the remaining steps assume it is placed under /opt)

wget https://github.com/prometheus/node_exporter/releases/download/v0.15.2/node_exporter-0.15.2.linux-amd64.tar.gz

  • Extract the archive

tar xvfz node_exporter-*.tar.gz

  • Create a link

ln -s node_exporter-* node_exporter

  • Create the file /opt/node_exporter/node_exporter.sh and add the following content:

#!/bin/sh

/opt/node_exporter/node_exporter --no-collector.diskstats

  • Create the file /etc/init.d/node_exporter and add the following content:

#!/bin/sh
### BEGIN INIT INFO
# Provides: node_exporter
# Required-Start: $local_fs $network $named $time $syslog
# Required-Stop: $local_fs $network $named $time $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Description: Prometheus node_exporter
### END INIT INFO

SCRIPT=/opt/node_exporter/node_exporter.sh
RUNAS=root

PIDFILE=/var/run/node_exporter.pid
LOGFILE=/var/log/node_exporter.log

start() {
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo 'Service already running' >&2
    return 1
  fi
  echo 'Starting service' >&2
  local CMD="$SCRIPT &> \"$LOGFILE\" & echo \$!"
  su -c "$CMD" $RUNAS > "$PIDFILE"
  echo 'Service started' >&2
}

stop() {
  if [ ! -f "$PIDFILE" ] || ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo 'Service not running' >&2
    return 1
  fi
  echo 'Stopping service' >&2
  kill -15 "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
  echo 'Service stopped' >&2
}

uninstall() {
  echo -n "Are you really sure you want to uninstall this service? That cannot be undone. [yes|No] "
  local SURE
  read SURE
  if [ "$SURE" = "yes" ]; then
    stop
    rm -f "$PIDFILE"
    echo "Notice: the log file will not be removed: '$LOGFILE'" >&2
    update-rc.d -f node_exporter remove
    rm -fv "$0"
  fi
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  uninstall)
    uninstall
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|uninstall}"
esac

  • Make both files executable

chmod +x /etc/init.d/node_exporter

chmod +x /opt/node_exporter/node_exporter.sh

  • Test the script

/etc/init.d/node_exporter start

/etc/init.d/node_exporter stop

  • Enable the service at boot with chkconfig

chkconfig --add node_exporter

All done! Now you can configure your Prometheus server to scrape the metrics from the node_exporter instance.
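Before pointing Prometheus at the new exporter, it is worth a quick sanity check that the init script is registered and that metrics are actually being served. A minimal sketch, run on the monitored host and assuming curl is available and the default port 9100 is used:

# Confirm the init script is registered for the default runlevels
chkconfig --list node_exporter

# Confirm the exporter answers on its default port
curl -s http://localhost:9100/metrics | head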

Setting up Prometheus and Grafana for CentOS / RHEL 7 Monitoring

As mentioned in my previous post, I have long been looking for a centralised solution for collecting logs and monitoring metrics. I think my search was unsuccessful because I was looking for too many things in one solution. Instead, I have now found two separate solutions, one for log management (using Graylog) and one for metrics (using Prometheus and Grafana). I deployed both of these on very inexpensive VPS machines and so far I am very happy with them.

In this post, I provide some pointers on how to set up the metrics solution based on Prometheus and Grafana. I assume you are using a RHEL / CentOS system as the server hosting Prometheus and Grafana and that you are interested in the OS metrics for a CentOS system. This tutorial will guide you through setting up the Prometheus server, collecting metrics for it using node_exporter and finally creating dashboards and alerts using Grafana for the collected metrics.

Installing Prometheus Server

  • Follow the excellent instructions here with the following modifications.
    • Make sure to download the latest version of Prometheus (the link can be obtained on the Prometheus download page, this guide works with version 2.1.0)
    • For the systemd service, use the following file:

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=root
Restart=on-failure
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
[Install]
WantedBy=multi-user.target

  • If you are using a firewall, add the following rule to /etc/sysconfig/iptables and restart the iptables service (see the example after this list):

-A INPUT -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT
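After saving the unit file and the firewall rule, the Prometheus service can be started and the firewall reloaded. A minimal sketch, assuming the unit file was saved as /etc/systemd/system/prometheus.service and that the firewall is managed by the iptables service from the iptables-services package:

# Pick up the new unit file, then enable and start Prometheus
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus

# Reload the firewall rules from /etc/sysconfig/iptables
systemctl restart iptables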

Viewing a Metric

After entering an expression for one of the HTTP request metrics at yourserver.com:9090, you should see the data for HTTP requests served by the Prometheus server itself. If you click on the Graph tab, you can see the data as a graph.
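The same data is also available through the Prometheus HTTP API, which is convenient for checking that the server works without opening a browser. A small sketch using curl and the built-in up metric (any other metric name works as well):

# Query the current value of the 'up' metric from the Prometheus HTTP API
curl -s 'http://localhost:9090/api/v1/query?query=up'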

Installing Node Exporter on the Server to be Monitored

  • Download the node_exporter archive (the link is available on the Prometheus download page) and extract it

tar xvfz node_exporter-*.tar.gz

  • Create a link (replace 0.15.2 with the version you have downloaded)

ln -s node_exporter-0.15.2.linux-amd64 node_exporter

  • Define a service for node_exporter in /etc/systemd/system/node_exporter.service

(or if you are using init.d, please see this article).


[Unit]
Description=Node Exporter

[Service]
User=root
ExecStart=/opt/node_exporter/node_exporter --no-collector.diskstats

[Install]
WantedBy=default.target

(The --no-collector.diskstats argument is added above since diskstats often does not work in virtualized environments. If that is not an issue for you, feel free to leave this argument out.)

  • Enable and start service

systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter

  • Tell Prometheus to scrape these metrics by adding the following under scrape_configs in /opt/prometheus/prometheus.yml, then reload Prometheus (see below)

  - job_name: "node"
    static_configs:
      - targets: ['localhost:9100']
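Prometheus only picks up the new scrape target after its configuration has been reloaded. A minimal sketch, assuming the systemd unit from the installation step is named prometheus.service:

# Either restart the service ...
systemctl restart prometheus

# ... or send SIGHUP to reload the configuration without a restart
kill -HUP $(pgrep prometheus)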

Now if you go to yourserver.com:9090/graph you can, for instance, enter the expression node_memory_MemFree and see the free memory available on the server.

You can also install node_exporter on another server. Simply point the job definition to that server's address, and of course remember to open port 9100 on that server.
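If the monitored server uses the same iptables setup as described above, the corresponding rule in /etc/sysconfig/iptables would be:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT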

Installing Grafana

The default Prometheus interface is quite basic. Thankfully Grafana offers excellent integration with Prometheus and will result in a much nicer UI.

You can easily install Grafana on your own server or use a free cloud-based instance (limited to one user and five dashboards).

To install Grafana locally:

  • First follow these instructions.
  • Grafana by default runs on port 3000, so make sure you add the following firewall rule after you install it:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 3000 -j ACCEPT

  • In the file /etc/grafana/grafana.ini, provide details for an SMTP connection which can be used for sending emails (section [smtp], see the example after this list).
  • Also update the domain field in the [server] section to the host name at which your server can be reached on the internet.
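A minimal sketch of the relevant grafana.ini settings; the SMTP host, credentials and addresses below are placeholders and need to be replaced with your own values:

[smtp]
enabled = true
host = smtp.example.com:587
user = grafana@example.com
password = your-smtp-password
from_address = grafana@example.com

[server]
domain = yourserver.com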

Configuring Grafana

  • Go to yourserver.com:3000
  • The default login is username ‘admin’ and password ‘admin’. Create a new user with a good password and delete the admin user.
  • First add your Prometheus instance as a data source.
  • Then go to Dashboards and select Import to import a node_exporter dashboard such as Node Exporter Server Metrics (used in the alerting section below)


Done! You should now be able to see the metrics for your server such as CPU usage or free memory.


If you monitor multiple servers, you can switch between them by clicking next to the text ‘node’.


Additional servers will appear here if you add them to the Prometheus configuration:

  - job_name: "node"
    static_configs:
      - targets: ['localhost:9100', 'xxxxx']

Configure Alerting

While Prometheus has some built-in alerting facilities, alerting in Grafana is much easier to use. To set up alerting for the dashboard you have created:

  • Go to Alerting / Notification Channels
  • Click on New Channel
  • Provide a name for the channel and your email address and click Save.


  • Next go to the dashboard you have created: Node Exporter Server Metrics
  • Click on the first panel to select it


  • Next click on Edit in the menu which is shown above the panel
  • Go to the Alert tab and click on Create Alert


  • Configure the alert condition, for example that the CPU usage is above a certain threshold (for more details about this, please see this page)


  • Select the Notification page on the right
  • In Send to, select the notification channel you have created earlier.
  • Provide a message such as: CPU usage high.
  • Save the dashboard (Ctrl+S)

Done! You should now receive notifications if the CPU usage on any of the servers monitored on this dashboard is too high.
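To verify that the alert actually fires, you can briefly generate CPU load on one of the monitored servers and wait for the notification email. A small sketch; remember to stop the load again afterwards:

# Spin up one busy loop per CPU core
for i in $(seq 1 "$(nproc)"); do yes > /dev/null & done

# ... once the alert email has arrived, stop the load again
killall yes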

Further Reading