Setting up Prometheus and Grafana for CentOS / RHEL 7 Monitoring

As mentioned in my previous post, I have long been looking for a centralised solution for collecting logs and monitoring metrics. I think my search was unsuccessful because I was looking for too many things in one solution. Instead, I have now found two separate solutions: one for log management (using Graylog) and one for metrics (using Prometheus and Grafana). I deployed both of these on very inexpensive VPS machines and so far I am very happy with them.

In this post, I provide some pointers on how to set up the metrics solution based on Prometheus and Grafana. I assume you are using a RHEL / CentOS system as the server hosting Prometheus and Grafana and that you are interested in the OS metrics of a CentOS system. This tutorial will guide you through setting up the Prometheus server, collecting metrics for it using node_exporter, and finally creating dashboards and alerts in Grafana for the collected metrics.

Installing Prometheus Server

  • Follow the excellent instructions here with the following modifications.
    • Make sure to download the latest version of Prometheus (the link can be obtained on the Prometheus download page; this guide works with version 2.1.0)
    • For the systemd service, use the following file:

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=root
Restart=on-failure
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
[Install]
WantedBy=multi-user.target

Note: This configuration will run the Prometheus server as root. In a production environment, it is highly recommended to run it as another user (e.g. ‘prometheus’)

  • If you are using a firewall, add the following rule to /etc/sysconfig/iptables and restart service iptables:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT
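Following up on the note about not running Prometheus as root, here is a minimal sketch of how the unit file could be switched to a dedicated account. The helper name set_unit_user and the unit file path in the comments are assumptions for illustration; the account name ‘prometheus’ is the one suggested in the note. The commands that need root are shown as comments only.

```shell
#!/bin/sh
# Sketch: switch a systemd unit from root to a dedicated account.
# set_unit_user is a hypothetical helper, not part of Prometheus or systemd.

# Rewrite the User= line of the unit file given as $1 to the user given as $2.
set_unit_user() {
    unit_file=$1
    run_as=$2
    tmp_file=$(mktemp) || return 1
    # Write the edited copy to a temp file first, then copy it back so the
    # original file keeps its permissions.
    sed "s/^User=.*/User=${run_as}/" "$unit_file" > "$tmp_file" && cat "$tmp_file" > "$unit_file"
    rm -f "$tmp_file"
}

# Typical use on the server, run as root (the unit file path is an assumption):
#   useradd --system --no-create-home --shell /sbin/nologin prometheus
#   chown -R prometheus:prometheus /opt/prometheus
#   set_unit_user /etc/systemd/system/prometheus.service prometheus
#   systemctl daemon-reload && systemctl restart prometheus
# Also make sure the storage directory (--storage.tsdb.path) is writable
# by the new user, otherwise Prometheus will fail to start.
```

The same helper would work for the node_exporter unit further down, since it also contains a User=root line.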

Viewing a Metric

You should now see data on the HTTP requests served by the Prometheus server itself. If you click on the Graph tab, you can see the data as a graph.

Installing Node Exporter on the Server to be Monitored

tar xvfz node_exporter-*.tar.gz
  • Create link (replace 0.15.2 with the version you have downloaded)

ln -s node_exporter-0.15.2.linux-amd64 node_exporter

  • Define a service for node_exporter in /etc/systemd/system/node_exporter.service

(or if you are using init.d, please see this article).


[Unit]
Description=Node Exporter

[Service]
User=root
ExecStart=/opt/node_exporter/node_exporter --no-collector.diskstats

[Install]
WantedBy=default.target

Note: This configuration will run node_exporter as root. In a production environment, it is highly recommended to run it as another user (e.g. ‘prometheus’).

(The --no-collector.diskstats argument is added above since diskstats often does not work in virtualized environments. If that is not an issue for you, feel free to leave this argument out.)

  • Enable and start service

systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter

  • Tell Prometheus to scrape these metrics by adding the following to the scrape_configs section of /opt/prometheus/prometheus.yml

  - job_name: "node"
    static_configs:
      - targets: ['localhost:9100']

Now if you go to yourserver.com:9090/graph, you can for instance enter the expression node_memory_MemFree and see the free memory available on the server. (From node_exporter 0.16 onwards this metric is named node_memory_MemFree_bytes.)

You can also install node_exporter on another server. Simply point the job definition to that server's address and remember to open port 9100 on that server.
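Before looking for the data in Prometheus, it can help to confirm that node_exporter itself is serving metrics. The following is a small sketch that pulls a single value out of the exporter's plain-text output. The helper name metric_value is made up for this example, and it only handles metrics without labels (labelled samples such as per-CPU counters would need a different match).

```shell
#!/bin/sh
# Sketch: pull one metric value out of node_exporter's text output.
# metric_value is a hypothetical helper; it reads the metrics text on stdin.

metric_value() {
    # Print the value of every sample whose name matches $1 exactly.
    # Comment lines (# HELP / # TYPE) never match because their first field is "#".
    awk -v name="$1" '$1 == name { print $2 }'
}

# Typical use on the monitored server (port 9100 as used in this post):
#   curl -s http://localhost:9100/metrics | metric_value node_memory_MemFree
```

If this prints a number, the exporter is working and any remaining problem is more likely in the Prometheus scrape configuration or the firewall.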

Installing Grafana

The default Prometheus interface is quite basic. Thankfully Grafana offers excellent integration with Prometheus and will result in a much nicer UI.

You can easily install Grafana on your own server or use a free cloud-based instance (limited to one user and five dashboards).

To install Grafana locally:

  • First follow these instructions.
  • Grafana by default runs on port 3000, so make sure you add the following firewall rule after you install it:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 3000 -j ACCEPT

  • In the file /etc/grafana/grafana.ini, provide details for an SMTP connection which can be used for sending emails (section [smtp]).
  • Also update the host name in the field domain to the address at which your server can be reached on the internet.

Configuring Grafana

  • Go to yourserver.com:3000
  • The default login is username ‘admin’ and password ‘admin’. Create a new user with a good password and delete the admin user.
  • First connect with your Prometheus instance as a data source.
  • Then go to Dashboards and select import

[Screenshot: dashboards]

Done! You should now be able to see the metrics for your server such as CPU usage or free memory.

[Screenshot: dashboard]

If you monitor multiple servers, you can switch between them by clicking next to the text ‘node’.

[Screenshot: switch]

Additional servers will appear here if you add them to the Prometheus configuration:

- job_name: "node"
  static_configs:
    - targets: ['localhost:9100', 'xxxxx']

Configure Alerting

While Prometheus has some built-in alerting facilities, alerting in Grafana is much easier to use. To set up alerting for the dashboard you have created:

  • Go to Alerting / Notification Channels
  • Click on New Channel
  • Provide a name for the channel and your email address and click Save.

[Screenshot: channel]

  • Next go to the dashboard you have created: Node Exporter Server Metrics
  • Click on the first panel to select it

[Screenshot: panel]

  • Next click on Edit in the menu which is shown above the panel
  • Go to the Alert tab and click on Create Alert

[Screenshot: create alert]

  • Configure the following condition (for more details about this, please see this page):

[Screenshot: condition]

  • Select the Notification page on the right
  • In Send to, select the notification channel you have created earlier.
  • Provide a message such as: CPU usage high.
  • Save the dashboard (Ctrl+S)

Done! You should now receive notifications if the CPU usage on any of the servers monitored on this dashboard is too high.


Setting Up Graylog Server

I have been looking around for an easy-to-use and reasonably priced solution for managing logs distributed among many servers and system metrics for these servers. I had a brief look into setting up an ELK system, but it looked quite cumbersome. Recently I came across Graylog and found it looked quite promising. I thus set up a little sample system.

While the documentation for Graylog is generally quite good, I found it a bit difficult to piece together the various steps in setting up a minimal working system. Thus I have documented these steps below!

Installing Graylog and Dependencies

Just follow the excellent CentOS installation instructions from the Graylog documentation.

Make sure to provide details for sending emails under the header # Email transport.

If you are using a firewall, open port 9000 for TCP and port 51400 for UDP, for instance by ensuring the following lines are in /etc/sysconfig/iptables:


-A INPUT -p tcp -m state --state NEW -m tcp --dport 9000 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m udp --dport 51400 -j ACCEPT

Don’t forget to restart the iptables service: sudo systemctl restart iptables.

Collecting the Logs from Another CentOS System

  • Install rsyslog on the system

sudo yum install rsyslog

  • Enable and start rsyslog service (also see this guide)

sudo systemctl enable rsyslog

sudo systemctl start rsyslog

  • Edit the file /etc/rsyslog.conf and put the following line at the end, in the section marked as # ### begin forwarding rule ### (replace yourserver.com with your Graylog server address).

*.* @yourserver.com:51400;RSYSLOG_SyslogProtocol23Format

  • Restart rsyslog

sudo systemctl restart rsyslog

The rsyslog log messages should now be sent to your server. Give it a few minutes if you don’t see the messages in Graylog immediately. Otherwise, check the system log for any errors (sudo cat /var/log/messages).

Also, you can test the connection by entering the following on the monitored system:


nc -u yourserver.com 51400
Hi

This should result in the message Hi being received by graylog.
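A bare Hi is not a well-formed syslog message, so depending on how the input is configured it may show up without a proper host name or severity. The following is a rough sketch of how a test line in the RFC 5424 layout could be assembled by hand. The helper names syslog_pri and syslog_line are made up for this example, and whether your Graylog input parses the line as expected is an assumption to verify on your own setup.

```shell
#!/bin/sh
# Sketch: build an RFC 5424 style syslog line by hand.
# syslog_pri and syslog_line are hypothetical helpers.

# PRI = facility * 8 + severity (facility 1 "user", severity 6 "info" gives 14).
syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

# Arguments: facility severity timestamp hostname app message
# Process id, message id and structured data are left empty ("-").
syslog_line() {
    printf '<%s>1 %s %s %s - - - %s\n' "$(syslog_pri "$1" "$2")" "$3" "$4" "$5" "$6"
}

# Typical use from the monitored system (server and port as in this post):
#   syslog_line 1 6 "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(hostname)" test "Hi" \
#       | nc -u yourserver.com 51400
```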

Analysing Logs

The next steps are quite easy to do since they can be done in the excellent Graylog user interface.

[Screenshot: critical errors]

  • Create an alert. Trigger it when there are ‘more than 0’ messages in the stream you have just created.

Done! You are now collecting logs from a server and you will receive an email notification whenever there is a serious issue reported on the server!

Git Versioning: Beyond Revisr

Some time ago, I wrote an article about how to set up versioning for WordPress using git. I have now come across an article by Newt Labs, who have created an infographic about the justification for using git for WordPress. I think this infographic is quite insightful, so I provide it below (with the kind permission of Newt Labs).

One thing which was of particular interest to me was that there was a handy list of alternatives to Revisr in the graphic. The alternatives are the following:

Personally I have only used Revisr, which worked fine for me. However, I am a bit concerned by their website greeting visitors with a warning about an expired SSL certificate. This should really have been fixed by now and is not building my trust in the tool.

[Infographic: Why Git Is Really Important For Your WordPress Site]

Determine Which JDK Version a JAR/Class File Was Compiled With

Today I came across a nasty error which occurred in a deployed Java application only but not during development or integration tests. The error went something like the following:

java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;
at nx.serializerkryo.internal.InternalKryoSerialzer.performToStream(InternalKryoSerialzer.java:33)
at nx.serializerkryo.internal.InternalKryoSerialzer.serialize(InternalKryoSerialzer.java:63)
at nx.serializerkryo.internal.InternalKryoSerialzer.serialize(InternalKryoSerialzer.java:21)
at nx.persistence.jre.internal.OptimizedPersistedNodeSerializer.serialize(OptimizedPersistedNodeSerializer.java:47)
at nx.persistence.jre.internal.OptimizedPersistedNodeSerializer.serialize(OptimizedPersistedNodeSerializer.java:21)

Now I had a feeling that this had something to do with me trying to be ahead of the curve and using a Java 9 JDK to compile the application. In order to debug this, I had to confirm which JDK the classes I was using were compiled with. Thankfully I found a handy thread on Stack Overflow.

Unfortunately, it wasn’t immediately obvious to me which solution listed there would work best, so I decided to provide the solution here in a more condensed form. Simply use the following command:

javap -v [path to your class file]

The output will then contain the following lines (towards the top of the output):

public class ...
minor version: 0
major version: 50
flags: ACC_PUBLIC, ACC_SUPER

The major and minor version indicate which version of Java the class was compiled for. The following list shows which Java version each major version corresponds to.

Java SE 9 = 53 (0x35 hex),
Java SE 8 = 52 (0x34 hex),
Java SE 7 = 51 (0x33 hex),
Java SE 6.0 = 50 (0x32 hex),
Java SE 5.0 = 49 (0x31 hex),
JDK 1.4 = 48 (0x30 hex),
JDK 1.3 = 47 (0x2F hex),
JDK 1.2 = 46 (0x2E hex),
JDK 1.1 = 45 (0x2D hex).

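Interestingly, my files were apparently compiled for Java 6 (the Maven compiler plugin was responsible). The problem was that the files were compiled with JDK 9 (though they were compiled for 1.6): in JDK 9, ByteBuffer overrides rewind() with the return type ByteBuffer, so the compiled classes reference a method signature that does not exist on older runtimes. Downgrading the JDK used for the compilation to JDK 8 fixed the problem.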
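If javap is not at hand on the machine in question, the major version can also be read straight from the class file. A class file starts with the four magic bytes CAFEBABE, followed by the minor version and the major version as two-byte big-endian numbers. The sketch below relies on od for this. The helper names class_major_version and java_release are made up for this example, and the release names only roughly follow the list above.

```shell
#!/bin/sh
# Sketch: read the class file version without javap.
# Layout: magic CAFEBABE (4 bytes), minor_version (2 bytes),
# major_version (2 bytes), all big-endian.

# Print the major version of the class file given as $1.
class_major_version() {
    # od prints the bytes at offsets 6 and 7 (zero-based) as unsigned decimals;
    # set -- then stores them as $1 (high byte) and $2 (low byte).
    set -- $(od -An -tu1 -j6 -N2 "$1")
    echo $(( $1 * 256 + $2 ))
}

# Translate a major version (45 or higher) into a release name.
java_release() {
    if [ "$1" -ge 49 ]; then
        echo "Java SE $(( $1 - 44 ))"
    else
        echo "JDK 1.$(( $1 - 44 ))"
    fi
}

# Typical use (app.jar and the class path inside it are placeholders):
#   unzip -p app.jar com/example/Foo.class > /tmp/Foo.class
#   java_release "$(class_major_version /tmp/Foo.class)"
```

As with javap, this shows the version the class was compiled for, not the JDK that did the compiling.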

A Handy Reference of Maven Parameters

I cannot count the times I have looked up the following through Google. Thus I decided to put together a few handy parameters (or properties or whatever is the correct term) for Maven builds.

All of the parameters below are shown with the goal install, but they can safely be used with any other Maven goal as well.

Skip Tests

mvn install -DskipTests

Build Only From Specified Project

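This is only relevant in a multi-module build. The -rf (--resume-from) option resumes the build from the given module and skips the modules that come before it.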

mvn install -rf :[artifactId]

Don’t Compile JavaDoc

-Dmaven.javadoc.skip=true

Don’t Compile GWT

-Dgwt.compiler.skip=true

 

Good Web Design: PayPal Developer Documentation

I find that developer documentation is often not very pleasant to look at and, more importantly, often very difficult to navigate. I worked briefly with the PayPal REST API and, while I found that at times it can be confusing to deal with the numerous APIs PayPal offers, aesthetically their developer documentation is clear and effective.

What I Like

Clear Overall Design

[Screenshot: PayPal developer documentation]

Overall the documentation looks nice and clear. I like the fonts and colours used. The multi-level menu on the left fits in well and provides good means of navigation without feeling overwhelming.

Good Instructions and Code Examples

[Screenshot: PayPal developer documentation 2]

The subheadings are easy to spot and the step-by-step instructions weave code into them quite nicely. Code examples stand out due to the different background colour.

Clear and Informative Footer

[Screenshot: PayPal developer documentation 3]

The footer for the page fits nicely into the overall design and gives access to a wide range of resources.

 

Installing Jenkins on CentOS 7

I set up a Jenkins server on a brand new CentOS 7 VPS. Below are the instructions for doing this, in case you are looking to do the same:

Setting up Jenkins Server

sudo yum install java-1.8.0-openjdk
sudo wget -O /etc/yum.repos.d/jenkins.repo http://pkg.jenkins-ci.org/redhat/jenkins.repo
sudo rpm --import https://jenkins-ci.org/redhat/jenkins-ci.org.key
sudo yum install jenkins

Or for stable version (link did not work for me when I tried it)

sudo wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
sudo yum install jenkins
  • Start Jenkins server
sudo systemctl start jenkins

You should now be able to access Jenkins at yourserver.com:8080 (if not, see troubleshooting steps at the bottom).

If you want to access your server more securely on port 80, you can do so by installing nginx as outlined in this article in step 4: How to Install Jenkins on CentOS 7.

Connecting to a Git Repo

You will probably want to connect to a git repository next. This is also somewhat dependent on the operating system you use, so I provide the steps to do this on CentOS as well:

  • Install git
sudo yum install git
  • Generate an SSH key on the server
ssh-keygen -t rsa
  • When prompted, save the SSH key under the following path (I got this idea from reading the comments here)
/var/lib/jenkins/.ssh
  • Ensure that the .ssh directory is owned by the Jenkins user:
sudo chown -R jenkins:jenkins /var/lib/jenkins/.ssh
  • Copy the public generated key to your git server (or add it in the GitHub/BitBucket web interface)
  • Ensure your git server is listed in the known_hosts file. In my case, since I am using BitBucket, my /var/lib/jenkins/.ssh/known_hosts file contains something like the following
bitbucket.org,104.192.143.3 ssh-rsa [...]
  • You can now create a new project and use Git as the SCM. You don’t need to provide any git credentials. Jenkins pulls these automatically from the /var/lib/jenkins/.ssh directory. There are good instructions for this available here.

Connecting to GitHub

  • In the Jenkins web interface, click on Credentials and then select the Jenkins Global credentials. Add a credential for GitHub which includes your GitHub username and password.
  • In the Jenkins web interface, click on Manage Jenkins and then on Configure System. Then scroll down to GitHub and then under GitHub servers click the Advanced Button. Then click the button Manage additional GitHub actions.

[Screenshot: additional actions]

  • In the popup select Convert login and password to token and follow the prompts. This will result in a new credential having been created. Save and reload the page.
  • Now go back to the GitHub servers section and click to add an additional server. As credential, select the credential which you have just created.
  • In the Jenkins web interface, click on New Item and then select GitHub organisation and connect it to your user account.

Any of your GitHub projects will be automatically added to Jenkins, if they contain a Jenkinsfile. Here is an example.

Connect with BitBucket

  • First, you will need to install the BitBucket plugin.
  • After it is installed, create a normal git project.
  • Go to the Configuration for this project and select the following option:

[Screenshot: BitBucket trigger]

  • Log into BitBucket and create a webhook in the settings for your project pointing to your Jenkins server as follows: http://yourserver.com/bitbucket-hook/ (note the slash at the end)

Testing a Java Project

Chances are high that you would like to run tests against a Java project. Below are some instructions to get that working:

Troubleshooting

  • If you cannot open the Jenkins web UI, you most likely have a problem with your firewall. Temporarily disable your firewall with `sudo systemctl stop iptables` and see if it works then.
  • If it does, check your rules in `/etc/sysconfig/iptables` and ensure that port 8080 is open
  • Check the log file at:
sudo cat /var/log/jenkins/jenkins.log

 

Continuous Integration Server Overview

Since I plan to set up a continuous integration server in the near future, I had a quick look around for open source and cloud-based solutions; my main concern was finding something which will work for a small scale project and result in reasonable costs.

Jenkins (Open Source)

The best choice if you are looking for an open source CI server. If you are familiar with Java, setting up and running Jenkins on your own is in all likelihood much cheaper than any cloud-based alternative.

Buildbot (Open Source)

Jenkins looks to be more widely used than Buildbot. However, if you have a Python project, Buildbot might be worth considering.

Travis CI (Cloud)

My top choice for open source projects. For commercial projects, however, the costs seem to be quite high starting with US$69 per month.

Circle CI (Cloud)

They offer one build container for free which seems like a very generous offer to me. I haven’t explored though how powerful this container is and how long builds would take.

AWS CodePipeline and AWS CodeDeploy (Cloud)

The best choice if you are using an AWS environment.

Codeship (Cloud)

They offer 100 builds per month for free which seems to be quite reasonable. However, since builds are triggered automatically this figure can be reached relatively quickly even with smaller projects.

 

Test Latency Between Two Servers (Linux)

Today I was looking for a simple way to test the latency and bandwidth between two Linux servers.

The easiest way, of course, is to just use ping. The ping utility should be available on almost any Linux server and is extremely easy to use. Just log in to one of your servers and then execute the following command using the IP address of your second server:

ping x.x.x.x

You can leave this running for a while and when you have seen enough data, just hit Ctrl + C to interrupt the program. This will result in an output such as the following:

PING 168.235.94.7 (168.235.94.7) 56(84) bytes of data.
64 bytes from 168.235.94.7: icmp_seq=1 ttl=64 time=0.180 ms
64 bytes from 168.235.94.7: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from 168.235.94.7: icmp_seq=3 ttl=64 time=0.148 ms
64 bytes from 168.235.94.7: icmp_seq=4 ttl=64 time=0.150 ms
^C
--- 168.235.94.7 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.148/0.157/0.180/0.013 ms

Important to note here are the latencies of the individual pings (the time= values) as well as the overall average (the avg value in the last line). This shows us that there is an average latency of 0.157 ms between the two servers tested.
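If you want to record the average over time or compare several servers, the value can be cut out of the summary line. The following is a minimal sketch, assuming the Linux ping summary format shown above; the helper name ping_avg_ms is made up for this example.

```shell
#!/bin/sh
# Sketch: extract the average round-trip time from ping output.
# ping_avg_ms is a hypothetical helper; it reads ping output on stdin.

ping_avg_ms() {
    # Summary line: rtt min/avg/max/mdev = 0.148/0.157/0.180/0.013 ms
    # Splitting on "/" puts the average into the fifth field.
    awk -F'/' '/^rtt / { print $5 }'
}

# Typical use (-c 4 stops after four pings, so no Ctrl + C is needed):
#   ping -c 4 x.x.x.x | ping_avg_ms
```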

In order to test the bandwidth and get some more information about latencies, you might also want to install the iperf tool.

Good Web Design: Squarespace

I want to start a new category of articles for this blog where I record web or user experience design that I come across and that impresses me. I will start today with the homepage of Squarespace. Since they are a web design company, good design can be expected of course. Some things which I noticed were the following:

Nice Fonts

[Screenshot: fonts]

For headings, they use an all-caps, spaced out font: Gotham with font size 22px, line height 1.6em and letter spacing of 0.2em. The body text is also Gotham with font size 14px and line height of 1.8em.

Simple but Effective Buttons

[Screenshot: buttons]

Buttons come either with a white background and black text or the other way around. The font size used is 11px.

Nice and Clean Gallery of Templates

[Screenshot: templates]

What I Didn’t Like

I think the lack of a main menu makes it quite difficult to find the things one is looking for. To find the gallery of templates, one needs to scroll all the way down to the bottom of the page and then select ‘Websites’. Also the pricing information is not easily accessible.