Lambda Go Starter Project

Serverless development allows deploying low-cost, low-maintenance applications. Go is an ever-more popular language for developing backend applications. With first rate support both in AWS Lambda and GCP Cloud Functions, Go is an excellent choice for serverless development.

Unfortunately, setting up a flexible infrastructure and efficient deployment pipelines for Serverless applications is not always easy. Especially if we are interested in not just deploying a function but exposing it through HTTP for other services to use. This post describes the key elements of a little starter project for deploying a Go Lambda that exposes a simple, extensible REST API.

This project uses the following technologies:

  • Go 1.16
  • AWS Lambda
  • AWS API Gateway
  • Terraform for defining all infrastructure
  • Goldstack Template Framework for deployment orchestration (note this is based on Node.js/Yarn)

The source code for the project is available on GitHub: go-lambda-starter-project

The live API is deployed here: API root

To quickly create a customised starter project like this one: Go Gin Lambda Template on Goldstack

Go Project Setup

To setup a new Go project is very straightforward and can be achieved using the go mod init command. This results in the creation of a go.mod file. This file helps manage all dependencies for the project. For our project, we have also added a number of dependencies that are viewable in this file:

  • aws-lambda-go: For a low-level API for interacting with AWS Lambda
  • gin: Gin is used as the framework for building our HTTP server
  • aws-lambda-go-api-proxy: For linking AWS lambda with our HTTP framework Gin
  • gin-contrib/cors: For providing our Gin server with CORS configuration

This is the resulting go.mod file for the project:

module goldstack.party/templates/lambda-go-gin
go 1.16
require (
github.com/aws/aws-lambda-go v1.23.0
github.com/awslabs/aws-lambda-go-api-proxy v0.9.0
github.com/gin-contrib/cors v1.3.1
github.com/gin-gonic/gin v1.6.3
)
view raw go.mod hosted with ❤ by GitHub

Server Implementation

The HTTP server deployed through the Lambda is defined in a couple of Go files.

main.go

main.go is the file that is run when the Lambda is invoked. It also supports being invoked locally for easy testing of the Lambda. When run locally, it will start a server equivalent to the Lambda on localhost.

package main
import (
"os"
"github.com/aws/aws-lambda-go/lambda"
)
func main() {
// when no 'PORT' environment variable defined, process lambda
if os.Getenv("PORT") == "" {
lambda.Start(Handler)
return
}
// otherwise start a local server
StartLocal()
}
view raw main.go hosted with ❤ by GitHub

server.go

The server.go file is where the HTTP server and its routes are defined. In this example, it just provides one endpoint /status that will return a hard-coded JSON. This file also configures the CORS config, in case we want to call the API from a frontend application hosted on a different domain.

package main
import (
"os"
"github.com/gin-contrib/cors"
"github.com/gin-gonic/gin"
)
func CreateServer() *gin.Engine {
r := gin.Default()
corsEnv := os.Getenv("CORS")
if corsEnv != "" {
config := cors.DefaultConfig()
config.AllowOrigins = []string{corsEnv}
r.Use(cors.New(config))
}
r.GET("/status", func(c *gin.Context) {
c.JSON(200, gin.H{
"status": "ok",
})
})
return r
}
view raw server.go hosted with ❤ by GitHub

Infrastructure

The infrastructure for this starter project is defined using Terraform. There are a couple of things we need to configure to get this project running, these are:

  • Route 53 Mappings for the domain we want to deploy the API to as well as a SSL certificate for being able to call the API via HTTPS: domain.tf
  • An API Gateway for exposing our Lambda through a public endpoint: api_gateway.tf
  • The definition of the Lambda function that we will deploy our code into: lambda.tf

The details of the infrastructure are configured in a config file: goldstack.json.

{
"$schema": "./schemas/package.schema.json",
"name": "lambda-go-gin",
"template": "lambda-go-gin",
"templateVersion": "0.1.1",
"configuration": {},
"deployments": [
{
"name": "dev",
"configuration": {
"lambdaName": "go-gin-starter-project",
"apiDomain": "go-gin-starter-project.dev.goldstack.party",
"hostedZoneDomain": "dev.goldstack.party"
},
"awsUser": "awsUser",
"awsRegion": "us-west-2"
}
]
}
view raw goldstack.json hosted with ❤ by GitHub

The configuration options are documented in the Goldstack documentation as well as in the project readme. Note that these configuration options can be created using the Goldstack project builder or manually in the JSON file.

The infrastructure can easily be stood up by using a Yarn script:

yarn
cd packages/lambda-go-gin
yarn infra up dev

dev denotes here for which environment we want to stand up the infrastructure. Currently the project only contains one environment, dev, but it is easy to define (and stand up) other ones by defining them in the goldstack.json file quoted above.

Note it may seem unconventional here to use a Yarn script, and ultimately an npm module, to deploy the infrastructure for our Go lambda. However, using a scripting language to support the build and deployment of a compiled language is nothing unusual, and using Yarn allows us to use one language/framework for managing more complex projects that also involve frontend modules defined in React.

Deployment

Deployment like standing up the infrastructure can easily be achieved with a Yarn script referencing the environment we want to deploy to:

yarn deploy dev

This will build our Go project, package and zip it as well as upload it to AWS. The credentials for the upload need to be defined in a file config/infra/aws/config.json, with contents such as the following:

{
"users": [
{
"name": "awsUser",
"type": "apiKey",
"config": {
"awsAccessKeyId": "[Access Key Id]",
"awsSecretAccessKey": "[Secret Access Key]",
"awsDefaultRegion": "us-west-2"
}
}
]
}
view raw config.json hosted with ❤ by GitHub

A guide how to obtain these credentials is available on the Goldstack documentation. It is also possible to provide these credentials in environment variables, which can be useful for CI/CD.

Development

To adapt this starter project for your requirements, you will need to do the following:

  • Provide a config file with AWS credentials (or environment variables with the same)
  • Update packages/lambda-go-gin/goldstack.json with the infrastructure definitions for your project
  • Initialise the Yarn project with yarn
  • Deploy infrastructure using cd packages/lambda-go-gin; yarn infra up dev
  • Deploy the lambda using cd packages/lambda-go-gin; yarn deploy dev
  • Start developing your server code in packages/lambda-go-gin/server.go

This project also contains a script for local development. Simply run it with

cd packages/lambda-go-gin
yarn watch

This will spin up a local Gin server on http://localhost:8084 that will provide the same API that is exposed via API gateway and makes for easy local testing of the Lambda. Also deployment of the Lambda should only take a few seconds and can be triggered anytime using yarn deploy dev.

If you want to skip the steps listed above to configure the project, you can generate a customised project with the Goldstack project builder.

This template is just a very basic Go project to be deployed to Lambda and actually my first foray into Go development. I haven’t based any larger projects on this yet to test it out, so any feedback to improve the template is welcome. I will also be keep on updating the template on Goldstack with any future learnings.

Deploy Java Lambda using SAM and Buildkite

I’ve recently covered how to deploy a Node JS based Lambda using SAM and Buildkite. I would say that this should cover most use cases, since I believe a majority of AWS Lambdas are implemented with JavaScript.

However, Lambda supports many more programming languages than just JavaScript and one of the more important ones among them is certainly Java. In principle, deployment for Java and JavaScript is very similar: we provide a SAM template and an archive of a packaged application. However, Java uses a different toolset than JavaScript, so the build process of the app will be different.

In the following, I will briefly explain how to build and deploy a Java Lambda using Maven, SAM and Buildkite. If you want to get to the code straight away, find a complete sample project here:

https://github.com/mxro/lambda-java-sam-buildkite

First we define a simple Java based Lambda:

package com.amazonaws.handler;

import com.amazonaws.services.lambda.runtime.Context; 
import com.amazonaws.services.lambda.runtime.LambdaLogger;

public class SimpleHandler {
    public String myHandler(int myCount, Context context) {
        LambdaLogger logger = context.getLogger();
        logger.log("received : " + myCount);
        return String.valueOf(myCount);
    }
}

Then add a pom.xml to define the build and dependencies of our Java application:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.amazonaws</groupId>
    <artifactId>lambda-java-sam-buildkite</artifactId>
    <version>1.0.0</version>
    <packaging>jar</packaging>
    <name>Sample project for deploying a Java AWS Lambda function using SAM and Buildkite.</name>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <maven.compiler.plugin.version>3.8.0</maven.compiler.plugin.version>
        <aws.lambda.java.core.version>1.1.0</aws.lambda.java.core.version>
        <junit.version>4.12</junit.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-core</artifactId>
            <version>${aws.lambda.java.core.version}</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit.version}</version>
            <scope>test</scope>
        </dependency>    
    </dependencies>

    
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven.compiler.plugin.version}</version>
                <configuration>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
    
</project>

We define the lambda using a SAM template. Note that we are referencing the JAR that is assembled using Maven target/lambda-java-sam-buildkite-1.0.0.jar.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
    sam-app

Globals:
    Function:
        Timeout: 20
        Environment: 

Resources:
  SimpleFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: target/lambda-java-sam-buildkite-1.0.0.jar
      Handler: com.amazonaws.handler.SimpleHandler::myHandler
      Runtime: java8

Then we need Dockerfile that will run our build. Here we simply start with an image that has Maven preinstalled and then install Python and the AWS SAM CLI.

FROM zenika/alpine-maven:3-jdk8

# Installing python
RUN apk add --update \
    python \
    python-dev \
    py-pip \
    build-base \
  && pip install virtualenv \
  && rm -rf /var/cache/apk/*

RUN python --version

# Installing AWS CLI and SAM CLI
RUN apk update && \
    apk upgrade && \
    apk add bash && \
    apk add --no-cache --virtual build-deps build-base gcc && \
    pip install awscli && \
    pip install aws-sam-cli && \
    apk del build-deps

RUN mkdir /app
WORKDIR /app
EXPOSE 3001

The following build script will run within this Dockerfile and first package the Java application into a Jar file using mvn package and then uses the SAM CLI to deploy the template and application.

#!/bin/bash -e

mvn package

echo "### SAM Deploy"

sam --version

sam package --template-file template.yaml --s3-bucket sam-buildkite-deployment-test --output-template-file packaged.yml

sam deploy --template-file ./packaged.yml --stack-name sam-buildkite-deployment-test --capabilities CAPABILITY_IAM

Finally we define the Buildkite template. Note that this template assumes the environment variables AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are provided by Buildkite.

steps:
  - name: Build and deploy to AWS
    command:
      - './.buildkite/deploy.sh'
    plugins:
      - docker-compose#v2.1.0:
          run: app
          config: 'docker-compose.yml'
          env:
            - AWS_DEFAULT_REGION
            - AWS_ACCESS_KEY_ID
            - AWS_SECRET_ACCESS_KEY

Now we simply need to create a Buildkite pipeline and link it to a repository with our source code.

Deploy Lambda using SAM and Buildkite

One of the many good things about Lambdas on AWS is that they are quite easy to deploy. Simply speaking, all that one requires is a zip file of an application that then can be uploaded using an API call.

Things unfortunately quickly become more complicated, especially if the Lambda depends on other resources on AWS, as they often do. Thankfully there is a solution for this in the form of the AWS Serverless Application Model (SAM). AWS SAM enables to specify lambdas along with their resources and dependencies in a simple and coherent way.

AWS being AWS, there are plenty of examples of deploying Lambdas defined using SAM using AWS tooling, such as CodePipeline and CodeBuild. In this article, I will show that it is just as easy deploying Lambdas using Buildkite.

For those wanting to skip straight to the code, here the link to the GitHub repo with an example project:

lambda-sam-buildkite

This example uses the Buildkite Docker Compose Plugin that leverages a Dockerfile, which provides the AWS SAM CLI:

FROM python:alpine
# Install awscli and aws-sam-cli
RUN apk update && \
    apk upgrade && \
    apk add bash && \
    apk add --no-cache --virtual build-deps build-base gcc && \
    pip install awscli && \
    pip install aws-sam-cli && \
    apk del build-deps
RUN mkdir /app
WORKDIR /app

The Buildkite pipeline assures the correct environment variables are passed to the Docker container so that the AWS CLI can be authenticated with AWS:

steps:
  - label: SAM deploy
    command: ".buildkite/deploy.sh"
    plugins:
      - docker-compose#v2.1.0:
          run: app
          env:
            - AWS_DEFAULT_REGION
            - AWS_ACCESS_KEY_ID
            - AWS_SECRET_ACCESS_KE

The script that is called in the pipeline simply calls the AWS SAM CLI to package the CloudFormation template and then deploys it:

#!/bin/bash -e

# Create packaged template and upload to S3
sam package --template-file template.yml \ 
            --s3-bucket sam-buildkite-deployment-test \
            --output-template-file packaged.yml

# Apply CloudFormation template
sam deploy --template-file ./packaged.yml \
           --stack-name sam-buildkite-deployment-test \
           --capabilities CAPABILITY_IAM

And that’s it already. This pipeline can easily be extended to deploy to different environments such as development, staging and production and to run unit and integration tests.

Setting up Continuous Deployment with Lerna and Buildkite

Buildkite is a great tool for running multi-step build and deployment pipelines. Lerna is a tool for managing multiple JavaScript packages within one git repository.

Given that both Lerna and Buildkite are quite popular, it is surprising how difficult it is to set up a basic build and deployment with these tools.

It is very easy to configure a deployment pipeline in Buildkite that will run every time a new commit has been made to a git branch. However, using Lerna, we want to build only those packages in a repository that have actually changed, rather than building all packages in the repository.

Lerna provides some build in tooling for this, chiefly the ls command which will provide us a list of all the packages defined in the monorepo. Using the –since filter with this command, we can easily determine all packages that have changed since the last commit as follows:

lerna ls --json --since HEAD^

Where ^HEAD is the git reference to the commit proceeding HEAD. The --json flag provides us with an output that is a bit easier to parse, for instance using jq.

However, in a CD environment, we usually build not only when a new merge to master has occurred but also when changes to a branch have been submitted. In that instance, we are not interested in the changes which occurred since the last commit but all the changes that have been made in the branch in comparison to the current master branch.

In this instance, the lerna ls command with a --since filter can help us when comparing the current branch with master.

lerna ls --json --since refs/heads/master

Internally, Lerna would run a command such as the following:

git --no-pager diff --name-only refs/heads/master..refs/remotes/origin/$BUILDKITE_BRANCH

This diff goes both ways, so when a file is changed in a package in master only or the branch only, it will cause the package to be listed among the packages to be build. We are only interested in those packages however that are changed in the branch. This can be somewhat assured by running a git pull before the lerna ls command:

git pull --no-edit origin master
lerna ls --json --since refs/heads/master

Unfortunately Buildkite by default does something of a ‘lazy clone’ of the repository. It will only ensure that the branch that is currently being built is checked out with the latest commit and other branches, including master, might be cached from previous builds and are on an old commit. This will prevent the above approach for building branches from working. Thankfully there is an environment variable in Buildkite we can use to force it to get the latest commit for all branches: BUILDKITE_CLEAN_CHECKOUT=true.

Having this list of packages that have changed, we can then trigger pipelines specific to building the changed packages. This can be accomplished using a trigger step.

- trigger: "[name of pipeline for package x]"
  label: ":rocket: Trigger: Build for package x"
  async: false
  build:
    message: "${BUILDKITE_MESSAGE}"
    commit: "${BUILDKITE_COMMIT}"
    branch: "${BUILDKITE_BRANCH}"

Another way to go about deploying with Lerna might be using lerna publish where we could push npm packages to an npm registry and then trigger builds from there. I haven’t tested this way and I think this would require a private npm registry, which the way outlined in this article would not.

If anyone has a more elegant way to go about orchestrating the builds of packages in a Lerna repository, please let everyone know in the comments.

Configuring an initd Service for node_exporter

I recently wrote an article showing how to configure Prometheus and Grafana for easy metrics collection. In that article, I assumed that the system which should be monitored would use the systemd approach for defining services.

I now had to set up the node_exporter utility on a system which uses the initd approach. Thus, I provide some simple instructions here on how to accomplish that.


wget https://github.com/prometheus/node_exporter/releases/download/v0.15.2/node_exporter-0.15.2.linux-amd64.tar.gz

  • Extract the archive

tar xvfz node_exporter-*.tar.gz

  • Create a link

ln -s node_exporter-* node_exporter

  • Create the file /opt/node_exporter/node_exporter.sh and add the following content:

#!/bin/sh

/opt/node_exporter/node_exporter --no-collector.diskstats


#!/bin/sh
### BEGIN INIT INFO
# Provides: node_exporter
# Required-Start: $local_fs $network $named $time $syslog
# Required-Stop: $local_fs $network $named $time $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Description:
### END INIT INFO

SCRIPT=/opt/node_exporter/node_exporter.sh
RUNAS=root

PIDFILE=/var/run/node_exporter.pid
LOGFILE=/var/log/node_exporter.log

start() {
if [ -f "$PIDFILE" ] && kill -0 $(cat "$PIDFILE"); then
echo 'Service already running' >&2
return 1
fi
echo 'Starting service…' >&2
local CMD="$SCRIPT &> \"$LOGFILE\" && echo \$! > $PIDFILE"
su -c "$CMD" $RUNAS > "$LOGFILE"
echo 'Service started' >&2
}

stop() {
if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE"); then
echo 'Service not running' >&2
return 1
fi
echo 'Stopping service' >&2
kill -15 $(cat "$PIDFILE") && rm -f "$PIDFILE"
echo 'Service stopped' >&2
}

uninstall() {
echo -n "Are you really sure you want to uninstall this service? That cannot be undone. [yes|No] "
local SURE
read SURE
if [ "$SURE" = "yes" ]; then
stop
rm -f "$PIDFILE"
echo "Notice: log file is not be removed: '$LOGFILE'" >&2
update-rc.d -f  remove
rm -fv "$0"
fi
}

case "$1" in
start)
start
;;
stop)
stop
;;
uninstall)
uninstall
;;
retart)
stop
start
;;
*)
echo "Usage: $0 {start|stop|restart|uninstall}"
esac

Note 1: This sample script runs the script as user root. For production environments, it is highly recommended to configure another user (such as ‘prometheus’) which runs the script.

Note 2: Also check out this init.d script made specifically for node_exporter: node.exporter.default by eloo.

  • Make both files executable

chmod +x /etc/init.d/node_exporter

chmod +x <em>/opt/node_exporter/node_exporter.sh</em>

  • Test the script

/etc/init.d/node_exporter start

/etc/init.d/node_exporter stop

  • Enable start with chkconfig

chkconfig --add node_exporter

All done! Now you can configure your Prometheus server to grab the metrics from the node_exporter instance.

Easy VPS Backup

I love VPS providers such as RamNode or ServerCheap which provide excellent performance at a low price point. Unfortunately, when going with most VPS providers, there are no easy built-in facilities for backing up and restoring the data of your servers (such as with AWS EC2 snapshots). Thankfully, there is some powerful, easy to use and open source software available to take care of the backups for us!

In this article, I am going to show how to easily do a backup of your VPS using restic. Another tool you might want to look at is Duplicity, which provides a higher level of security but which is also more difficult to use. (And there are a many, many other alternatives available as well.)

You will need to have access to two servers to follow the following. One server which should be backed up (in the following referred to as Backup Client) and one server which will host your backups (in the following referred to as Backup Server).

Installing Restic (on Backup Client)

  • Get the URL to the binary for you system from the latest restic release.
  • Log into the Backup Client
  • Download the binary using wget

wget https://github.com/restic/restic/releases/download/v0.8.1/restic_0.8.1_linux_amd64.bz2

  • Unzip the binary

bzip2 -dk restic_0.8.1_linux_amd64.bz2

  • Move restic to /opt

sudo mv restic_0.8.1_linux_amd64 /opt/restic

  • Make restic executable

chmod +x /opt/restic

Establishing SSH Connection

  • On the Backup Client generate an SSH private and public key (Confirm location `/root/.ssh/id_rsa` and provide no passphrase)
sudo su - root
ssh-keygen -t rsa -b 4096
  • Get the public key

cat /root/.ssh/id_rsa.pub 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDG3en ...

  • On the Backup Server, create a new user called backup
  • Copy the public key from the Backup Client to the Backup Server so that Backup Client is authorised to access it via SSH. Just copy the output from above and paste it at the end of the authorized_keys file

sudo vi /home/backup/.ssh/authorized_keys

  • On the Backup Client, test the connection to the Backup Server.

sudo ssh backup@...

Perform Backup (on Backup Client)


/opt/restic -r sftp:backup@[backup-server]:/home/backup/[backup client host name] init

  • Backup the full hard disk (this may take a while!)

/opt/restic --exclude={/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp} -r sftp:backup@[backup-server]:/home/backup/[backup client host name] backup /

 

Schedule Regular Backups (Backup Client)

  • On the Backup Client, create the file /root/restic_password. Paste your password into this file.
  • Create the script file /root/restic.sh (replace with the details of your servers)

#/bin/bash

/opt/restic -r sftp:backup@[backup-server]:/home/backup/[backup client host name] --password-file=/root/restic_password --exclude={/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp} backup /
/opt/restic -r sftp:backup@[backup-server]:/home/backup/[backup client host name] --password-file=/root/restic_password forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 75
/opt/restic -r sftp:backup@[backup-server]:/home/backup/[backup client host name] --password-file=/root/restic_password prune
/opt/restic -r sftp:backup@[backup-server]:/home/backup/[backup client host name] --password-file=/root/restic_password check

  • Make script executable

chmod +x /root/restic.sh

  • Trail run this script: /root/restic.sh
  • If everything worked fine, schedule to run this script daily (e.g. with sudo crontab -e) or at whichever schedule you prefer (Note that the script might take 10 min or more to execute, so it is probably not advisable to run this very frequently. If you need more frequent updates, just run the first line of the script ‘backup’ which is faster than the following maintenance operations).

0 22 * * * /root/restic.sh

 

That’s it! All important files from your server will now be backed up regularly.

Setting up Prometheus and Grafana for CentOS / RHEL 7 Monitoring

As mentioned in my previous post, I have long been looking for a centralised solution for collecting logs and monitoring metrics. I think my search was unsuccessful since I was looking for too many things in one solution. Instead I found now two separate solutions, one for log management (using Graylog) and one for metrics (using Prometheus and Grafana). I deployed both of these on very inexpensive VPS machines and so far I am very happy with them.

In this post, I provide some pointers how to set up the metrics solution based on Prometheus and Grafana. I assume you are using a RHEL / CentOS system as the server hosing Prometheus and Grafana and you are interested in the OS metrics for a CentOS system. This tutorial will guide you through setting up the Prometheus server, collecting metrics for it using node_exporter and finally how to create dashboards and alerts using Grafana for the collected metrics.

Installing Prometheus Server

  • Follow the excellent instructions here with the following modifications.
    • Make sure to download the latest version of Prometheus (the link can be obtained on the Prometheus download page, this guide works with version 2.1.0)
    • For the systemd service, use the following file:

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=root
Restart=on-failure
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
[Install]
WantedBy=multi-user.target

Note: This configuration will run the Prometheus server as root. In a production environment, it is highly recommended to run it as another user (e.g. ‘prometheus’)

  • If you are using a firewall, add the following rule to /etc/sysconfig/iptables and restart service iptables:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT

Viewing a Metric

You should the data of http requests served by the Prometheus server itself. If you click on the tab graph, you can see the data as a graph.

Installing Node Exporter on the Server to be Monitored

tar xvfz node_exporter-*.tar.gz
  • Create link (replace 0.15.2 with the version you have downloaded)

ln -s node_exporter-0.15.2.linux-amd64 node_exporter

  • Define a service for node_exporter in /etc/systemd/system/node_exporter.service

(or if you are using init.d, please see this article).


[Unit]
Description=Node Exporter

[Service]
User=root
ExecStart=/opt/node_exporter/node_exporter --no-collector.diskstats

[Install]
WantedBy=default.target

Note: This configuration will run the Grafana server as root. In a production environment, it is highly recommended to run it as another user (e.g. ‘prometheus’)

(The –no-collector.diskstats is added above since diskstats often does not work in virtualized environments. If that is not an issue for you, be free to leave this argument out.)

  • Enable and start service

systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter

  • Tell Prometheus to scrape these metrics by adding the following to /opt/prometheus/prometheus.yml
 - job_name: "node"
static_configs:
- targets: ['localhost:9100']

Now if you got to yourserver.com:9090/graph you can for instance enter the expression node_memory_MemFree and see the free memory available on the server.

You can also install node_exporter on another server. Simply point the job definition then to this servers address; and of course remember to open port 9100 on the server.

Installing Grafana

The default Prometheus interface is quite basic. Thankfully Grafana offers excellent integration with Prometheus and will result in a much nicer UI.

You can easily install Grafana on your own server or use a free cloud-based instance (limited to one user and five dashboards).

To install Grafana locally:

  • First follow these instructions.
  • Graphana by default runs on port 3000, so make sure you add the following firewall rule after you install it:

-A INPUT -p tcp -m state --state NEW -m tcp --dport 3000 -j ACCEPT

  • In the file /etc/grafana/grafana.ini, provide details for an SMTP connection which can be used for sending emails (section [smtp]).
  • Also update the host name in the field domain to the address at which your server can be reached on the internet.

Configuring Grafana

  • Go to yourserver.com:3000
  • The default login is username ‘admin’ and password ‘admin’. Create a new user with a good password and delete the admin user.
  • First connect with your Prometheus instance as a data source.
  • Then go to Dashboards and select import

dashboards

Done! You should now be able to see the metrics for your server such as CPU usage or free memory.

dashboard

If you monitor multiple servers, you can switch between them by clicking next to the text ‘node’.

switch

Additional servers will appear here if you add them to the Prometheus configuration:

- job_name: "node"
 static_configs:
 - targets: ['localhost:9100', 'xxxxx']

Configure Alerting

While Prometheus has some build in alerting facilities, alerting in Grafana is much easier to use. To set up altering for the dashboard you have created:

  • Go to Alerting / Notification Channels
  • Click on New Channel
  • Provide a name for the channel and your email address and click Save.

channel

  • Next go to the dashboard you have created: Node Exporter Server Metrics
  • Click on the first panel to select it

panel

  • Next click on Edit in the menu which is shown above the panel
  • Go to the Alert tab and click on Create Alert

create alert

  • Configure the following condition(for more details about this, please see this page):

condition

  • Select the Notification page on the right
  • In Send to, select the notification channel you have created earlier.
  • Provide a message such as: CPU usage high.
  • Save the dashboard (Ctrl+S)

Done! You should now receive notifications if the CPU usage on any of the servers monitored on this dashboard is too high.

Further Reading

Free Cloud-based Log and Metrics Management Solutions

I have been looking around for a while for a cloud-based service which allows collecting logs and metrics and analysing them. I am particularly interested in a solution which can be deployed for free for smaller applications/amounts of data.

Here are some of the solutions I came across:

Loggly

loggly

  • Store and analyse logs and metrics
  • Free plan available: For 200MB/day, 7 day data retention
  • It looks to me like the Free plan does not allow using the Loggly API!

Splunk

splunk

  • One of the first and most popular solutions in the space
  • Store and analyse logs and metrics
  • Free version available; allows storing up to 500MB/day; I don’t think there is any limitation on data retention
  • Note that free version requires to install the Splunk server on your own server

Sematext

sematext

  • Based on popular open source ELK stack
  • Free plan allows monitoring up to 5 hosts, but only comes with 30 min data retention

Logz.io

logz

  • Also based on ELK stack
  • Free plan allows 3GB upload per day; data retention limited to 3 days.

Overall, I am not too happy with these offerings. In particular, the short data retention periods seem to make some of these offerings too limited to be useful.

Maybe the best option here would be to install your own ELK stack or Graylog. Here are some guides for that: