Alright! This is going to be one of the quickest posts ever! Why? Because what we are going to do is ridiculously simple yet powerful! We will build a 2-node Nomad distributed scheduler to run applications on. Sure, there is Kubernetes, Mesos, etc., but…can you do it in about 10 minutes and with single binaries? Ah, the elegance of HashiCorp!

What you need

  • 2 VMs and a Consul server running somewhere. I always use RHEL or Ubuntu for my VMs.
  • Consul agent
  • Nomad

Procedure

My nodes are called: nomad0.puppet.xuxo and nomad1.puppet.xuxo. We will make nomad0 our server and nomad1 our client. You can scale up and cluster as much as you want! It is very quick and simple to add.

On each of the nodes, download the Consul agent and Nomad, then place the binaries on the path:

wget https://releases.hashicorp.com/consul/0.7.2/consul_0.7.2_linux_amd64.zip
wget https://releases.hashicorp.com/nomad/0.5.2/nomad_0.5.2_linux_amd64.zip
unzip nomad_0.5.2_linux_amd64.zip
unzip consul_0.7.2_linux_amd64.zip
cp nomad /usr/bin/
cp consul /usr/bin/

Create a config file (/etc/consul/config.json) for the Consul agent on each:

{
    "advertise_addr": "192.168.0.195",
    "bind_addr": "192.168.0.195",
    "client_addr": "0.0.0.0",
    "data_dir": "/opt/consul",
    "datacenter": "xuxodrome-west",
    "node_name": "nomad0",
    "ui_dir": "/opt/consul/ui"
}

On each node, change advertise_addr and bind_addr to that node's IP, node_name to that node's name, and datacenter to your Consul datacenter. (JSON does not allow comments, so these annotations cannot live in the file itself.)
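Because a single stray comma or pasted annotation will keep the agent from booting, it can be worth syntax-checking the file before starting Consul. A throwaway sketch (it writes under /tmp purely for illustration; on a real node, point json.tool at /etc/consul/config.json and set the two placeholder variables per node):

```shell
# Stamp out a per-node config and syntax-check it before starting the
# agent -- Consul refuses to start on malformed JSON.
# NODE_IP and NODE_NAME are placeholders; set them for each node.
NODE_IP=192.168.0.195
NODE_NAME=nomad0
mkdir -p /tmp/consul-demo
cat > /tmp/consul-demo/config.json <<EOF
{
    "advertise_addr": "${NODE_IP}",
    "bind_addr": "${NODE_IP}",
    "client_addr": "0.0.0.0",
    "data_dir": "/opt/consul",
    "datacenter": "xuxodrome-west",
    "node_name": "${NODE_NAME}",
    "ui_dir": "/opt/consul/ui"
}
EOF
# json.tool exits non-zero on invalid JSON, so this only prints on success
python3 -m json.tool /tmp/consul-demo/config.json > /dev/null && echo "config OK"
```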

Create a config file (/etc/nomad.d/server.hcl) for nomad0 (our server):

#/etc/nomad.d/server.hcl

# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/opt/nomad0"

# Enable the server
server {
  enabled = true

  # Self-elect
  bootstrap_expect = 1
}
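One hedged side note: a Nomad agent's datacenter defaults to dc1 when unset, so if you want the server to report the same datacenter as the client, add it explicitly. The cluster schedules fine either way, since jobs target the clients' datacenters:

```hcl
# Optional addition to /etc/nomad.d/server.hcl
datacenter = "xuxodrome-west"
```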

Create a config file (/etc/nomad.d/client.hcl) for nomad1 (our client):

#/etc/nomad.d/client.hcl
datacenter = "xuxodrome-west"

client {
  enabled = true
}

leave_on_terminate = true
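With Consul agents running on both nodes, the Nomad client discovers the server automatically. If you ever run without Consul, the client can be pointed at the server directly instead — a sketch, assuming Nomad's default RPC port of 4647:

```hcl
# Alternative /etc/nomad.d/client.hcl without Consul-based discovery
datacenter = "xuxodrome-west"

client {
  enabled = true
  servers = ["192.168.0.195:4647"] # nomad0's IP, default RPC port
}

leave_on_terminate = true
```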

Optionally, you can create a systemd unit file to manage the Nomad service:

#/lib/systemd/system/nomad.service
[Unit]
Description=nomad

[Service]
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
KillSignal=SIGTERM
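As written, this unit can be started by hand but not enabled at boot — systemd needs an [Install] section before `systemctl enable` will work. An optional addition:

```ini
# Append to /lib/systemd/system/nomad.service
[Install]
WantedBy=multi-user.target
```

After editing, run `systemctl daemon-reload` and then `systemctl enable nomad` so the agent comes back after a reboot.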

Alright, we are ready! Let’s start everything and verify:

On server node run:

consul agent -data-dir=/opt/consul -node=nomad0.puppet.xuxo \
  -bind=192.168.0.195 -config-dir=/etc/consul &

and

systemctl start nomad (if you created the systemd unit file)

On the client node, run the same commands in the same order, but change the -node option to the client's name and -bind to the client's IP (192.168.0.196 in this setup).

Verify cluster memberships by running a check on any consul member:

root@nomad0:~# consul members
Node                Address             Status  Type    Build  Protocol  DC
consul0             192.168.0.20:8301   alive   server  0.7.2  2         xuxodrome-west
nomad0.puppet.xuxo  192.168.0.195:8301  alive   client  0.7.2  2         xuxodrome-west
nomad1.puppet.xuxo  192.168.0.196:8301  alive   client  0.7.2  2         xuxodrome-west

Verify nomad cluster by running this command on our nomad server (nomad0):

root@nomad0:~# nomad node-status
ID        DC              Name                Class   Drain  Status
e1be248c  xuxodrome-west  nomad1.puppet.xuxo  <none>  false  ready

We are done! But what fun is it if our scheduler is not running anything? None. Let’s create a mongo container job then.

On nomad0, our server, create a job file:

#/home/root/mongo.nomad

job "mongo" {
  datacenters = ["xuxodrome-west"]
  type = "service"

  update {
    stagger = "10s"
    max_parallel = 1
  }

  group "cache" {
    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "25s"
      mode = "delay"
    }

    ephemeral_disk {
      size = 300
    }

    task "mongo" {
      driver = "docker"

      config {
        image = "mongo"
        port_map {
          db = 27017
        }
      }

      resources {
        cpu = 500    # 500 MHz
        memory = 256 # 256MB

        network {
          mbits = 10
          port "db" {}
        }
      }

      service {
        name = "global-mongodb-check"
        tags = ["global", "cache"]
        port = "db"

        check {
          name = "alive"
          type = "tcp"
          interval = "10s"
          timeout = "2s"
        }
      }
    }
  }
}

Start the job:

nomad run mongo.nomad

Verify the job after a few seconds:

nomad status mongo

ID          = mongo
Name        = mongo
Type        = service
Priority    = 50
Datacenters = xuxodrome-west
Status      = running
Periodic    = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
230d2ec2  ff77cbff  e1be248c  cache       run      running  12/27/16 21:20:32 UTC
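To dig into a specific allocation, you feed its ID to `nomad alloc-status <id>`. A small convenience sketch for plucking that ID out of the status output — shown here against a captured copy of the output above, since it only needs text in a file; on a live node, redirect `nomad status mongo` into the file instead:

```shell
# Capture the Allocations section (this mirrors the output above; on a
# real node you would run: nomad status mongo > /tmp/mongo-status.txt)
cat > /tmp/mongo-status.txt <<'EOF'
Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
230d2ec2  ff77cbff  e1be248c  cache       run      running  12/27/16 21:20:32 UTC
EOF

# Skip the "Allocations" banner and the header row, then take the first
# column of the first data row: the allocation ID.
alloc_id=$(awk '/^Allocations/ {getline; getline; print $1; exit}' /tmp/mongo-status.txt)
echo "$alloc_id"   # -> 230d2ec2
```

From there, `nomad alloc-status $alloc_id` shows task events and resource usage for the mongo container.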

Check Consul: our Nomad cluster is alive and the mongo service is now registered and available.

[Screenshot: Consul UI showing the global-mongodb-check service]

Have fun!
