No description
Find a file
2025-12-01 22:31:04 +00:00
.github Dependencies: switch to weekly with cooldown 2025-11-26 11:59:12 +00:00
.images Add discord plugin 2020-10-24 01:01:43 +01:00
api Update grpc, use greboid/dockerbase 2025-06-13 14:14:43 +01:00
cmd Minor reformatting and tidying 2022-01-07 23:01:30 +00:00
config Support for groups with alert limits. 2025-07-23 17:41:09 +01:00
docs Support for groups with alert limits. 2025-07-23 17:41:09 +01:00
internal Minor reformatting and tidying 2022-01-07 23:01:30 +00:00
plugins Fix bad format string in exec plugin 2025-06-13 14:22:35 +01:00
testdata Fix checks not working at all, add tests. 2025-07-23 22:07:14 +01:00
.gitignore Initial work on an API. Closes #30 2020-10-20 03:51:15 +01:00
CHANGELOG.md Support for groups with alert limits. 2025-07-23 17:41:09 +01:00
Dockerfile Bump greboid/dockerbase/nonroot from 1.20250716.0 to 1.20250803.0 2025-08-05 02:00:06 +00:00
go.mod Bump github.com/sebdah/goldie/v2 from 2.5.5 to 2.8.0 2025-12-01 22:19:35 +00:00
go.sum Bump github.com/sebdah/goldie/v2 from 2.5.5 to 2.8.0 2025-12-01 22:19:35 +00:00
group_test.go Support for groups with alert limits. 2025-07-23 17:41:09 +01:00
grpc.go Update grpc, use greboid/dockerbase 2025-06-13 14:14:43 +01:00
LICENCE Add portscan check 2021-07-03 17:47:03 +01:00
plugins.go Allow checks to specify their own timeouts. 2021-07-03 12:59:49 +01:00
plum.go Fix checks not working at all, add tests. 2025-07-23 22:07:14 +01:00
plum_test.go Fix checks not working at all, add tests. 2025-07-23 22:07:14 +01:00
README.adoc Support for groups with alert limits. 2025-07-23 17:41:09 +01:00
tombstone.go Minor tidying and error handling improvements 2021-02-23 01:04:56 +00:00

:toc:
:toc-placement!:

image::.images/banner.png?raw=true[Goplum]

Goplum is an extensible monitoring and alerting daemon designed for
personal infrastructure and small businesses. It can monitor
websites and APIs, and send alerts by a variety of means if they go down.

toc::[]

== Features

**Highly extensible**: Goplum supports plugins written in Go
to define new monitoring rules and alert types, and it has an API
for integration with other services and tools.

**Get alerts anywhere**: Goplum supports a variety of ways to
alert you out-of-the-box:

[width="100%",cols="3",frame="none",grid="none"]
|=====
| image:.images/alerts/discord.png[Discord logo] Discord
| image:.images/alerts/mail.png[Mail icon] E-mail
| image:.images/alerts/msteams.png[Microsoft Teams icon] Microsoft Teams
| image:.images/alerts/phone.png[Phone icon] Phone call (via Twilio)
| image:.images/alerts/pushover.png[Pushover logo] Pushover
| image:.images/alerts/slack.png[Slack logo] Slack
| image:.images/alerts/sms.png[SMS icon] SMS (via Twilio)
| image:.images/alerts/webhook.png[Webhook logo] Webhook
|
|=====

**Lightweight**: Goplum has a small resource footprint, and all
checks are purpose-written in Go. No need to worry about chains
of interdependent scripts being executed.

**Heartbeat monitoring**: Have an offline service or a cron job
that you want to monitor? Have it send a heartbeat to Goplum
periodically and get alerted if it stops.

**Simple to get started**: If you're set up to run services in
containers, you can get Goplum up and running in a couple of minutes.

**Alert storm prevention**: Group related checks together and set
limits on how many alerts can be sent from a group within a time
window, preventing notification overload when multiple services fail.

== Getting started

=== Basic configuration

Goplum works by running a number of _checks_ (which test to see
if a service is working or not), and when they change state running
an _alert_ that notifies you about the problem.

Checks and alerts are both defined in Goplum's config file. A
minimal example looks like this:

[source]
----
check http.get "example.com" { <1>
  url = "https://example.com/" <2>
}

alert twilio.sms "Text Bob" { <3>
  sid = "sid"
  token = "token"
  from = "+01 867 5309"
  to = "+01 867 5309" <4>
}
----
<1> Goplum's configuration consists of "blocks". The contents
    of the blocks are placed within braces (`{}`). This is
    a "check block"; these will likely make up the bulk of your
    configuration.
    * `http.get` is the type of check we want to execute. The
      `http` part indicates it comes from the HTTP plugin, while
      the `get` part is the type of check.
    * All checks (and alerts) have a unique name, in this case
      we've called it "example.com". If a check starts to fail,
      the alert you receive will contain the check name.
<2> Parameters for the check are specified as `key = value`
    pairs within the body of the check. The documentation for
    each check and alert will explain what parameters are available,
    and whether they're required or not.
<3> Like checks, alerts have both a type and a name. Here we're
    using the `sms` alert from the `twilio` plugin, and we've
    named it `Text Bob`.
<4> The `twilio.sms` alert has a number of required parameters
    that define the account you wish to use and the phone numbers
    involved. These are all just given as `key = value` pairs.

This simple example will try to retrieve \https://example.com/
every thirty seconds. If it fails three times in a row, a text
message will be sent using Twilio. Then if it consistently starts
passing again another message will be sent saying it has recovered.
Don't worry - these numbers are all configurable: see the
<<Default Settings>> section.

In this example we used the `http.get` check and the `twilio.sms`
alert. See the <<Available checks and alerts>> section for details
of the other types available by default.

There is a complete link:docs/syntax.adoc[syntax guide] available
in the `docs` folder if you need to look up a specific aspect of
the configuration syntax.

=== Docker

The easiest way to run Goplum is using Docker. Goplum doesn't require
any privileges, settings, or ports exposed to get a basic setup
running. It just needs the configuration file, and optionally a
persistent file it can use to persist data across restarts:

Running it via the command line:

[source, shell script]
----
# Create a configuration file
vi goplum.config

# Make a 'tombstone' file that Goplum's unprivileged user can write
touch goplum.tomb
chown 65532:65532 goplum.tomb

# Start goplum
docker run -d --restart always \
   -v $(PWD)/goplum.conf:/goplum.conf:ro \
   -v $(PWD)/goplum.tomb:/tmp/goplum.tomb \
   ghcr.io/csmith/goplum
----

Or using Docker Compose:

[source,yaml]
----
version: "3.8"

services:
  goplum:
    image: ghcr.io/csmith/goplum
    volumes:
      - ./goplum.conf:/goplum.conf
      - ./goplum.tomb:/tmp/goplum.tomb
    restart: always
----

The `latest` tag points to the latest stable release of Goplum, if
you wish to run the very latest build from this repository you can
use the `dev` tag.

=== Without Docker

While Docker is the easiest way to run Goplum, it's not that hard to run it
directly on a host without containerisation. See the
link:docs/baremetal.adoc[installing without Docker] guide for more information.

== Usage

=== Available checks and alerts

All checks and alerts in Goplum are implemented as plugins. The following are maintained in
this repository and are available by default in the Docker image. Each plugin has its own
documentation, that explains how its checks and alerts need to be configured.

|====
| Plugin | checks | alerts

| link:plugins/discord[discord]
| -
| message

| link:plugins/http[http]
| get, healthcheck
| webhook

| link:plugins/network[network]
| connect, portscan
| -

| link:plugins/heartbeat[heartbeat]
| received
| -

| link:plugins/msteams[msteams]
| -
| message

| link:plugins/pushover[pushover]
| -
| message

| link:plugins/slack[slack]
| -
| message

| link:plugins/smtp[smtp]
| -
| send

| link:plugins/snmp[snmp]
| int, string
| -

| link:plugins/twilio[twilio]
| -
| call, sms

| link:plugins/debug[debug]
| random
| sysout

| link:plugins/exec[exec]
| command
| -
|====

The `docs` folder contains link:docs/example.conf[an example configuration file]
that contains an example of every check and alert fully configured.

=== Settling and thresholds

When Goplum first starts, it is not aware of the current state of your services.
To avoid immediately sending alerts when the state is determined, Goplum waits for
each check to **settle** into a state, and then only alerts when that state
subsequently changes.

Goplum uses **thresholds** to decide how many times a check result must happen in
a row before it's considered settled. By default, this the threshold is two "good"
results or two "failing" results, but this can be changed - see <<Default Settings>>.

For example:

----
 Goplum                    Failing            Recovery
 starts                     Alert               Alert
   ↓                          ↓                   ↓
    ✓ ✓ ✓ ✓ ✓ ✓ ✓ 🗙 ✓ ✓ ✓ 🗙 🗙 🗙 🗙 🗙 ✓ 🗙 ✓ 🗙 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ …
       ↑                      ↑                   ↑
  State settles          State becomes       State becomes
    as "good"              "failing"            "good"
----

=== Default Settings

All checks have a number of additional settings to control how they work. These can be
specified for each check, or changed globally by putting them in the "defaults" section.
If they're not specified then Goplum's built-in defaults will be used.

|===
|Setting |Description |Default

|`interval`
|Length of time between each run of the check.
|`30s`

|`timeout`
|Maximum length of time the check can run for before it's terminated.
|`20s`

|`alerts`
|A list of alert names to trigger when the service changes state.
 Supports '*' as a wildcard.
|`["*"]`

|`groups`
|A list of group names this check belongs to.
|`[]`

|`failing_threshold`
|The number of checks that must fail in a row before a failure alert is raised.
|`2`

|`good_threshold`
|The number of checks that must pass in a row before a recovery alert is raised.
|`2`
|===

For example, to change the `interval` and `timeout` for all checks:

[source,goplum]
----
defaults {
  interval = 2m
  timeout = 30s
}
----

Or to specify a custom timeout and alerts for one check:

[source,goplum]
----
check http.get "get" {
  url = "https://www.example.com/"
  timeout = 60s
  alerts = ["Text Bob"]
}
----

=== Groups and Alert Storm Prevention

When multiple services fail simultaneously (e.g., when a server goes down), 
you might receive dozens of alerts at once. Groups help prevent this alert 
storm by limiting how many alerts can be sent within a time window.

To create a group:

[source,goplum]
----
group "webservices" {
  alert_limit = 3           # Maximum 3 alerts from this group
  alert_window = 10m        # Within a 10 minute window
  
  defaults {
    interval = 30s          # Default settings for checks in this group
    timeout = 10s
  }
}
----

Then add checks to the group:

[source,goplum]
----
check http.get "website" {
  url = "https://example.com/"
  groups = ["webservices"]
}

# ... other checks ...

check http.get "api" {
  url = "https://api.example.com/"
  groups = ["webservices"]
}
----

With this configuration, if all the websites fail when their server crashes,
you'll receive at most 3 alerts in any 10-minute period.

Checks can belong to multiple groups, and groups can have their own default
settings that override the global defaults but can be overridden by individual
check settings.

== Advanced topics

=== Creating new plugins

Goplum is designed to be easily extensible. Plugins must have a main package which contains
a function named "Plum" that returns an implementation of `goplum.Plugin`. They are then
compiled with the `-buildtype=plugin` flag to create a shared library.

The Docker image loads plugins recursively from the `/plugins` directory, allowing you to
mount custom folders if you wish to supply your own plugins.

Note that the Go plugin loader does not work on Windows. For Windows-based development,
the `goplumdev` command hardcodes plugins, skipping the loader.

=== gRPC API

In addition to allowing plugins to define new checks and alerts, GoPlum provides a gRPC
API to enable development of custom tooling and facilitate use cases not supported by
GoPlum itself (e.g. persisting check history indefinitely). The API is currently in
development; more information can be found in the link:docs/api.adoc[API documentation].

=== plumctl command-line tool

Goplum comes with `plumctl`, a command-line interface to inspect the state of Goplum
as well as perform certain operations such as pausing and resuming a check. `plumctl`
uses the <<gRPC API>>. For more information see the
link:docs/plumctl.adoc[plumctl documentation].

== Licence and credits

Goplum is licensed under the MIT licence. A full copy of the licence is available in
the link:LICENCE[LICENCE] file.

Some icons in this README are modifications of the Material Design icons created by Google
and released under the https://www.apache.org/licenses/LICENSE-2.0.html[Apache 2.0 licence].

Goplum makes use of a number of third-party libraries. See the link:go.mod[go.mod] file
for a list of direct dependencies. Users of the docker image will find a copy of the
relevant licence and notice files under the `/notices` directory in the image.