Monitoring Spring Boot with Prometheus and Grafana
Credits: https://ordina-jworks.github.io/monitoring/2020/11/16/monitoring-spring-prometheus-grafana.html
INTRODUCTION
In a distributed landscape where we are working with microservices, serverless applications, or just event-driven architecture as a whole, observability, which comprises monitoring, logging, tracing, and alerting, is an important architectural concern.
There are a few reasons why we want visibility in our highly distributed systems:
- Issues will occur, even in systems built by our best engineers.
- Distributed systems generate distributed failures, which can be devastating when we are not prepared in advance.
- Revealing mistakes early is great for improvement and learning.
- Visibility keeps us accountable.
- It reduces the mean time to resolution (MTTR).
In this blog post I will explain the core concepts of Prometheus and Grafana.
In the last section I will set up a demo project, so you can follow along and implement monitoring in your own applications.
PROMETHEUS
WHAT IS PROMETHEUS?
Prometheus, originally developed at SoundCloud, is an open-source, community-driven project that graduated from the Cloud Native Computing Foundation. It can collect data from almost anything:
- Microservices
- Multiple languages
- Linux servers
- Windows servers
WHY DO WE NEED PROMETHEUS?
In our modern times of microservices, DevOps is becoming more and more complex and therefore needs automation.
We have hundreds of processes running over multiple servers, and they are all interconnected.
If we did not monitor these services, we would have no clue about what is happening at the hardware or application level.
There are many things which we want to be notified about, like:
- Errors
- Response latency
- System overload
- Resources
When we are working with so many moving pieces, we want to be able to quickly identify a problem when something goes wrong inside one of our services.
Without monitoring, troubleshooting could be very time-consuming, since we would have no idea where to look.
AN EXAMPLE OF A FAILING SERVICE
Imagine that one server ran out of memory and thereby knocked out a running service container that syncs two databases.
One of those databases is used by the authentication service, which now also stops working, because its database is unavailable.
How do you know what went wrong when your application, which depends on the authentication service, can no longer authenticate users?
The only thing we would see is an error message: ERROR: Authentication failed.
We would need to work backwards over every service, all the way back to the stopped container, to find out what is causing the problem.
A better way would be to have a tool which:
- Constantly monitors all services
- Alerts system admins when something crashes
- Identifies problems before they occur
Prometheus is exactly that tool: it can track memory usage, CPU usage, available disk space, and more.
We can predefine thresholds at which we want to get notified.
In our example, an alert could have fired once the failing server exceeded 70% memory usage for more than one hour, warning our admins before the crash happened.
HOW IT WORKS
PROMETHEUS SERVER
The server does the actual monitoring work, and it consists of three main parts:
- Storage, which is a time series database.
- Data retrieval worker, which pulls the data from our target services.
- Webserver, which accepts PromQL queries to retrieve data from the database.
Even though Prometheus has its own UI to show graphs and metrics, we will be using Grafana as an extra layer on top of this webserver, to query and visualize our database.
PROMETHEUS TARGETS
WHAT DOES IT MONITOR?
Prometheus can monitor nearly anything: a Linux/Windows server, an Apache server, individual applications, services, etc.
It monitors units on those targets like:
- CPU usage
- Memory/disk usage
- Request count
- Request durations
- Exceptions count
The units that we monitor are called metrics, which get saved into the Prometheus time-series database.
Prometheus metrics are exposed in a human-readable text format.
For each metric there is a “HELP” comment which describes what the metric is, and a “TYPE” which can be one of four metric types (a sample is shown after this list):
- Counter: how many times did X happen? (e.g. exceptions)
- Gauge: what is the current value of X right now? (disk usage, CPU, etc.)
- Histogram: how long did X take, or how big was it? (observations counted in configurable buckets)
- Summary: similar to a histogram, it tracks things like request durations and response sizes
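As a rough illustration of that text format, a scraped endpoint might contain lines like these (the metric names here are purely hypothetical):

# HELP example_requests_total Total number of requests handled (hypothetical metric)
# TYPE example_requests_total counter
example_requests_total 42.0
# HELP example_disk_usage_bytes Current disk usage in bytes (hypothetical metric)
# TYPE example_disk_usage_bytes gauge
example_disk_usage_bytes 1.073741824e+09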
COLLECTING METRICS FROM TARGETS
There are basically two ways of ingesting metrics into a monitoring system.
We can either push the data from our clients to our monitoring system, or let the monitoring system pull the data from the clients.
Prometheus is a service which polls a set of configured targets to intermittently fetch their metric values.
In Prometheus terminology, this polling is called scraping.
There is no clear-cut answer about which one is best; they both have their pros and cons, but some big disadvantages of pushing data are:
- possibility of flooding the network.
- risk of packet loss.
The data which gets exposed on the endpoint needs to be in the correct format, one which Prometheus can understand.
As stated before, Prometheus can monitor a lot of different things, servers, services, databases, etc.
Some servers even have a metrics endpoint enabled by default, so for those we don’t have to change anything.
For the ones that don’t have such an endpoint enabled by default, we need an exporter.
EXPORTERS
There are a number of libraries and servers which help in exporting existing metrics from third-party systems as Prometheus metrics. You can have a look at the list of exporters and integrations in the Prometheus documentation.
On a side note, these tools are also available as Docker images, so we can use them inside Kubernetes clusters.
We can run an exporter Docker image for a MySQL database as a sidecar container inside the MySQL pod, connect it to the database, and let it translate the data and expose it on the metrics endpoint.
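As a rough, hedged sketch outside the demo project, an older release of the MySQL exporter could be started like this; the credentials, the mysql-host address, and the DATA_SOURCE_NAME environment variable are assumptions that depend on the exporter version (newer releases read a config file instead):

# hypothetical credentials and host; adjust to your own MySQL instance
docker run -d -p 9104:9104 \
  -e DATA_SOURCE_NAME="exporter:secret@(mysql-host:3306)/" \
  prom/mysqld-exporter
# the translated metrics then appear on port 9104 under /metrics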
MONITORING OUR OWN APPLICATION
If we want to add our own instrumentation to our code, to know how many server resources our own application is using, how many requests it is handling or how many exceptions occurred, then we need to use one of the client libraries. These libraries will enable us to declare all the metrics we deem important in our application, and expose them on the metrics endpoint.
MICROMETER
To monitor our Spring Boot application we will be using an exporter named Micrometer.
Micrometer is an open-source project and provides a metric facade that exposes metric data in a vendor-neutral format which Prometheus can ingest.
Micrometer provides a simple facade over the instrumentation clients for the most popular monitoring systems, allowing you to instrument your JVM-based application code without vendor lock-in. Think SLF4J, but for metrics.
Micrometer is a separate project and needs to be added as a dependency. In our demo application we will add this to our pom.xml file. For a deeper understanding, check out our blog post about Micrometer.
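To give a feel for the facade, here is a minimal, hypothetical sketch using Micrometer's simple in-memory registry and a made-up metric name; in the demo below, Spring Boot auto-configures a Prometheus-backed registry for us instead:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class MicrometerFacadeExample {

    public static void main(String[] args) {
        // Vendor-neutral entry point; swapping the registry changes the monitoring backend.
        MeterRegistry registry = new SimpleMeterRegistry();
        Counter orders = registry.counter("orders.created"); // hypothetical metric name
        orders.increment();
        System.out.println(orders.count()); // prints 1.0
    }
}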
CONFIGURING PROMETHEUS
To instruct Prometheus on what it needs to scrape, we create a prometheus.yml configuration file.
In this configuration file we declare a few things:
- global configs, like how often it will scrape its targets.
- we can declare rule files, so when we meet a certain condition, we get an alert.
- which services it needs to monitor.
In the configuration used for this post (shown in full in the demo section below), Prometheus will monitor two things:
- Our Spring Boot application
- Its own health
Prometheus expects the data of our targets to be exposed on the /metrics endpoint, unless otherwise declared in the metrics_path field.
ALERTS
With Prometheus, we have the possibility to get notified when metrics have reached a certain point, which we can declare in .rules files. Prometheus has a component called the Alertmanager, which can send notifications over various channels like email, Slack, PagerDuty, etc.
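As a hedged sketch of such a rule file, the memory scenario from earlier could look roughly like this; the expression assumes the server exposes the standard node_exporter memory metrics:

groups:
  - name: demo-alerts
    rules:
      - alert: HighMemoryUsage
        # fire when less than 30% of memory has been available for more than one hour
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 70
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 70% for more than one hour"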
QUERYING OUR DATA
Since Prometheus saves all our data in a time-series database, which is stored on disk in a custom format, we need to use the PromQL query language to query it.
We can do this via the Prometheus WebUI, or we can use some more powerful visualization tools like Grafana.
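For example, a query along these lines (a sketch; it assumes the Spring Boot demo application from the next section is being scraped and tagged as shown there) returns the per-second rate of handled HTTP requests over the last five minutes:

rate(http_server_requests_seconds_count{application="MonitoringSpringDemoProject"}[5m])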
GRAFANA
WHAT IS GRAFANA?
Grafana is an open-source metric analytics & visualization application.
- It is used for visualizing time series data for infrastructure and application analytics.
- It is also a web application which can be deployed anywhere users want.
- It can target a data source from Prometheus and use its customizable panels to give users powerful visualization of the data from any infrastructure under management.
WHY GRAFANA
One of the significant advantages of Grafana is its customization possibilities.
It’s effortless to customize the visualization for vast amounts of data.
We can choose a linear graph, a single number panel, a gauge, a table, or a heatmap to display our data.
We can also split our data by labels, so that data with different labels goes to different panels.
Last but not least, there are a ton of premade dashboard templates ready to be imported, so we don’t have to create everything manually.
DEMO PROJECT
SETUP SPRING BOOT
To demonstrate how to implement Prometheus and Grafana in your own projects, I will go through the steps to set up a basic Spring Boot application which we monitor by using Docker images of Prometheus and Grafana.
- Set up a regular Spring Boot application by using Spring Initializr.
- Add dependency for Actuator
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
- Add dependency for Micrometer
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.5.5</version>
</dependency>
- Expose our needed Prometheus endpoint in the application.properties file
management.endpoints.web.exposure.include=prometheus
management.endpoint.health.show-details=always
management.metrics.tags.application=MonitoringSpringDemoProject
- After this we can run the application and browse to localhost:8080/actuator, where we can see all the available endpoints. The one we need and will use to monitor this application is localhost:8080/actuator/prometheus.
ADDING OUR OWN CUSTOM METRICS
We can also define some custom metrics, which I will briefly demonstrate in this section.
To be able to monitor custom metrics we need to import MeterRegistry from the Micrometer library and inject it into our class. This gives us the possibility to use counters, gauges, timers and more.
To demonstrate how we can use this, I added two classes in our basic Spring application.
DemoMetrics defines a custom Counter and Gauge, which get updated every second by our DemoMetricsScheduler class.
The counter gets incremented by one, and the gauge is set to a random number between 0 and 100.
DEMOMETRICS CLASS
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

@Component
public class DemoMetrics {

    private final Counter demoCounter;
    private final AtomicInteger demoGauge;

    public DemoMetrics(MeterRegistry meterRegistry) {
        // Register our custom metrics with the injected registry
        this.demoCounter = meterRegistry.counter("demo_counter");
        this.demoGauge = meterRegistry.gauge("demo_gauge", new AtomicInteger(0));
    }

    public void getRandomMetricsData() {
        demoGauge.set(getRandomNumberInRange(0, 100));
        demoCounter.increment();
    }

    private static int getRandomNumberInRange(int min, int max) {
        if (min >= max) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        Random r = new Random();
        return r.nextInt((max - min) + 1) + min;
    }
}
DEMOMETRICSSCHEDULER CLASS
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class DemoMetricsScheduler {

    private final DemoMetrics demoMetrics;

    public DemoMetricsScheduler(DemoMetrics demoMetrics) {
        this.demoMetrics = demoMetrics;
    }

    // Update the custom metrics every second
    @Scheduled(fixedRate = 1000)
    public void triggerCustomMetrics() {
        demoMetrics.getRandomMetricsData();
    }
}
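One caveat: @Scheduled methods only run when scheduling is enabled. If your Spring Initializr project does not have it yet, add @EnableScheduling to the application class; a minimal sketch, with a placeholder class name:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling // required for the @Scheduled method above to run
public class MonitoringDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(MonitoringDemoApplication.class, args);
    }
}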
Now we are able to see our custom metrics on the /actuator/prometheus endpoint.
SETUP PROMETHEUS
The easiest way to run Prometheus is via a Docker image which we can get by running:
docker pull prom/prometheus
After we download the image, we need to configure our prometheus.yml file.
Since I want to demonstrate how to monitor a Spring Boot application, as well as Prometheus itself, it should look like this:
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'spring-actuator'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['192.168.0.9:8080']
We define two targets that it needs to monitor: our Spring application and Prometheus itself.
Since we run Prometheus from inside Docker, localhost would refer to the Prometheus container itself, so to reach the Spring Boot application we need to enter the host IP, which in my case is 192.168.0.9.
Afterwards we can run the Prometheus image by running the following command:
docker run -d -p 9090:9090 -v <PATH_TO_prometheus.yml_FILE>:/etc/prometheus/prometheus.yml prom/prometheus
We mount the prometheus.yml config file into the Prometheus container and expose port 9090 to the outside of Docker.
When this is up and running we can access the Prometheus web UI on localhost:9090.
When we navigate to Status > Targets, we can check if our connections are up and are correctly configured.
Yet again, we can check our custom metrics in the Prometheus UI, by selecting the demo_gauge metric and inspecting its graph.
SETUP GRAFANA
To run Grafana we will use the same approach as with Prometheus.
We download and run the image from Docker Hub.
docker run -d -p 3000:3000 grafana/grafana
Now we can access the Grafana UI on localhost:3000, where you can enter “admin” as both login and password.
After we arrive at the landing page, we need to set up a data source for Grafana.
Navigate to Configuration > Data Sources and add a Prometheus data source, pointing its URL at the address where Prometheus is reachable on port 9090 (in this setup that is the host IP, e.g. http://192.168.0.9:9090, since localhost inside the Grafana container would not reach it).
For this example I used one of the premade dashboards which you can find on the Grafana Dashboards page.
The dashboard I used to monitor our application is the JVM Micrometer dashboard with import id: 4701.
Give your dashboard a custom name and select the Prometheus data source we configured earlier.
Now we have a fully pre-configured dashboard, with some important metrics showcased, out of the box.
ADDING A CUSTOM METRIC PANEL
To demonstrate how we can create a panel for one of our own custom metrics, I will list the required steps below.
First we need to add a panel by clicking on “add panel” at the top of the page, and then on “add new panel” in the center.
Then we need to configure our panel, which we do by selecting demo_gauge in the metrics field.
To display our graph in a prettier way, we can choose the “stat” type under the visualization tab.
When we click on Apply in the top right corner, our new panel gets added to the dashboard.
Afterwards, we can do the same thing for our demo_counter metric.
After going through all of these steps, we now have an operational dashboard which monitors our Spring Boot application, with our own custom metrics.