Improvements in monitoring

Available since version 6.3.0-Yoshi

Introduction

For scenarios in which there is no corporate monitoring, the Platform integrates various tools and technologies to monitor its state of health.

This monitoring is based on a deployment of Grafana, Grafana Mimir and Prometheus Operator in a Kubernetes environment.

The following improvements have been incorporated in version 6.3.1:

  • Standardised metrics across all the components of the Platform.

  • Grafana dashboards for all the modules of the Platform.

  • Ability to generate alerts.

Monitoring Tools

  • Grafana is an open source tool for real-time data visualisation and analysis, which allows you to create interactive dashboards and monitor system, application and database metrics.

https://grafana.com/
  • Grafana Mimir is a large-scale metrics storage and management platform designed to offer high performance and scalability in the collection and querying of monitoring data in distributed environments.

  • Prometheus Operator is a tool for managing and deploying Prometheus instances on Kubernetes, facilitating the automated and efficient configuration, monitoring and maintenance of Prometheus clusters.

In short, Prometheus Operator manages the Prometheus instances that collect the metrics of the different Platform components, Grafana Mimir acts as the database that stores the collected metrics, and Grafana queries and visualises those metrics, either in dashboards or through simple ad-hoc queries, and provides further features such as alerts.
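As an illustration of the collection step, the sketch below builds a ServiceMonitor manifest, the custom resource that Prometheus Operator uses to discover scrape targets. The resource name, namespace, label, port name and scrape interval are hypothetical values chosen for the example, not the Platform's actual configuration.

```python
import json


def service_monitor(name, namespace, app_label,
                    port="metrics", path="/metrics", interval="30s"):
    """Build a ServiceMonitor manifest as a plain dict.

    All argument values are illustrative; adapt them to the labels and
    port names of the Service you want Prometheus to scrape.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Select the Kubernetes Services carrying this label.
            "selector": {"matchLabels": {"app": app_label}},
            # Scrape the named Service port at the given path and interval.
            "endpoints": [{"port": port, "path": path, "interval": interval}],
        },
    }


if __name__ == "__main__":
    # Hypothetical module name and namespace, for illustration only.
    manifest = service_monitor("example-module", "monitoring", "example-module")
    # JSON is accepted by kubectl and is also valid YAML.
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be committed to the Git repository that ArgoCD watches, assuming the deployment follows that pattern.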

The deployment of the tools is done through ArgoCD, which is a continuous delivery (CD) tool for Kubernetes that automates the deployment and management of applications, using Git as the source of truth for configuration and desired state.

Grafana Dashboard

A Grafana dashboard is a visual interface that displays real-time data through graphs, tables and other widgets, allowing you to monitor and analyse system and application metrics.

With Grafana dashboards you can, among other things:

  1. Create visual dashboards: Design dashboards with graphs, charts, maps and other widgets.

  2. Visualise metrics in real time: Display up-to-date data from various sources.

  3. Filter data: Apply filters and variables to customise the display of metrics.

  4. Configure multiple data sources: Connect and display data from different systems or databases.

  5. Create interactive layouts: Organise and adjust dashboards flexibly for a clear presentation.

  6. Share dashboards: Export or share them with other users or teams.

  7. Customise dashboards: Adjust colours, value ranges and other visual aspects.

  8. Add alerts: Integrate alerts directly into dashboards to monitor specific metrics.

  9. Review data history: Visualise trends and historical data for long-term analysis.

Configured dashboards can be persisted in the Grafana database or injected directly as YAML through Kubernetes ConfigMaps defined in the chart itself. The Platform uses the second approach.
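A minimal sketch of the ConfigMap approach follows. It wraps a dashboard definition in a ConfigMap manifest. The label grafana_dashboard is the default label watched by the dashboard sidecar of the community Grafana Helm chart; whether the Platform's chart uses the same label is an assumption, and the dashboard content and names are hypothetical.

```python
import json


def dashboard_configmap(name, namespace, dashboard):
    """Wrap a Grafana dashboard (a dict) in a ConfigMap manifest."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Assumption: the chart's sidecar loads ConfigMaps with this label.
            "labels": {"grafana_dashboard": "1"},
        },
        # The dashboard JSON is stored as a string under a file-like key.
        "data": {f"{name}.json": json.dumps(dashboard, indent=2)},
    }


if __name__ == "__main__":
    # Hypothetical, deliberately tiny dashboard definition.
    dashboard = {"uid": "example-jvm", "title": "Example JVM", "panels": []}
    print(json.dumps(dashboard_configmap("example-jvm", "monitoring", dashboard), indent=2))
```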

Explore

The Explore section lets you query and inspect data interactively and ad hoc. It is intended for quick queries and analysis of metrics or logs in real time, without having to build a full dashboard. This is useful for debugging problems and getting immediate feedback from the connected data sources.

This tool can be found under Grafana > Explore.

Example of Grafana Explore.
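The same kind of ad-hoc query can also be issued outside Grafana, because Grafana Mimir exposes a Prometheus-compatible HTTP API. The sketch below only builds the URL of an instant query; the host name is hypothetical, and the /prometheus prefix is Mimir's default, which a given deployment may have changed.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql, time=None):
    """Return the URL of a Prometheus-compatible instant query.

    base_url is the root of the Prometheus-compatible API, for example
    "http://mimir.example/prometheus" (hypothetical host).
    """
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (Unix seconds or RFC 3339).
        params["time"] = str(time)
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


if __name__ == "__main__":
    # "up == 0" lists scrape targets that are currently unreachable.
    print(instant_query_url("http://mimir.example/prometheus", "up == 0"))
```

In a multi-tenant Mimir installation the request would also need the X-Scope-OrgID header identifying the tenant.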

Data Sources

A data source is an external system (such as a database, a monitoring system or an API) that Grafana queries to obtain and visualise metrics or other information. Common examples include Prometheus, InfluxDB, MySQL and Elasticsearch.

In this case, the data source is of type Prometheus, but the metrics themselves are stored in the Grafana Mimir database, which is queried through its Prometheus-compatible API.

Data sources are managed in Grafana > Connections > Data sources.
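For reference, a data source of this kind can also be declared through Grafana's file-based provisioning. The sketch below builds such a definition; the data source name and the Mimir URL are hypothetical, and the Platform's chart may configure the data source differently.

```python
import json


def mimir_datasource(name, url):
    """Build a Grafana data source provisioning document as a dict."""
    return {
        "apiVersion": 1,
        "datasources": [
            {
                "name": name,
                # Mimir is queried through its Prometheus-compatible API,
                # so the data source type is "prometheus".
                "type": "prometheus",
                # "proxy": the Grafana backend performs the queries.
                "access": "proxy",
                "url": url,
                "isDefault": True,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical in-cluster address of the Mimir query endpoint.
    doc = mimir_datasource("Mimir", "http://mimir.example/prometheus")
    print(json.dumps(doc, indent=2))
```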

Dashboards offered by Onesait Platform

  • Databases:

    • MySQL/MariaDB.

    • MongoDB.

    • PostgreSQL.

    • Elasticsearch/OpenSearch.

  • Modules:

    • JVM/JMX (all components).

    • Keycloak.

    • Kafka.

    • DataFlow.

    • FlowEngine NodeJS.

    • Presto.

    • MinIO.

    • Nginx.

  • Kubernetes:

    • Deployments / Pods.

    • Nodes.

    • Nginx Controller.

Example Metrics

Alerts in Grafana

The main new feature compared with the monitoring previously offered by the Platform, in addition to the standardised metrics and the dashboards for all the modules, is the incorporation of alerts through Grafana Alerts.

Grafana Alerts lets you configure automatic notifications based on thresholds or metric conditions, warning users about events or problems in real time. Notifications can be sent by email or Microsoft Teams, among many other options.

With Grafana Alerts you can do the following things:

  • Set Alerts: Define conditions based on specific metrics.

  • Set Thresholds: Define values that, when exceeded, trigger an alert.

  • Notifications: Send alerts to different channels (email, Slack, PagerDuty, etc.).

  • Multiple Alerts: Create several alerts for the same panel or graph.

  • Silence Alerts: Configure periods in which notifications will not be sent.

  • Alert History: Consult and manage past alerts.

  • Alert Escalation: Configure alerts according to severity and escalate them to different teams or individuals.

  • Complex Conditions: Establish combinations of conditions to trigger alerts, such as combining several thresholds or metrics.
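To make the idea of a threshold condition concrete, the sketch below models a rule that fires only after the threshold has been exceeded for several consecutive evaluations. It is a simplified illustration of the concept, not Grafana's evaluation engine, and the function name and state names are chosen for the example.

```python
def evaluate(threshold, pending_samples, samples):
    """Return the alert state after each sample.

    A sample above the threshold starts (or continues) a breach streak.
    The state is "pending" until the streak reaches pending_samples,
    then "firing". Any sample at or below the threshold resets the
    streak and returns the state to "normal".
    """
    states = []
    streak = 0
    for value in samples:
        if value > threshold:
            streak += 1
            states.append("firing" if streak >= pending_samples else "pending")
        else:
            streak = 0
            states.append("normal")
    return states


if __name__ == "__main__":
    # Hypothetical CPU usage percentages, threshold 80, fire after 3 breaches.
    print(evaluate(80, 3, [50, 85, 90, 95, 70]))
```

Requiring several consecutive breaches is a common way to avoid notifications for short spikes.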

Contact Points

To view or configure the different contact points (e-mails, Microsoft Teams, etc.), go to Grafana > Alerting > Contact Points.
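Contact points can also be defined through Grafana's file-based alerting provisioning rather than the UI. The sketch below builds a definition with a single email receiver; the contact point name, the uid and the address are hypothetical, and it is an assumption that a given deployment provisions contact points this way.

```python
import json


def email_contact_point(name, uid, addresses):
    """Build a contact point provisioning document with one email receiver."""
    return {
        "apiVersion": 1,
        "contactPoints": [
            {
                "orgId": 1,
                "name": name,
                "receivers": [
                    {
                        "uid": uid,
                        "type": "email",
                        # Several addresses are separated by semicolons.
                        "settings": {"addresses": ";".join(addresses)},
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical team name and mailbox.
    doc = email_contact_point("platform-ops", "platform-ops-email", ["ops@example.com"])
    print(json.dumps(doc, indent=2))
```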

Viewing Alerts

To view or configure the different Alerts in the Grafana system, go to Grafana > Alerting > Alert Rules.

In the following example you can see the different groups of alerts that are configured, such as the health check alerts that detect whether any of the databases the Platform connects to are down:

In the following example you can see the alert received via email when a microservice is down:

 
