Elasticsearch: Migration to OpenDistro for Elasticsearch

Available from version 3.0.0-Jailbreak.

The free version of Elasticsearch provided by Elastic, its developer, lacks several basic features that a production deployment needs. For example, it has no security or SSL support for communications.

For this reason, we have integrated OpenDistro into the platform. OpenDistro is an Elasticsearch distribution licensed under Apache 2.0 that does include security, SSL support, SQL support, etc. It is maintained by Amazon and is the one they use on AWS.

https://opendistro.github.io/

In addition to this migration, the configuration options for ontologies defined on Elasticsearch have been extended:

Configuration of Shards and Replicas

In addition to configuring the default values for the number of shards and replicas, you can now set these values for each ontology.
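As a rough illustration of how a per-ontology override could fall back to the defaults, the following sketch builds the settings body that Elasticsearch accepts when an index is created. The setting names number_of_shards and number_of_replicas are the standard Elasticsearch ones; the default values and the helper name index_settings are hypothetical and do not reflect the platform's actual code.

```python
from typing import Optional

# Hypothetical platform-wide defaults; real values depend on the deployment.
DEFAULT_SHARDS = 5
DEFAULT_REPLICAS = 1


def index_settings(shards: Optional[int] = None,
                   replicas: Optional[int] = None) -> dict:
    """Build an index 'settings' body, using the defaults for any value
    that the ontology does not override."""
    return {
        "index": {
            "number_of_shards": DEFAULT_SHARDS if shards is None else shards,
            "number_of_replicas": DEFAULT_REPLICAS if replicas is None else replicas,
        }
    }


# An ontology with no overrides uses the defaults.
default_settings = index_settings()

# An ontology that overrides both values.
custom_settings = index_settings(shards=3, replicas=0)
```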

Template configuration

To store audit data, logs, time series, etc., Elasticsearch recommends creating multiple indexes, one per block of time, especially if the data is more or less homogeneous within each block. For example, you can create one index per month, per week, or per day.

To be able to do this, Elasticsearch provides a number of elements that we have also added to the configuration of the ontologies:

  • Elasticsearch allows you to create templates, associated with a pattern, instead of indexes. For example, if you create a template associated with log-*, any insert into an index whose name matches the pattern, such as log-2021-02-18, creates that index (if it does not already exist) using the mapping configured in the template.

  • Elasticsearch allows you to create aliases for querying, so that a single alias points to several indexes. The query uses the alias and is executed on all the shards of all those indexes, which performs essentially the same as querying a single index with the same total number of shards. In a template, you can specify that every index it creates is added to a given alias.
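The two elements above can be combined in a single template definition. The sketch below builds a request body in the legacy template format (the one sent to the _template endpoint in Elasticsearch 7.x), with a pattern, an alias, and a mapping. The alias name "log", the field names, and the helper build_template are assumptions chosen for illustration; the code only builds and prints the JSON body and does not contact a cluster.

```python
import json


def build_template(pattern: str, alias: str, properties: dict) -> dict:
    """Build a legacy index template body: every index whose name matches
    `pattern` gets the given mapping and is added to `alias` on creation."""
    return {
        "index_patterns": [pattern],
        "aliases": {alias: {}},
        "mappings": {"properties": properties},
    }


# Hypothetical mapping for log records.
template = build_template(
    pattern="log-*",
    alias="log",
    properties={
        "timestamp": {"type": "date"},
        "message": {"type": "text"},
    },
)

print(json.dumps(template, indent=2))
```

With a template like this registered, an insert into log-2021-02-18 would create that index with the mapping above, and a search against the alias log would cover it along with every other matching index.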

In this new version we have added the possibility of creating ontologies as Elasticsearch templates, for which you specify the criteria for index generation:

As can be seen in the previous image, you have to specify a field used for the generation of the index as well as a function to apply to it.

The functions allowed are:

  • NONE: Creates the index as the concatenation of the ontology name + field value.

  • SUBSTRING: Creates the index as the concatenation of the ontology name + substring (field value, start, end).

  • Temporal functions: Extract the day, month and/or year from the field value, depending on the selection. These functions require the field to be of type Timestamp.

Let's say we have an ontology "Ontology" configured as above, with the index generated from its timestamp field by year and month. When a record is inserted with the field Ontology.timestamp="2021-04-08T10:10:00Z", it is automatically stored in the "Ontology-2021-04" index. There will be as many indexes as there are year/month combinations in the data, and all of this is transparent to the user.
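The naming rules above can be sketched in a few lines. This is only an illustration, not the platform's implementation: the temporal function names (YEAR, YEAR_MONTH, YEAR_MONTH_DAY), the "-" separator (inferred from the "Ontology-2021-04" example), the exclusive end position in SUBSTRING, and the helper name index_name are all assumptions.

```python
from datetime import datetime

# Hypothetical names for the temporal functions and the date format each yields.
TEMPORAL_FORMATS = {
    "YEAR": "%Y",
    "YEAR_MONTH": "%Y-%m",
    "YEAR_MONTH_DAY": "%Y-%m-%d",
}


def index_name(ontology: str, value: str, function: str = "NONE",
               start: int = 0, end: int = None) -> str:
    """Derive the target index name from the ontology name and a field value."""
    if function == "NONE":
        suffix = value
    elif function == "SUBSTRING":
        # Assumes `end` is exclusive, as in Python slicing.
        suffix = value[start:end]
    elif function in TEMPORAL_FORMATS:
        # Assumes UTC timestamps in the form shown in the example above.
        ts = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        suffix = ts.strftime(TEMPORAL_FORMATS[function])
    else:
        raise ValueError(f"Unknown function: {function}")
    return f"{ontology}-{suffix}"


monthly = index_name("Ontology", "2021-04-08T10:10:00Z", "YEAR_MONTH")
print(monthly)  # Ontology-2021-04
```

Under these assumptions, SUBSTRING with start 0 and end 7 applied to the same value gives the same index name, since the first seven characters of the timestamp are "2021-04".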

As for queries, when the ontology is created, an alias named after the ontology is created together with the template, and it includes every index that can be generated according to the previous rule.