A look at the Semantic Data Hub

Since version 3.1.0-KickOff, Ontologies have been referred to as "Entities" in the Control Panel. This does not alter any functionality; the nomenclature has simply been changed for a better understanding of the concept.

Onesait Platform proposes a Data Centric Architecture, i.e. an architecture where data is the main and permanent asset, while applications come and go. In these architectures the data model precedes the implementation of any given application and remains valid long after those applications are replaced.

On the platform this approach is supported through the concept of the Entity (also called Ontology). This concept isolates the platform from the underlying persistence technologies by providing a SQL abstraction layer. The information coming from ingestion and processing is stored in the Semantic Data Hub of the platform, which, as noted above, is the platform's persistence engine.
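To make the abstraction concrete, here is a minimal sketch of how a client could run the same SQL-style query against an Entity regardless of the store that backs it. The host, endpoint path, token and Entity name are hypothetical placeholders, not the platform's documented REST contract; check the platform's API reference for the actual endpoints.

```python
import requests

# Hypothetical values: adjust host, path and token to your deployment and
# to the platform's documented REST API before using this sketch.
PLATFORM_URL = "https://my-onesait-instance.example.com"
API_TOKEN = "<your-api-token>"

def query_entity(entity_name: str, sql: str):
    """Run a SQL-style query against an Entity; the platform translates it
    to the underlying store (MongoDB, Elasticsearch, relational DB, ...)."""
    response = requests.get(
        f"{PLATFORM_URL}/query/{entity_name}",  # hypothetical path
        params={"query": sql, "queryType": "SQL"},
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

# The same query works whichever persistence engine backs the Entity.
rows = query_entity("TemperatureMeasurement",
                    "SELECT * FROM TemperatureMeasurement LIMIT 10")
```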

Its main features:

  • Depending on the requirements of each project (volumes of real-time and historical information, mostly read or mostly write access, weight of analytical processes, technologies already in place at the customer, etc.), the Platform allows the most suitable persistence engine to be selected. Conceptually, the platform handles the following:

    • Real Time Database (RealTimeDB): designed to support a large volume of inserts and online queries very efficiently. The platform abstracts away the underlying technology, allowing databases such as MongoDB, Elasticsearch, Kudu, relational databases, ... to be used.

    • Historical and Analytical Database (HistoricalDB): designed to store all the information that is no longer part of the online world (for example, data from previous years that is no longer queried) and to support analytical processes (algorithms) that extract knowledge from this data. The platform supports technologies such as Hive (with Impala or Presto for interactive queries), SAP HANA, BigQuery, CosmosDB, SparkSQL, ...

    • Staging Area: allows storing raw files not yet processed by the platform, for later ingestion. HDFS is usually used as storage, although the platform allows others such as GridFS to be used.

    • GIS Database: allows the storage of geographic information and its subsequent querying and processing; the platform supports MongoDB, Elasticsearch and PostGIS, among others.

  • All the definition and management of these Entities is done from the Control Panel of the Platform:

    • List of Entities / Ontologies (as seen by the Administrator role), with visibility levels, ...

  • Entity (Ontology) Creation Wizard: as you can see, the platform offers guided help to create Entities, allowing you to upload a file from which the platform will extract the information needed to model the Entity, to create the Entity step by step, or to connect to a relational database:

  • Creation of the Entity: when creating the Entity I can give it a name, tag it, activate or deactivate it, make it public, assign access permissions (who can see it, who can insert data, ...), define auditing, ...

If I select step-by-step creation, I can define the Entity's attributes and their types (these types are transparently mapped by the platform to the selected persistence: Hive, MongoDB, Elasticsearch, relational databases, ...).
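As an illustration, the following minimal sketch assumes the Entity's structure is described with a JSON Schema, as the step-by-step wizard produces, and validates a sample record against it using the jsonschema library. The Entity name and attributes are invented for this example.

```python
from jsonschema import validate  # pip install jsonschema

# Illustrative Entity schema: the attribute names and types are made up.
# The platform maps these generic types to the selected persistence
# (Hive, MongoDB, Elasticsearch, relational databases, ...).
temperature_measurement_schema = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "TemperatureMeasurement",
    "type": "object",
    "properties": {
        "deviceId":    {"type": "string"},
        "timestamp":   {"type": "string", "format": "date-time"},
        "temperature": {"type": "number"},
        "indoor":      {"type": "boolean"},
    },
    "required": ["deviceId", "timestamp", "temperature"],
}

# A record that respects the schema passes validation; a wrongly typed
# record would raise jsonschema.ValidationError.
sample_record = {
    "deviceId": "sensor-001",
    "timestamp": "2024-01-01T12:00:00Z",
    "temperature": 21.5,
    "indoor": True,
}
validate(instance=sample_record, schema=temperature_measurement_schema)
```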

We can also see that one of the templates can optionally be selected as the basis for the new Entity, and that an attribute can be marked as encrypted; in that case the platform stores the attribute encrypted in the selected repository and decrypts it when returning it.

  • Once the Entity has been created, I can perform advanced management of it:


Here I can map the Entity to a Kafka topic, indicate whether the Entity's data should be automatically deleted after a given time or moved to the historical database, and indicate the repository in which I wish to store the Entity. In the example, several different repositories can be seen; the available repositories are configured according to the persistence modules deployed.
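As an example of the Kafka mapping, once an Entity is linked to a topic an external consumer can react to every record inserted into it. The sketch below uses the kafka-python client; the broker address and topic name are hypothetical and depend on how the mapping is configured in your deployment.

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Hypothetical broker address and topic name: the actual values depend on
# how the Entity-to-topic mapping is configured in your deployment.
consumer = KafkaConsumer(
    "ONTOLOGY_TemperatureMeasurement",
    bootstrap_servers="kafka.example.com:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="latest",
)

# Each message carries one record inserted into the Entity.
for message in consumer:
    record = message.value
    print("New insert:", record)
```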
