Info: New Release 1.4.12. On Feb 10, 2025, version 1.4.12 of the Python client library was released on PyPI: https://pypi.org/project/onesaitplatform-client-services/#history
This tutorial shows the installation of the Python REST client for OnesaitPlatform, as well as some use cases.
Introduction
This Platform client library contains several classes that implement the functionalities of different platform clients:
Iot-Broker Client: Queries and insertions of ontology data.
Api-Manager client: Management of REST APIs and calls.
File-Manager Client: Upload and download of binary files.
MQTT Client: Queries and insertions of ontology data using MQTT protocol.
BaseModelService: Management of Machine/Deep Learning models
With these clients, a simple communication with the platform from any Python environment is established.
These same functionalities are also available through the Digital Broker or the Control Panel, via their various endpoints.
Library installation from PyPI
The easiest way to install the library is from the PyPI repository using pip. To install it, just run the command:
>> pip install onesaitplatform-client-services
It is advisable to use the --upgrade flag so that any already-installed dependencies are updated (recommended for all installations).
>> pip install --upgrade onesaitplatform-client-services
Client use examples
Sample notebooks can be found in the examples folder of the library installation (/examples/*.ipynb).
NOTE: Very important: for every DigitalClient, always call join() at the beginning to open the connection and leave() at the end to close it.
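This join()/leave() discipline pairs naturally with try/finally, which guarantees the connection is closed even if an operation raises. Below is a minimal sketch of the pattern; a stub class stands in for a real DigitalClient so the snippet runs on its own:

```python
# _StubClient stands in for a real DigitalClient here so the sketch is
# self-contained; with a live connection you would use the real client.
class _StubClient:
    def __init__(self):
        self.connected = False

    def join(self):
        self.connected = True

    def leave(self):
        self.connected = False


client = _StubClient()
client.join()
try:
    pass  # queries and insertions go here
finally:
    # leave() runs even if the work above raises
    client.leave()

print(client.connected)
```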
Digital Client
Create Client
To create a Digital Client, a connection configuration is required. We will next explain how to properly set up the parameters.
If the client connects from within the Platform's network (from platform notebooks or internal microservices), then the configuration to connect to the Digital Broker (Iot Broker) is as follows:
```python
from onesaitplatform.iotbroker import DigitalClient

HOST = "iotbrokerservice"
PORT = 19000
IOT_CLIENT = <digital client name>
IOT_CLIENT_TOKEN = <digital client token>
PROTOCOL = 'http'
AVOID_SSL_CERTIFICATE = False

client = DigitalClient(host=HOST, port=PORT,
                       iot_client=IOT_CLIENT,
                       iot_client_token=IOT_CLIENT_TOKEN)
client.protocol = PROTOCOL
client.avoid_ssl_certificate = AVOID_SSL_CERTIFICATE
```
However, if the client connects from outside the Platform's network (a local PC or any other external location), the configuration is as follows:
The PORT parameter is redirected automatically, so setting it is not strictly necessary, but you can set PORT = 443 (the default HTTPS port).
The PROTOCOL parameter must be set to PROTOCOL = "https".
The AVOID_SSL_CERTIFICATE parameter must be set to AVOID_SSL_CERTIFICATE = True.
```python
import json
from onesaitplatform.iotbroker import DigitalClient

HOST = "lab.onesaitplatform.com"
PORT = 443
IOT_CLIENT = "RestaurantTestClient"
IOT_CLIENT_TOKEN = "f633128219f54a37b23409e7f4173100"
PROTOCOL = 'https'
AVOID_SSL_CERTIFICATE = True

client = DigitalClient(host=HOST, port=PORT,
                       iot_client=IOT_CLIENT,
                       iot_client_token=IOT_CLIENT_TOKEN)
client.protocol = PROTOCOL
client.avoid_ssl_certificate = AVOID_SSL_CERTIFICATE
```
Join / Leave
To start a connection, you need to do either a join() or a connect(), which uses the iot_client and iot_client_token credentials. At the end, you must always use either leave() or disconnect().
```python
client.join()
# ... queries and insertions ...
client.leave()
```
Query
The supported query formats are "SQL" and "NATIVE", for SQL syntax and MongoDB syntax respectively.
```python
query = "select * from Restaurants limit 3"
ok_query, results_query = client.query(ontology="Restaurants", query=query, query_type="SQL")

query = "db.Restaurants.find().limit(3)"
ok_query, results_query = client.query(ontology="Restaurants", query=query, query_type="NATIVE")
```
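The query methods return a tuple: a success flag and the result payload. Assuming the results arrive as a list of JSON documents (as in the Restaurants examples), handling them might look like this sketch, with a sample payload standing in for a live connection:

```python
# Sample (ok, results) pair standing in for a live client.query() call;
# with a connection you would write:
#   ok_query, results_query = client.query(ontology="Restaurants", ...)
ok_query = True
results_query = [
    {"Restaurant": {"name": "Riviera Caterer", "borough": "Brooklyn"}},
    {"Restaurant": {"name": "Morris Park Bake Shop", "borough": "Bronx"}},
]

if ok_query:
    # Extract one field from each returned document
    names = [row["Restaurant"]["name"] for row in results_query]
    print(names)
else:
    print("Query failed:", results_query)
```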
Query batch
This query makes successive requests of smaller size, which is optimal when requesting large amounts of data. Internally it is built as query + " offset N limit batch_size" for SQL, or query + ".skip(N).limit(batch_size)" for MongoDB.
```python
query_batch = "select c from Restaurants as c"
ok_query, results_query = client.query_batch(ontology="Restaurants", query=query_batch,
                                             query_type="SQL", batch_size=50)

query_batch = "db.Restaurants.find()"
ok_query, results_query = client.query_batch(ontology="Restaurants", query=query_batch,
                                             query_type="NATIVE", batch_size=50)
```
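For illustration, the paging described above amounts to composing successive queries like the following (an illustrative sketch of the composition rule, not the library's exact internals):

```python
# How successive pages are composed for SQL and for MongoDB syntax.
base_sql = "select c from Restaurants as c"
batch_size = 50

sql_pages = [f"{base_sql} offset {offset} limit {batch_size}"
             for offset in range(0, 150, batch_size)]
print(sql_pages[1])  # second page

base_native = "db.Restaurants.find()"
native_pages = [f"{base_native}.skip({offset}).limit({batch_size})"
                for offset in range(0, 150, batch_size)]
print(native_pages[2])  # third page
```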
Queries on RealTimeDB mapped to file
An entity on the Real Time DB can store a large amount of information, depending on the purpose of the entity. For example, let's assume an entity that receives events from a network consisting of hundreds of devices.
To avoid memory overflow issues with queries that do not apply limit criteria on entities with a large number of records, the maximum number of records that such queries will return is configured at the installation level. For example, in a typical installation with default values, the following query:
```sql
select * from MiEntidad;
```
This query will return at most 2000 records, regardless of whether the entity has millions of them.
To facilitate the handling of such queries in projects, a function is included for executing queries without limit clauses or with very high limits, whose result is dumped into a file that is available to users. In the Digital Broker, this would correspond to the path: /rest/ontology/{ontology}/file.
The function query_file() will be used, which requires four parameters:
ontology: the entity on which the operation is performed.
query: the query to execute.
queryType: the type of query (SQL or NATIVE).
responseType: the destination of the file to be generated, which can be one of three types:
DISK: the file will be generated in a directory configurable by installation within the Semantic Inf Broker container, which can be mapped to a shared volume so the file is available for other modules, such as Dataflow, Notebooks, etc.
S3_MINIO: the file is generated in the MinIO bucket of the Digital Client owner.
URL: the file is generated in a directory mapped to a shared folder in the platform load balancer, allowing it to be downloaded via a URL.
```python
query = "SELECT c.ejemplo_smrg AS ejemplo_smrg FROM ejemplo_smrg AS c"
ok_query, result_query = client.query_file(ontology="ejemplo_smrg",
                                           query=query,
                                           query_type="SQL",
                                           responseType="S3_MINIO")

print(f"ok_query_file: {ok_query}")
print(f"result_query_file: {result_query}")
```
This will return a JSON with the information on where to locate the file, which depends on the chosen responseType (DISK, URL or S3_MINIO). With S3_MINIO, the file will be accessible in the user's bucket.
The service response is asynchronous, as queries on large entities can have high execution times. To handle this, the response JSON includes an identifier that allows checking the execution status of the query, which can be:
IN_PROGRESS: the query is still executing.
FINISHED: the query has finished, and its result has been fully dumped to the file.
The operation that allows checking the status is available in the Digital Broker at the path /rest/ontology/file/{queryId}/status. In Python, it can be done as follows:
We need access to the 'query_id' of the file whose status we want to check. This 'query_id' can be obtained directly from the JSON returned by the request or, if it's a previously created file, manually entered:
```python
# Obtained directly from the JSON returned by query_file()...
query_id = result_query['queryId']
# ...or entered manually for a previously created file:
query_id = "9c699b1c-3fb1-4075-a60a-fd982fbd39b7"
```
Next, the function that returns the status is called. If the status is 'IN_PROGRESS', it waits the 5 seconds of 'sleep_time' before making the request again to check whether it has changed to 'FINISHED'. The function has a default timeout of 60 seconds to wait for the change to 'FINISHED'; both this timeout and 'sleep_time' are parameters that can be modified.
```python
# With the default timeout (60 s) and sleep_time (5 s):
status_query = client.query_file_status(queryId=query_id)
# With custom values:
status_query = client.query_file_status(queryId=query_id, timeout=10, sleep_time=1)
```
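For reference, the polling behaviour described above can be sketched as a simple loop. Here query_file_status() is replaced with a stub so the snippet is self-contained; in a real session you would call the client method instead:

```python
import time

def query_file_status_stub(query_id, _state={"calls": 0}):
    # Stub: reports IN_PROGRESS for the first two checks, then FINISHED.
    # (Mutable default argument used as a simple call counter.)
    _state["calls"] += 1
    return "IN_PROGRESS" if _state["calls"] < 3 else "FINISHED"

query_id = "9c699b1c-3fb1-4075-a60a-fd982fbd39b7"
timeout = 60        # total seconds to wait
sleep_time = 0.01   # seconds between checks (the client defaults to 5)
elapsed = 0.0

status = query_file_status_stub(query_id)
while status == "IN_PROGRESS" and elapsed < timeout:
    time.sleep(sleep_time)
    elapsed += sleep_time
    status = query_file_status_stub(query_id)

print(status)
```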
Insert
```python
new_restaurant = {
    'Restaurant': {
        'address': {
            'building': '2780',
            'coord': [-73.98241999999999, 40.579505],
            'street': 'Stillwell Avenue',
            'zipcode': '11224'
        },
        'borough': 'Brooklyn',
        'cuisine': 'American',
        'grades': [
            {'date': '2014-06-10T00:00:00Z', 'grade': 'A', 'score': 5}
        ],
        'name': 'Riviera Caterer 18',
        'restaurant_id': '40356118'
    }
}

new_restaurant_str = json.dumps(new_restaurant)
new_restaurants = [new_restaurant]
ok_insert, res_insert = client.insert("Restaurants", new_restaurants)
```
Api-Manager Client
Create Client
To create the Api-Manager instance, the host and the token of the user whose information is to be retrieved are required.
```python
from onesaitplatform.apimanager import ApiManagerClient

HOST = "www.onesaitplatform.com"
TOKEN = "b32522cd73e84ddda519f1dff9627f40"

client = ApiManagerClient(host=HOST)
client.setToken(TOKEN)
```
Find APIs
Using the find() method, we can locate a specific API. It returns the API information in JSON format, where its endpoints and other details can be observed. Two parameters must be passed:
identification: which corresponds to the name of the API.
version: which corresponds to the version of the API.
```python
ok_find, res_find = client.find(identification="apiManger_test", version=1)
```
List APIs
If no user is provided, this method lists all of the caller's APIs along with the public ones; if a user is specified, only that user's APIs are listed.
```python
# All of the caller's APIs plus the public ones:
ok_list, res_list = client.list()
# Only the APIs of a given user:
ok_list, res_list = client.list(user="smartinroldang")
```
Make API request
With this method, you can make requests to any API you wish, specifying:
method: the type of HTTP method.
name: the query you want to perform.
version: the version of the API.
body: the necessary information for the request, if required.
```python
ok_request, res_request = client.request(method="GET", name="RestaurantsAPI/",
                                         version=1, body=None)
```
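The request() call also returns a success flag together with the response body. Assuming the body arrives as a JSON string (the actual shape depends on the API being called), post-processing might look like this sketch, with a sample response standing in for a live call:

```python
import json

# Sample response standing in for a live client.request() call; the real
# payload shape depends on the API.
ok_request = True
res_request = '[{"name": "Riviera Caterer", "borough": "Brooklyn"}]'

if ok_request:
    rows = json.loads(res_request)
    print(len(rows), rows[0]["name"])
else:
    print("Request failed:", res_request)
```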
File-Manager Client
Create Client
When the client is created, it is possible to use either a Bearer token (APIs) or a user token (My APIs > user tokens).
```python
import json
from onesaitplatform.files import FileManager

HOST = "www.lab.onesaitplatform.com"
USER_TOKEN = "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJwcmluY2lwYWwiOiJianJhbWlyZXoiLCJjbGllbnRJZCI6Im9uZXNhaXRwbGF0Zm9ybSIsInVzZXJfbmFtZSI6ImJqcmFtaXJleiIsInNjb3BlIjpbIm9wZW5pZCJdLCJuYW1lIjoiYmpyYW1pcmV6IiwiZXhwIjoxNjE3ODI2NjkzLCJncmFudFR5cGUiOiJwYXNzd29yZCIsInBhcmFtZXRlcnMiOnsidmVydGljYWwiOm51bGwsImdyYW50X3R5cGUiOiJwYXNzd29yZCIsInVzZXJuYW1lIjoiYmpyYW1pcmV6In0sImF1dGhvcml0aWVzIjpbIlJPTEVfREFUQVNDSUVOVElTVCJdLCJqdGkiOiJmNGM2NDUzZC0xYTEyLTRkMGUtYTVlNy05ZmNlMDY4OTY1NDYiLCJjbGllbnRfaWQiOiJvbmVzYWl0cGxhdGZvcm0ifQ.Nz5cDvMjh361z4r6MMD2jUOpYSmUKVLkMThHDK0sg6o"

file_manager = FileManager(host=HOST, user_token=USER_TOKEN)
file_manager.protocol = "https"
```
Upload file
```python
uploaded, info = file_manager.upload_file("dummy_file.txt", "./dummy_file.txt")
```
Download file
```python
downloaded, info = file_manager.download_file("5ccc4b34f2df81000b8f49a")
```
Download file MinIO
Once the file is created with S3_MINIO, we will see how to retrieve it from Python and how to generate a Pandas dataframe from it.
To retrieve the file, we first create an instance of the FileManager class, so we import that library. Additionally, to work with the dataframe, we import the pandas library. Let's start by making these imports:
```python
from onesaitplatform.files import FileManager
import pandas as pd
```
Next, we create the instance of FileManager as explained earlier. Once the instance is created, we proceed to download the file using the download_file_minio() function. This function requires the file path, which was provided by the previous request to S3_MINIO. The path has a structure similar to this: 'smartinroldangbucket/s3_queries/prueba_smrg_2025-01-24_12:45:48.046.json' (we use this path for the example). The function returns a FileManager object, so to obtain only the file's information, we append .json_data at the end, so the expression looks as follows:
```python
data = file_manager.download_file_minio(
    filepath="smartinroldangbucket/s3_queries/prueba_smrg_2025-01-24_12:45:48.046.json"
).json_data
```
To ensure the dataframe is displayed correctly, it is necessary to 'flatten' the data. For this, we use the .to_pandasdataframe() function, which can be used in two ways: by directly passing the 'data' returned by the previous function, or by chaining the functions. Both ways are shown below.
Passing the 'data':
```python
data = file_manager.to_pandasdataframe(data)
```
Chaining the functions:
```python
data = file_manager.download_file_minio(
    filepath="smartinroldangbucket/s3_queries/prueba_smrg_2025-01-24_12:45:48.046.json"
).to_pandasdataframe()
```
If no separator is added, the default value is the dot ('.'). However, you can specify the separator you want by using the 'sep' parameter. For example, to change it to a hyphen ('-'), it is done as follows:
```python
# Passing the 'data':
data = file_manager.to_pandasdataframe(data, "-")
# Or chaining:
data = file_manager.download_file_minio(
    filepath="smartinroldangbucket/s3_queries/prueba_smrg_2025-01-24_12:45:48.046.json"
).to_pandasdataframe(sep="-")
```
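To see what the flattening does to nested records, pandas.json_normalize behaves similarly with respect to the separator (the library's own helper may differ in details); nested keys become compound column names joined by the chosen separator:

```python
import pandas as pd

# A nested record like the ones returned from the Real Time DB
records = [{"name": "Riviera Caterer",
            "address": {"zipcode": "11224", "borough": "Brooklyn"}}]

flat_dot = pd.json_normalize(records)            # default separator: '.'
flat_dash = pd.json_normalize(records, sep="-")  # custom separator: '-'

print(list(flat_dot.columns))
print(list(flat_dash.columns))
```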
At this point, we already have the information structured in a way that allows the dataframe to be displayed correctly. Now, if we execute it:
```python
df = pd.DataFrame(data)
print(df)
```
The dataframe with the flattened information will then be printed on screen.
BaseModelService
BaseModelService is a class that manages the whole lifecycle of ML/DL models that can be trained and deployed in Onesait Platform. It is a generic base class intended to be the parent of more specific classes that deal with specific models. For example, to manage a Sentiment Analysis model, a SentimentAnalysisModelService class can be created; this second class must inherit from BaseModelService. BaseModelService is in charge of the interaction with the different components of the Platform: getting datasets from the File Repository or from ontologies, saving the resulting models in the File Repository, reporting the creation of a new model in the corresponding ontology, and so on. The developer of the specific SentimentAnalysisModelService class doesn't need to worry about those issues and can focus on the training and inference code:
```python
from onesaitplatform.model import BaseModelService


class SentimentAnalysisModelService(BaseModelService):
    """Service for models of Sentiment Analysis"""

    def __init__(self, **kargs):
        """ YOUR CODE HERE """
        super().__init__(**kargs)

    def load_model(self, model_path=None, hyperparameters=None):
        """Loads a previously trained model and saves it as one or more object attributes"""
        """ YOUR CODE HERE """

    def train(self, dataset_path=None, hyperparameters=None, model_path=None):
        """
        Trains a model given a dataset and saves it in a local path.
        Returns a dictionary with the obtained metrics
        """
        """ YOUR CODE HERE """
        return metrics

    def predict(self, inputs=None):
        """Predicts given a model and an array of inputs"""
        """ YOUR CODE HERE """
        return results
```
Once the class has been implemented, it can be used to build an object able to train models and predict with them:
```python
PARAMETERS = {
    'PLATFORM_HOST': "lab.onesaitplatform.com",
    'PLATFORM_PORT': 443,
    'PLATFORM_DIGITAL_CLIENT': "SentimentAnalysisDigitalClient",
    'PLATFORM_DIGITAL_CLIENT_TOKEN': "534f2eb845c746bd9a50cfab30273317",
    'PLATFORM_DIGITAL_CLIENT_PROTOCOL': "https",
    'PLATFORM_DIGITAL_CLIENT_AVOID_SSL_CERTIFICATE': True,
    'PLATFORM_ONTOLOGY_MODELS': "SentimentAnalysisModels",
    'PLATFORM_USER_TOKEN': "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJwcmluY2lwYWwiOiJianJhbWlyZXoiLCJjbGllbnRJZCI6Im9uZXNhaXRwbGF0Zm9ybSIsInVzZXJfbmFtZSI6ImJqcmFtaXJleiIsInNjb3BlIjpbIm9wZW5pZCJdLCJuYW1lIjoiYmpyYW1pcmV6IiwiZXhwIjoxNjE3ODI2NjkzLCJncmFudFR5cGUiOiJwYXNzd29yZCIsInBhcmFtZXRlcnMiOnsidmVydGljYWwiOm51bGwsImdyYW50X3R5cGUiOiJwYXNzd29yZCIsInVzZXJuYW1lIjoiYmpyYW1pcmV6In0sImF1dGhvcml0aWVzIjpbIlJPTEVfREFUQVNDSUVOVElTVCJdLCJqdGkiOiJmNGM2NDUzZC0xYTEyLTRkMGUtYTVlNy05ZmNlMDY4OTY1NDYiLCJjbGllbnRfaWQiOiJvbmVzYWl0cGxhdGZvcm0ifQ.Nz5cDvMjh361z4r6MMD2jUOpYSmUKVLkMThHDK0sg6o",
    'TMP_FOLDER': '/tmp/',
    'NAME': "SentimentAnalysis"
}

sentiment_analysis_model_service = SentimentAnalysisModelService(config=PARAMETERS)
```
```python
MODEL_NAME = 'sentiment_analysis'
MODEL_VERSION = '0'
MODEL_DESCRIPTION = 'First version of the model for sentiment analysis'
DATASET_FILE_ID = '605360b7cfb6d70134a3b1a0'
HYPERPARAMETERS = {
    'NUM_WORDS': 10000,
    'BATCH_SIZE': 16,
    'EPOCHS': 10,
    'DROPOUT': 0.2,
    'LEARNING_RATE': 0.001,
}

# Train a model from a dataset stored in the File Repository
sentiment_analysis_model_service.train_from_file_system(
    name=MODEL_NAME,
    version=MODEL_VERSION,
    description=MODEL_DESCRIPTION,
    dataset_file_id=DATASET_FILE_ID,
    hyperparameters=HYPERPARAMETERS
)
```
```python
# Predict with the trained model
sequences = ['Esta es una opinión muy buena', 'Esta es una opinión muy mala']
results = sentiment_analysis_model_service.predict(inputs=sequences)
```