Virtual Buckets on S3

Available since Release 6.0.0

Introduction

This functionality allows a platform administrator to segment an S3 bucket (either AWS S3 or MinIO) into virtual buckets assigned to different platform users.

This way, each virtual bucket can be dedicated to a specific purpose (datamart, staging, ...) without needing separate physical buckets.

How to use it?

Required set-up configuration

To enable this functionality, a few preliminary steps are required:

  • Create a new metastore service in the CaaS, configured for this S3 repository. The platform’s own Presto Metastore image can be used. In this tutorial, we’ll create a new service called presto-metastore-server-aws. The current version of the image is:

presto-metastore-server:5.0.0

Then set the environment variables with the AWS credentials and the service URL:

- MINIO_ROOT_USER → the Access Key

- MINIO_ROOT_PASSWORD → the Secret Key

- MINIO_SERVER_ENDPOINT → the HTTP/HTTPS endpoint of the S3 service

image-20240326-140318.png

As a result, the service will be up and running for AWS:

image-20240326-140736.png
  • Configure the S3 system in the centralized configuration of the platform. In the Platform configuration settings, set the path in the onesaitplatform/env/externals3 property:

image-20240326-142942.png

By default, these settings will already be configured, but they must be set like this:

onesaitplatform/env/database/prestodb-externals3-catalog → Presto catalog name (“externals3” by default)

onesaitplatform/env/database/prestodb-externals3-schema → Presto schema name (“default” by default)

  • Create a new Presto catalog in the platform, with the same name as the onesaitplatform/env/database/prestodb-externals3-catalog setting. It must point to the previously created metastore URL (hive.metastorage.url setting) and use the following settings:

image-20240326-154034.png
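
The set-up values from the steps above can be collected in one place before they are applied. The following is a minimal sketch, not platform code: the helper name `build_metastore_env`, the placeholder credentials and the endpoint URL are hypothetical, while the variable and property names are the ones listed above.

```python
from urllib.parse import urlparse

# Centralized configuration properties and their default values, as listed above.
PRESTO_SETTINGS = {
    "onesaitplatform/env/database/prestodb-externals3-catalog": "externals3",
    "onesaitplatform/env/database/prestodb-externals3-schema": "default",
}


def build_metastore_env(access_key: str, secret_key: str, endpoint: str) -> dict:
    """Return the environment variables for the metastore service.

    Hypothetical helper: it only checks that the values look plausible.
    """
    if not access_key or not secret_key:
        raise ValueError("access key and secret key must not be empty")
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"endpoint must be an http/https URL, got {endpoint!r}")
    return {
        "MINIO_ROOT_USER": access_key,
        "MINIO_ROOT_PASSWORD": secret_key,
        "MINIO_SERVER_ENDPOINT": endpoint,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    env = build_metastore_env("MY_ACCESS_KEY", "MY_SECRET_KEY", "https://s3.amazonaws.com")
    for name, value in env.items():
        # Mask the secret so that it does not end up in logs.
        shown = "****" if name == "MINIO_ROOT_PASSWORD" else value
        print(f"{name}={shown}")
```

Validating the endpoint scheme early catches a common mistake (a host name without http:// or https://) before the service is deployed.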

Creation of AWS S3 Bucket

After completing the previous steps, and with the right AWS credentials, the next step is to access the AWS console:

image-20240326-201026.png

Then, navigate to the Amazon S3 page:

image-20240326-201152.png

Finally, click on the “Create bucket” button to access the creation form. There, we’ll fill in all the fields and create our AWS bucket:

image-20240326-201436.png

After that, the system notifies us that the bucket has been created, and it’ll appear in the bucket list:

image-20240326-201647.png

Associated Virtual Bucket creation in the platform

In the platform, logged in as a user with the administrator role, we’ll navigate to the Virtual Buckets Management section:

image-20240326-202757.png

We’ll click on create and start filling in the input fields. All the AWS buckets are listed in the S3 Bucket Name dropdown:

image-20240326-202842.png

We’ll select the newly created bucket and fill in the remaining input fields:

image-20240326-203015.png

After clicking on the create button, we’ll see the details and the full generated path:

image-20240326-203123.png

At this point, it’s important to authorise a user to use this new Virtual Bucket for entity creation. In this example, entities will be created in the path “data/input” of the AWS bucket onesaitdatamart:

image-20240326-203631.png
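
The exact format of the generated path is only visible in the screenshots, so the following is just an illustrative sketch of how a physical bucket name and a virtual path combine into a single location. The `s3a` scheme, the trailing slash and the function name `virtual_bucket_location` are assumptions, not the platform’s actual behaviour.

```python
def virtual_bucket_location(bucket: str, path: str, scheme: str = "s3a") -> str:
    """Join a physical bucket name and a virtual bucket path into one URI.

    Hypothetical helper: the scheme and the trailing slash are assumptions.
    """
    if not bucket or "/" in bucket:
        raise ValueError("bucket must be a non-empty name without slashes")
    # Drop empty segments so that '/data//input/' and 'data/input' normalise alike.
    segments = [segment for segment in path.split("/") if segment]
    prefix = "/".join(segments)
    return f"{scheme}://{bucket}/" + (prefix + "/" if prefix else "")


if __name__ == "__main__":
    # Bucket and path taken from the example in this tutorial.
    print(virtual_bucket_location("onesaitdatamart", "data/input"))
```

Because every virtual bucket is only a prefix inside the same physical bucket, several of them (for example one for staging and one for a datamart) can coexist without creating additional S3 buckets.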

Entity creation in the Virtual Bucket

Finally, we can create the entity in this Virtual Bucket with the user authorised in the previous step.

After logging in, we navigate to the Virtual Bucket list, which shows the buckets we are allowed to use. With this user, we’re not allowed to create, edit or delete Virtual Buckets:

image-20240326-204200.png

To create the new entity for this Virtual Bucket, we’ll navigate to the “Create entity in Historical Database” option:

image-20240326-204404.png

After that, “Create Entity from Virtual Bucket” will appear:

image-20240326-204513.png

As with historical entities, we’ll fill in the different options of the entity creation form:

image-20240326-210724.png

Further down, we can select the entity location with the Virtual Bucket Identification dropdown. After that, if we update the SQL (with the Update SQL button), we’ll see the new SQL statement with the real bucket location in the EXTERNAL_LOCATION attribute.

image-20240326-211013.png
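
For reference, the Presto Hive connector lets a table point at an existing storage location through the `external_location` table property. The sketch below assembles a statement of that general shape; it is not the SQL that the platform generates. The table name `sensor_readings`, its columns, the `s3a://` URI and the ORC format are hypothetical, while the catalog and schema are the defaults mentioned in the set-up section.

```python
def create_external_table_sql(catalog, schema, table, columns, location, file_format="ORC"):
    """Build a Presto CREATE TABLE statement for a table stored at `location`.

    `columns` is a list of (name, sql_type) pairs. Illustrative helper only.
    """
    column_defs = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # Single quotes inside SQL string literals are escaped by doubling them.
    safe_location = location.replace("'", "''")
    return (
        f"CREATE TABLE {catalog}.{schema}.{table} (\n"
        f"    {column_defs}\n"
        f")\n"
        f"WITH (\n"
        f"    format = '{file_format}',\n"
        f"    external_location = '{safe_location}'\n"
        f")"
    )


if __name__ == "__main__":
    sql = create_external_table_sql(
        catalog="externals3",
        schema="default",
        table="sensor_readings",  # hypothetical entity name
        columns=[("id", "varchar"), ("value", "double")],
        location="s3a://onesaitdatamart/data/input/sensor_readings",  # assumed URI
    )
    print(sql)
```

The point of the sketch is that the entity definition only references the location: the data files themselves stay under the Virtual Bucket path in S3.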

Finally, clicking on the create button creates our new entity:

image-20240326-211112.png

If we navigate to the AWS Console, we can see that the full path for the new entity has been created:

image-20240326-211244.png

Entity operations

It is now possible to insert data, which will appear as a new file in AWS S3:

image-20240326-211703.png

image-20240326-211741.png

The inserted data can then be queried:

image-20240326-211846.png