
Introduction

Starting from the Diabetes dataset, we’re going to generate a model that predicts a quantitative measure of disease progression one year after baseline. We’re going to use:

  • MinIO file system to store the original dataset. We’ll load the file using the Create Entity in Historical Database option.

  • Notebooks module to have a parametric process that gets the data from MinIO, trains and generates the model, and logs everything in MLFlow.

  • Models Manager (MLFlow) to log every experiment run of the notebook and to save the model and the other files produced by the training.

  • Serverless module to create a scalable REST Python function that, using the model, can predict the disease progression.

...

Dataset

The information on the diabetes dataset is the following:

Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after baseline.

...

Note: Each of these 10 feature variables has been mean centered and scaled by the standard deviation times the square root of `n_samples` (i.e. the sum of squares of each column totals 1).
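
For reference, this scaling can be verified with the preprocessed copy of the dataset bundled in scikit-learn, which ships already in this form:

```python
# Sanity check on scikit-learn's preprocessed copy of the diabetes dataset:
# each feature column is mean centered and its sum of squares totals 1.
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
print(np.allclose(X.mean(axis=0), 0.0))        # True: columns are mean centered
print(np.allclose((X ** 2).sum(axis=0), 1.0))  # True: sum of squares per column is 1
```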

Source URL:

https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html

...

(https://web.stanford.edu/~hastie/Papers/LARS/LeastAngle_2002.pdf)

Step 1: Load data into MinIO platform

From the above link, we’re going to get the file from this source: https://www4.stat.ncsu.edu/~boos/var.select/diabetes.tab.txt
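
The upload itself is done through the Create Entity in Historical Database option, but as a reference the same step could be scripted with the MinIO Python client. The endpoint, credentials and bucket name below are placeholders, not values from the platform:

```python
# Sketch: download the diabetes TSV and push it to MinIO.
# Endpoint, credentials and bucket are hypothetical placeholders;
# the tutorial does this through the platform UI instead.
import urllib.request

from minio import Minio

SOURCE_URL = "https://www4.stat.ncsu.edu/~boos/var.select/diabetes.tab.txt"
LOCAL_FILE = "diabetes.tab.txt"

# Download the raw tab-separated file
urllib.request.urlretrieve(SOURCE_URL, LOCAL_FILE)

# Connect to the MinIO server (placeholder endpoint and credentials)
client = Minio(
    "minio.example.com:9000",
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    secure=True,
)

BUCKET = "diabetes-raw"
if not client.bucket_exists(BUCKET):
    client.make_bucket(BUCKET)

# Upload the file as an object in the bucket
client.fput_object(BUCKET, "diabetes.tab.txt", LOCAL_FILE)
```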

...

We can also query this entity through the Presto engine with the query tool:
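
Outside the query tool, an equivalent query could be run with the Presto Python client; the host, user, catalog, schema and entity name below are assumptions that depend on how the Historical Database entity was created:

```python
# Sketch: query the Historical Database entity through Presto.
# Host, port, user, catalog, schema and table name are assumptions;
# in the platform this is normally done from the query tool UI.
import prestodb

conn = prestodb.dbapi.connect(
    host="presto.example.com",
    port=8080,
    user="analytics",
    catalog="minio",
    schema="default",
)

cur = conn.cursor()
cur.execute("SELECT * FROM diabetes LIMIT 10")
for row in cur.fetchall():
    print(row)
```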

...

Step 2: Create a notebook to get the data, train the model and log the experiment

First of all, we create a new notebook: we go to the Analytics Tools option, click the new notebook (+) button and type a name for it.

...

Finally, we evaluate some metrics on the output of the prediction.
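
The notebook code is not reproduced here, but a training and evaluation step along these lines would fit. The regression model (ElasticNet), the train/test split and reading the raw file directly instead of through MinIO are assumptions for the sake of a compact sketch:

```python
# Sketch of the train/evaluate step; model choice, split and local file
# access (instead of MinIO) are assumptions.
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# The diabetes file is tab separated; the last column (Y) is the target
df = pd.read_csv("diabetes.tab.txt", sep="\t")
X = df.drop(columns=["Y"])
y = df["Y"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = ElasticNet(alpha=0.05, l1_ratio=0.5)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, predictions))
print("R2 :", r2_score(y_test, predictions))
```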

...

Step 3: Log training and model data in MLFlow

The Notebooks module is integrated with the MLFlow tracking server, so the only thing we need to do in the notebook is import the necessary MLFlow lib and use the MLFlow tracking functions. That will be done in the import libs section.
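
A minimal sketch of that logging, continuing from the training step above and assuming the tracking URI is already injected by the platform environment (the parameter and metric names are illustrative):

```python
# Sketch: log the run parameters, metrics and model to MLFlow.
# The tracking URI is assumed to be set by the platform; the
# parameter/metric names are illustrative. Continues from the
# training sketch in Step 2 (model, X_test, y_test, predictions).
import mlflow
import mlflow.sklearn
from sklearn.metrics import mean_squared_error, r2_score

with mlflow.start_run():
    mlflow.log_param("alpha", 0.05)
    mlflow.log_param("l1_ratio", 0.5)
    mlflow.log_metric("mse", mean_squared_error(y_test, predictions))
    mlflow.log_metric("r2", r2_score(y_test, predictions))
    # Store the trained model so it can be registered and served later
    mlflow.sklearn.log_model(model, "model")
```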

...

And if we go to the Models tab, we can see it and work with it.

...

Step 4: Create a Serverless Python function that evaluates data against the MLFlow model

With the previously generated model, we’re going to create and deploy a Python function that, given a single or multiple input, can get a prediction using the model.
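
A sketch of such a function is shown below. It assumes the model was registered in MLFlow and that the function receives the input rows as a JSON body; the model URI, the environment variable and the handler signature depend on the Serverless module and are assumptions here:

```python
# Sketch of a serverless handler that scores input rows against the
# MLFlow model. Model URI, environment variable name and the handler
# signature are assumptions; adapt them to the Serverless module.
import json
import os

import mlflow.pyfunc
import pandas as pd

# e.g. "models:/diabetes-model/1" once the model is registered in MLFlow
MODEL_URI = os.environ.get("MODEL_URI", "models:/diabetes-model/1")
model = mlflow.pyfunc.load_model(MODEL_URI)


def handle(event):
    """Expects a JSON body with one record or a list of records."""
    payload = json.loads(event["body"])
    records = payload if isinstance(payload, list) else [payload]
    data = pd.DataFrame(records)
    predictions = model.predict(data)
    return {"predictions": predictions.tolist()}
```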

...