Use of a time series prediction model
Introduction
This tutorial explains the main steps to follow to use a trained model as a service, so that the model can be run with data passed in by a user or by an application.
The model taken to production will be the one created in the tutorial Creation of a time series prediction model with Prophet. It is a predictive model built with Prophet; these types of models are used to make predictions based on a time series. In this case, NO2 data from a Madrid weather station was used to train the model.
Model as a service
The objective is to make the model's functionality available by passing parameters that define the time interval over which to predict. These calls to the model can be made by a user or by an application.
To expose the trained model as a service, you will use the concepts explained in Running a Notebook via its REST API, which shows how to execute a Notebook through its REST API.
The results of the model will be presented graphically, so the output of each execution will be a graph with the prediction data for the given time series. In this case, use the platform's model toolkit to call the model and see the results.
Access to the Platform's Notebooks
Firstly, you must access the Platform's CloudLab Environment with an Analytics role.
Once in the platform control panel, access the Notebooks in the Analytics Tools menu.
Lastly, create a new notebook.
The Notebook
Within the notebook, several main tasks are performed:
- Parameter reception
- Environment activation
- Model load
- Prediction and formatting of output data
Parameter reception
When a notebook is executed through a call to its REST API, all of its paragraphs are executed: the first one is used for parameter reception, and the last one for the results output.
In this entry paragraph, we obtain the parameters from the "z" variable (the Zeppelin-Context object) and set them as context variables. This step is important because it makes the parameters available to any interpreter from any paragraph.
The parameters for this model are:
- startdate: Prediction start date (yyyy-mm-dd).
- enddate: Prediction end date (yyyy-mm-dd).
Some example values are also provided to test it, as in the sketch below.
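As a reference, this is a minimal sketch of what the entry paragraph might look like, assuming Zeppelin's `z.input()` and `z.put()` context API (the default dates are only illustrative):

```python
%python
# Read the parameters passed in the REST call (or use illustrative defaults).
startdate = z.input("startdate", "2019-01-01")
enddate = z.input("enddate", "2019-01-07")

# Store them in the Zeppelin context so that any interpreter in any
# paragraph can retrieve them later with z.get().
z.put("startdate", startdate)
z.put("enddate", enddate)
```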
Environment activation
Activate the Python virtual environment that has the necessary libraries available. This environment is created in the tutorial mentioned in the introduction.
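For example, an environment created with `virtualenv` can be activated from within a Python paragraph through its `activate_this.py` script; the path below is only a placeholder for wherever the environment was created:

```python
# Activate the virtual environment containing Prophet and its dependencies.
# The path is illustrative; use the location of the environment created
# in the previous tutorial.
activate_this = "/datadrive/envs/prophet/bin/activate_this.py"
with open(activate_this) as f:
    exec(f.read(), {"__file__": activate_this})
```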
Model load
The necessary imports are made and variables are set to access the files easily. The files correspond to those generated in the previous tutorial (see the sketch after this list):
- Historical data used for training (csv) (necessary in this case for data representation).
- Serialized trained model (pkl).
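A sketch of what this paragraph might contain, with file names and paths as illustrative placeholders:

```python
import pickle
import pandas as pd

# Artifacts generated in the previous tutorial (paths are placeholders).
HISTORICAL_CSV = "/datadrive/data/no2_madrid.csv"
MODEL_PKL = "/datadrive/models/no2_prophet.pkl"
```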
Before loading the model, perform a date validation; the dates are adjusted in case the parameters are incorrect. Several rules were established (see the sketch after this list):
- Correct order of dates.
- Dates after 2011.
- Interval less than one week.
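A hypothetical helper implementing these rules might look like this (the function name and the fallback behaviour are assumptions, not the tutorial's exact code):

```python
from datetime import datetime, timedelta

def validate_dates(startdate, enddate):
    """Adjust the dates so they satisfy the three rules above."""
    start = datetime.strptime(startdate, "%Y-%m-%d")
    end = datetime.strptime(enddate, "%Y-%m-%d")
    # Rule 1: correct order of dates.
    if end < start:
        start, end = end, start
    # Rule 2: dates after 2011.
    earliest = datetime(2011, 1, 1)
    start = max(start, earliest)
    end = max(end, earliest)
    # Rule 3: interval of less than one week.
    if end - start > timedelta(days=7):
        end = start + timedelta(days=7)
    return start, end

start, end = validate_dates(startdate, enddate)
```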
Once the parameters are validated, load the model (the file created in the tutorial mentioned in the introduction) from the pkl file.
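Assuming, as in the previous tutorial, that the model was serialized with `pickle`, loading it is a single call:

```python
# Load the trained Prophet model from the serialized .pkl file.
with open(MODEL_PKL, "rb") as f:
    model = pickle.load(f)
```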
Prediction and formatting of output data
Several data sets are required to represent the results:
- Historical data not matching the prediction.
- Historical data matching the prediction.
- Prediction data.
These sets are needed so that the results are represented correctly whatever the choice of parameters.
Historical data is loaded.
The prediction time series for the model is created from the startdate and enddate parameters.
The model then predicts values over the created time series.
A data set (called validation) is created with the historical data that overlaps the prediction in time.
Finally, a set is created with the historical data that falls outside the prediction interval.
These steps are sketched below.
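A sketch of these steps, assuming the CSV uses Prophet's standard `ds`/`y` column names and an hourly frequency (both assumptions, not confirmed by the tutorial):

```python
# Load the historical data used for training.
history = pd.read_csv(HISTORICAL_CSV, parse_dates=["ds"])

# Build the time series to predict from the validated dates.
future = pd.DataFrame({"ds": pd.date_range(start=start, end=end, freq="H")})

# Predict over the created time series.
forecast = model.predict(future)

# Historical data that overlaps the prediction in time ("validation" set).
validation = history[(history["ds"] >= start) & (history["ds"] <= end)]

# Historical data outside the prediction interval.
history_rest = history[(history["ds"] < start) | (history["ds"] > end)]
```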
The final visualization, which corresponds to the last paragraph (used for the output of results in the REST API), is created by joining the different data sets. This way, predictions and historical data are always represented on the dates where they match. In this case, we see an example in which there is only one part with historical data.
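The last paragraph could draw the joined sets with matplotlib, for example (column names and styling are illustrative):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(history_rest["ds"], history_rest["y"], "k.", label="Historical")
ax.plot(validation["ds"], validation["y"], "b.", label="Validation")
ax.plot(forecast["ds"], forecast["yhat"], "r-", label="Prediction")
ax.set_xlabel("Date")
ax.set_ylabel("NO2")
ax.legend()
plt.show()
```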
Access to the Platform's Models
From the platform's control panel, access the Models in the Analytics Tools menu.
Finally, create a new Model.
Model interface
Within the model creation menu, establish the fields by selecting the Notebook that you have created previously.
In this case, create the parameters as shown below:
The parameters must have the same name as those collected in the "z" variable inside the Notebook.
Set the OUTPUT SOURCE to OUTPUT PARAGRAPH ID, pointing to the last paragraph of the Notebook, where the final visualization graphic is created.
NOTE: It may happen that the parameters are not validated correctly (validation error). In that case, the model will be created but without parameters. Just edit the model and set them again.
Model call
Below is an example of a call to a model like the one you just created, using some sample dates.
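For reference, a call of this kind could also be made directly against the underlying Notebook's REST API, as described in the tutorial linked in the introduction. A sketch with Python's requests library, where the host, notebook id and credentials are all placeholders and the endpoint shape follows Zeppelin's REST API for running a notebook:

```python
import requests

host = "https://platform.example.com"  # placeholder
note_id = "2EYUV26VR"                  # placeholder notebook id

# Run the notebook's paragraphs, passing the sample dates as params.
response = requests.post(
    f"{host}/api/notebook/job/{note_id}",
    json={"params": {"startdate": "2019-06-01", "enddate": "2019-06-07"}},
    auth=("user", "password"),         # placeholder credentials
)
print(response.status_code, response.text)
```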