Interpreter configuration in the Notebooks module
The platform's Notebooks engine is based on Zeppelin. In Zeppelin, support for the different languages is provided through interpreters.
Zeppelin structure
To understand how Zeppelin works, it helps to look at its structure first (see the Zeppelin structure explanation).
Zeppelin has three main components:
- Notebooks.
- Managers (Java virtual machines that manage executions and hold the context).
- Interpreters for each language.
When a paragraph is executed, the flow is as follows:
- The notebook sends the code to the manager.
- The manager handles the context and sends the code to the corresponding interpreter for execution.
- The interpreter executes the code and returns the result to the manager.
- The manager returns the result to the notebook.
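The four steps above can be sketched as a small toy model. This is only an illustration under stated assumptions: the classes `Notebook`, `Manager`, and `Interpreter` and their methods are hypothetical names chosen for this example and are not Zeppelin's actual API. The manager is modeled as a plain Python object rather than a Java virtual machine.

```python
# Toy model of the paragraph execution flow. All names are hypothetical;
# this is not Zeppelin's API.


class Interpreter:
    """Stand-in for a language interpreter process."""

    def __init__(self, language):
        self.language = language

    def execute(self, code, context):
        # Step 3: run the code against the context supplied by the manager
        # and hand back whatever the paragraph stored under "result".
        exec(code, context)
        return context.get("result")


class Manager:
    """Stand-in for the manager that owns the context."""

    def __init__(self):
        self.context = {}
        self.interpreters = {"python": Interpreter("python")}

    def run(self, language, code):
        # Step 2: pick the interpreter for the language and forward the code.
        interpreter = self.interpreters[language]
        # Step 4: return the interpreter's result to the caller (the notebook).
        return interpreter.execute(code, self.context)


class Notebook:
    def __init__(self, manager):
        self.manager = manager

    def run_paragraph(self, language, code):
        # Step 1: the notebook sends the code to the manager.
        return self.manager.run(language, code)


notebook = Notebook(Manager())
notebook.run_paragraph("python", "i = 6")
output = notebook.run_paragraph("python", "result = i + 1")
print(output)  # prints 7
```

Because the context lives in the manager and not in the notebook, the second paragraph can read the variable `i` defined by the first one.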
Managers and interpreters can be configured (Interpreter Binding Mode) depending on how they are used (by a single data scientist, by several, and so on) and on the available infrastructure.
The meaning of these configurations is explained below.
Interpreter Settings
The interpreter instance configuration refers to how the interpreters are executed within the installation.
Each interpreter corresponds to a process running in the system, which can be created globally (a single process), per user (one process per user) or per note (one process per notebook).
The features of each option are shown below.
Feature per interpreter instance | Globally | Per Note | Per User |
---|---|---|---|
Parallel execution | Not allowed: All notebooks call the same interpreter process to execute the code. That is to say, there is a common execution queue for all the paragraphs of all the notebooks. | Allowed: Each notebook has an individual process that executes the code. | Allowed (per user): Each user has an individual process that executes the code of all the notebooks belonging to that user, so two notebooks belonging to the same user cannot run in parallel but two notebooks belonging to different users can. |
Shared variables | All variables are accessible from all notebooks because they are all executed by the same process. This means zero security between notebooks. | The variables of each notebook are only visible from that notebook. This means maximum security between notebooks. | Variables are visible from every notebook of the same user. This means medium security between notebooks. For example, if one notebook of user1 defines a variable i = 6, any other notebook of user1 can change its value, for instance with i = 7. |
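The difference between the three options can be illustrated with a minimal sketch in which each interpreter process is modeled as a dictionary of variables. The names `process_key`, `ToyInterpreterPool`, and the mode strings `"globally"`, `"per_user"`, and `"per_note"` are assumptions made for this example, not Zeppelin identifiers.

```python
# Toy model of interpreter instantiation. Each "process" is just a dict
# holding the variables it has seen. All names here are hypothetical.


def process_key(mode, user, note):
    """Return which interpreter process serves a given user and notebook."""
    if mode == "globally":
        return "shared"          # one process for everything
    if mode == "per_user":
        return f"user:{user}"    # one process per user
    if mode == "per_note":
        return f"note:{note}"    # one process per notebook
    raise ValueError(f"unknown mode: {mode}")


class ToyInterpreterPool:
    def __init__(self, mode):
        self.mode = mode
        self.processes = {}  # process key -> variables of that process

    def run(self, user, note, code):
        key = process_key(self.mode, user, note)
        variables = self.processes.setdefault(key, {})
        exec(code, variables)
        return variables


# Per user: two notebooks of user1 share variables, user2 sees none of them.
pool = ToyInterpreterPool("per_user")
pool.run("user1", "noteA", "i = 6")
pool.run("user1", "noteB", "i = 7")
visible = pool.run("user1", "noteA", "pass")["i"]
print(visible)  # prints 7
print("i" in pool.run("user2", "noteC", "pass"))  # prints False
```

Switching the mode to `"per_note"` in this sketch gives each notebook its own dictionary, and `"globally"` makes every notebook of every user read and write the same one.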
Manager Configuration
The manager's instance configuration refers to how managers are executed within the installation.
For the Per Note and Per User configurations, two manager instance options are available:
- Isolated per note.
- Scoped per note.
These configurations determine how the manager or managers handle the context and send the code to the interpreters for execution.
The features of each option are shown below.
Manager instance feature | Isolated per note | Scoped per note |
---|---|---|
Shared context | Non-shared context. Each notebook has a manager that manages a single context and sends the code to the interpreters. Variables saved in context are only accessible from within the same notebook. If the manager fails, it only affects the processes of that notebook. | Context shared between interpreters. Notebooks have a common manager that manages a context and sends the code to the interpreters. If this manager fails, it affects all notebooks. |
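The failure behaviour described in the table can be sketched as follows. This is a simplified model under the assumption that a manager is a single object that is either shared or not. `ToyManager`, `assign_managers`, and `affected_by_failure` are hypothetical helpers written for this illustration only.

```python
# Toy model of manager instantiation: isolated gives every notebook its own
# manager, scoped gives all notebooks one common manager. Names are hypothetical.


class ToyManager:
    def __init__(self):
        self.context = {}  # variables saved in this manager's context


def assign_managers(mode, notes):
    """Map each notebook to the manager that serves it."""
    if mode == "isolated":
        return {note: ToyManager() for note in notes}
    if mode == "scoped":
        common = ToyManager()
        return {note: common for note in notes}
    raise ValueError(f"unknown mode: {mode}")


def affected_by_failure(managers, failed_note):
    """List the notebooks that lose their manager if failed_note's manager fails."""
    failed = managers[failed_note]
    return sorted(note for note, manager in managers.items() if manager is failed)


notes = ["a", "b", "c"]
print(affected_by_failure(assign_managers("isolated", notes), "a"))  # prints ['a']
print(affected_by_failure(assign_managers("scoped", notes), "a"))    # prints ['a', 'b', 'c']
```

In this sketch the trade-off is visible directly: the isolated option limits a failure to one notebook at the cost of one manager per notebook, while the scoped option needs a single manager but exposes every notebook to its failure.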
Visual schema
Visually, the structure corresponds to:
Globally shared | Per Note/ User Scoped | Per Note/ User Isolated |
---|---|---|