Interpreter configuration in the Notebooks module

The platform's Notebooks engine is based on Zeppelin. In Zeppelin, support for each language is provided through interpreters.

Zeppelin structure

To understand how Zeppelin works, its structure must be considered first (see the Zeppelin structure explanation).

Zeppelin has three main components:

  • Notebooks.
  • Managers (Java virtual machines that manage executions and hold the context).
  • Interpreters for each language.

When a paragraph is executed, the flow is as follows:

  1. The notebook sends the code to the manager.
  2. The manager manages the context and forwards the code to the corresponding interpreter.
  3. The interpreter executes the code and returns the result to the manager.
  4. The manager returns the result to the notebook.
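
The four steps above can be sketched as a toy model. This is only an illustration of the flow described here, not Zeppelin's actual API: the class names Interpreter and Manager and their methods are hypothetical, and the "interpreter" simply runs Python code with exec.

```python
from dataclasses import dataclass, field


@dataclass
class Interpreter:
    """Toy stand-in for a language interpreter (hypothetical, not Zeppelin's API)."""
    language: str

    def run(self, code: str, context: dict) -> str:
        # Step 3: execute the code against the context held by the manager.
        exec(code, context)
        return f"[{self.language}] ok"


@dataclass
class Manager:
    """Toy manager: owns the context and routes code to the right interpreter."""
    interpreters: dict
    context: dict = field(default_factory=dict)

    def execute(self, language: str, code: str) -> str:
        interpreter = self.interpreters[language]      # step 2: pick the interpreter
        result = interpreter.run(code, self.context)   # step 3: run and collect the result
        return result                                  # step 4: hand it back to the notebook


manager = Manager(interpreters={"python": Interpreter("python")})
result = manager.execute("python", "i = 6")  # step 1: the notebook sends a paragraph
print(result, manager.context["i"])          # prints: [python] ok 6
```

In this sketch the variable i stays in the manager's context after the paragraph finishes, which is why the way managers and interpreters are shared matters in the sections that follow.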

Managers and interpreters can be configured (Interpreter Binding Mode) depending on how they are used (by a single data scientist, by several, and so on) and on the available infrastructure.

These configurations are explained below.

Interpreter Settings

The interpreter instance configuration determines how interpreters are executed within the installation.

Each interpreter corresponds to a process running on the system, which can be created globally (a single process), per user (one process per user) or per note (one process per notebook).

The features of each option are shown below.

Features per interpreter instance

Parallel execution

  • Globally: not allowed. All notebooks call the same interpreter process to execute the code, so there is a single execution queue for all the paragraphs of all the notebooks.
  • Per Note: allowed. Each notebook has its own process that executes the code.
  • Per User: allowed (per user). Each user has their own process that executes the code of all the notebooks belonging to that user, so two notebooks belonging to the same user cannot run in parallel, but two notebooks belonging to different users can.

Shared variables

  • Globally: all variables are accessible from all notebooks because they are all executed by the same process. This means zero security between notebooks.
  • Per Note: the variables of each notebook are only visible from that notebook. This means maximum security between notebooks.
  • Per User: the variables are visible from every notebook belonging to the same user. This means medium security between notebooks. For example, if a variable is set with i = 6, any notebook of user1 may change its value with i = 7.
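
The three options can be summarized by asking which process a given paragraph ends up in. The sketch below is a simplified model, not Zeppelin code: the function names, the mode strings and the user and note names are all hypothetical. Two notebooks that map to the same process share variables and cannot run in parallel; two that map to different processes are independent.

```python
def process_key(mode: str, user: str, note: str) -> str:
    """Return a key identifying the interpreter process that runs a paragraph."""
    if mode == "globally":
        return "shared"            # one process for everything
    if mode == "per_user":
        return f"user:{user}"      # one process per user
    if mode == "per_note":
        return f"note:{note}"      # one process per notebook
    raise ValueError(f"unknown mode: {mode}")


def can_run_in_parallel(mode: str, a: tuple, b: tuple) -> bool:
    """Two (user, note) pairs run in parallel only if they use different processes."""
    return process_key(mode, *a) != process_key(mode, *b)


# Two notebooks of the same user, and two notebooks of different users.
same_user = (("user1", "noteA"), ("user1", "noteB"))
other_user = (("user1", "noteA"), ("user2", "noteC"))

for mode in ("globally", "per_note", "per_user"):
    print(mode,
          can_run_in_parallel(mode, *same_user),
          can_run_in_parallel(mode, *other_user))
```

Under these assumptions the loop reports False for both pairs in the global mode, True for both in the per-note mode, and False then True in the per-user mode, matching the parallel execution rules listed above.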


Manager Configuration

The manager's instance configuration refers to how managers are executed within the installation.

For the Per Note and Per User configurations, there are two options for the manager instance:

  • Isolated per note.
  • Scoped per note.

These configurations determine how the manager or managers handle the context and send the code to the interpreters for execution.

The features of each option are shown below.

Manager instance feature: shared context

  • Isolated per note: non-shared context. Each notebook has a manager that manages a single context and sends the code to the interpreters. Variables saved in the context are only accessible from within the same notebook. If the manager fails, it only affects the processes of that notebook.
  • Scoped per note: context shared between interpreters. Notebooks have a common manager that manages a context and sends the code to the interpreters. If this manager fails, it affects all notebooks.
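
The difference in failure impact can be illustrated with a small model. This is a sketch under stated assumptions, not Zeppelin's implementation: the function names, the manager identifiers and the notebook names are hypothetical. In the isolated case each notebook is assigned its own manager, while in the scoped case all notebooks are assigned one common manager.

```python
def assign_managers(mode: str, notebooks: list) -> dict:
    """Map each notebook to the (hypothetical) id of the manager that serves it."""
    if mode == "isolated":
        return {note: f"manager-{note}" for note in notebooks}
    if mode == "scoped":
        return {note: "manager-common" for note in notebooks}
    raise ValueError(f"unknown mode: {mode}")


def affected_by_failure(assignment: dict, failed_manager: str) -> list:
    """List the notebooks served by the manager that failed."""
    return sorted(note for note, mgr in assignment.items() if mgr == failed_manager)


notebooks = ["noteA", "noteB", "noteC"]

isolated = assign_managers("isolated", notebooks)
scoped = assign_managers("scoped", notebooks)

print(affected_by_failure(isolated, "manager-noteA"))  # only noteA is affected
print(affected_by_failure(scoped, "manager-common"))   # every notebook is affected
```

The trade-off in this model is the usual one: a common manager uses fewer resources and allows a shared context, while one manager per notebook limits the impact of a failure to that notebook.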


Visual schema

Visually, the structure corresponds to:

[Figure: diagrams of the three configurations: Globally shared, Per Note/User Scoped, Per Note/User Isolated]