The technology behind our Data Labeling Engine: Label Studio
Platform Release 4.2.0 integrates a data labeling engine for labeling the information stored in the platform, whether that is files (stored in the FileRepository or the Platform MinIO) or Entities stored in the platform repositories.
For this purpose, the Label Studio tool has been integrated.
Label Studio is an open-source data labeling tool. It lets you label data such as audio, text, images, videos and time series through a simple user interface and then export the results to various model formats.
It can be used to prepare raw data or enhance existing training data for more accurate ML models.
Its main features are:
Multiple data types, such as images, audio, text, HTML, time series and video.
Multi-user: with multi-user registration and login, each annotation you create is linked to your account.
Multiple projects to work on all your datasets in a single instance.
Configurable label formats that allow you to customize the visual interface to meet your specific labeling needs.
Import from files (JSON, CSV, TSV, RAR and ZIP) or from cloud storage such as Amazon S3 and Google Cloud Storage.
Export through the label-studio-converter module, a library that converts Label Studio's internal JSON-based format to general-purpose formats (JSON, CSV, TSV) or to model-specific formats such as CoNLL for text labeling models, or Pascal VOC and COCO for computer vision models.
Integration with machine learning models to visualize and compare predictions from different models and perform pre-labeling using the Label Studio SDK.
REST API to incorporate it into your data pipeline.
Templates for labeling (see the Label Studio Gallery of Labeling Templates): Label Studio includes a variety of templates to assist in labeling data, and it also lets you create your own using a purpose-built configuration language.
Comparison of predictions from different models.
Incremental tagging: starting with a small number of attributes and adding more over time.
Large community on GitHub: HumanSignal/label-studio.
Multiple ways to install it, including deployment on cloud providers.
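To make the configurable label formats and model pre-labeling from the feature list more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the platform's integration code: the tag names (View, Image, Choices, Choice) and the "predictions" layout of an import task follow Label Studio's documented conventions as we understand them, while the helper functions, the labels, the image URL and the model version are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def build_choices_config(labels):
    """Build a labeling config for single-label image classification."""
    view = ET.Element("View")
    # Object tag: displays the data. "$image" reads the "image" key of each task.
    ET.SubElement(view, "Image", name="image", value="$image")
    # Control tag: collects the annotator's answer and points at the object tag.
    choices = ET.SubElement(view, "Choices", name="choice", toName="image")
    for label in labels:
        ET.SubElement(choices, "Choice", value=label)
    return ET.tostring(view, encoding="unicode")


def build_prelabeled_task(image_url, predicted_label, score, model_version):
    """Build one import task that carries a model prediction as a pre-annotation."""
    return {
        "data": {"image": image_url},
        "predictions": [
            {
                "model_version": model_version,  # hypothetical model name
                "score": score,
                "result": [
                    {
                        "from_name": "choice",  # name of the control tag
                        "to_name": "image",     # name of the object tag
                        "type": "choices",
                        "value": {"choices": [predicted_label]},
                    }
                ],
            }
        ],
    }


def check_task_matches_config(config_xml, task):
    """Return True if every prediction result refers to tags declared in the config."""
    root = ET.fromstring(config_xml)
    names = {el.get("name") for el in root.iter() if el.get("name")}
    for prediction in task["predictions"]:
        for item in prediction["result"]:
            if item["from_name"] not in names or item["to_name"] not in names:
                return False
    return True


# Hypothetical example values, for illustration only.
config = build_choices_config(["Cat", "Dog"])
task = build_prelabeled_task(
    "https://example.com/cat.jpg", "Cat", 0.93, "demo-model-v1"
)
print(config)
print(json.dumps(task, indent=2))
```

The config string is what you would paste into a project's labeling setup, and the task dictionary is the kind of JSON you would import so that annotators start from the model's suggestion. Checking the tag names before importing catches the most common mismatch between the two.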