REST API to automate DataRefiner processes
Description
The platform's APIs continue to expand, now with one that gives access to the DataRefiner component. As part of the ongoing incorporation of Data Governance capabilities into the Platform, this service lets you load data from local files in different formats (Excel, CSV, XML, JSON), work with it to clean, improve, restructure or reconcile it, and get it back as a file in the desired format.
Example of use
The intended use is to process a batch of files that share the same field structure. As an example, an Excel file will be processed, which can be downloaded here; the same operation then only needs to be repeated for the rest of the batch, changing the file in each execution:
In every file of the batch, the first column of all the records will be changed to put its text in capital letters. This is just one of the many possible data transformations.
This is the structure of the Excel file processed in the example:
To test it, we access the APIs in the control panel.
In Swagger, Data Refiner is selected.
In this option, click the Try it out button
and fill in the fields:
Authorization: the OAuth 2 token, which can be obtained here when accessing from the console, or from the Authorization API when accessing through the platform's API.
Operations: the transformation or processing operations to apply to the data. In this case:
{
"op": "core/text-transform",
"engineConfig": {"facets": [],"mode": "row-based"},
"columnName": "Region",
"expression": "value.toUppercase()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Region using expression value.toUppercase()"
}
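When the same transformation has to be prepared for different columns, the operation can be generated instead of edited by hand. The following is a minimal Python sketch, not part of the platform: the helper name `uppercase_operation` is hypothetical, and it only reproduces the JSON shown above with the column name as a parameter.

```python
import json


def uppercase_operation(column_name):
    """Return the text-transform operation from the example for a given column."""
    expression = "value.toUppercase()"
    return {
        "op": "core/text-transform",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "columnName": column_name,
        "expression": expression,
        "onError": "keep-original",
        "repeat": False,
        "repeatCount": 10,
        "description": (
            f"Text transform on cells in column {column_name} "
            f"using expression {expression}"
        ),
    }


# Serialize to the text that would be pasted into the Operations field.
operations_json = json.dumps(uppercase_operation("Region"), indent=2)
print(operations_json)
```

The printed text can be pasted into the Operations field as it is.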
Note: these operations can be written from the OpenRefine documentation, or obtained by first creating a project, performing the transformations on the file and clicking the Extract button. This produces them in JSON format, so they can be applied later to files with the same structure.
exportType:
The format of the resulting file. The possible formats are: csv, tsv, xls, xlsx, ods, html, txt.
For the example, html has been selected.
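When the call is scripted, it can help to check the export format before sending anything. This is a small Python sketch under assumptions: the helper name `check_export_type` is hypothetical, and normalizing to lowercase is a convenience of the sketch, not documented behavior of the service.

```python
# Export formats listed in this article.
EXPORT_TYPES = ("csv", "tsv", "xls", "xlsx", "ods", "html", "txt")


def check_export_type(export_type):
    """Return the normalized export type, or raise ValueError if it is not listed."""
    normalized = export_type.strip().lower()  # assumption: lowercase values are expected
    if normalized not in EXPORT_TYPES:
        allowed = ", ".join(EXPORT_TYPES)
        raise ValueError(f"Unsupported exportType {export_type!r}; expected one of: {allowed}")
    return normalized


print(check_export_type("html"))
```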
file:
A file in one of the available formats is attached:
'text/line-based': Line-based text files
'text/line-based/*sv': CSV / TSV / separator-based files [the separator to be used is specified in the JSON submitted to the options parameter]
'text/line-based/fixed-width': Fixed-width field text files
'binary/text/xml/xls/xlsx': Excel files
'text/json': JSON files
'text/xml': XML files
In this example, an Excel file is attached.
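Before processing a whole batch, the files can be checked against the list of formats above. The sketch below is only an illustration: the mapping from file extension to format identifier is an assumption of this example (the article does not define one), fixed-width files cannot be detected from the extension, and the helper names are hypothetical.

```python
from pathlib import Path

# Format identifiers as listed above; the extension mapping itself is an assumption.
FORMAT_BY_EXTENSION = {
    ".txt": "text/line-based",
    ".csv": "text/line-based/*sv",
    ".tsv": "text/line-based/*sv",
    ".xls": "binary/text/xml/xls/xlsx",
    ".xlsx": "binary/text/xml/xls/xlsx",
    ".json": "text/json",
    ".xml": "text/xml",
}


def guess_format(file_name):
    """Return the format identifier for a file name, or None if it is not recognized."""
    return FORMAT_BY_EXTENSION.get(Path(file_name).suffix.lower())


def unsupported_files(file_names):
    """Return the files of a batch whose extension is not in the mapping."""
    return [name for name in file_names if guess_format(name) is None]


print(unsupported_files(["sales_january.xlsx", "sales_february.xlsx", "notes.pdf"]))
```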
After pressing the EXECUTE button, we obtain the HTML file with the operations applied to it, and can download it.
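The same call can be scripted to process a whole batch. The following Python sketch uses only the standard library and rests on several assumptions: `API_URL` is a placeholder (the real endpoint should be copied from the Swagger page), the token is assumed to be sent with the Bearer scheme, and operations, exportType and file are assumed to travel as multipart form fields with those names. The function names are hypothetical.

```python
import json
import uuid
from pathlib import Path
from urllib import request

# Hypothetical endpoint: replace it with the URL shown in Swagger for your installation.
API_URL = "https://platform.example.com/datarefiner/process"


def build_request(file_name, file_bytes, operations, export_type, token):
    """Build (without sending) a multipart POST with the operations, exportType and file fields."""
    boundary = uuid.uuid4().hex
    parts = []
    text_fields = (("operations", json.dumps(operations)), ("exportType", export_type))
    for name, value in text_fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\nContent-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {
        "Authorization": f"Bearer {token}",  # assumption: Bearer scheme
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return request.Request(API_URL, data=b"".join(parts), headers=headers, method="POST")


def refine_batch(paths, operations, export_type, token, out_dir):
    """Send every file of the batch with the same operations and save each result."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for path in map(Path, paths):
        req = build_request(path.name, path.read_bytes(), operations, export_type, token)
        with request.urlopen(req) as resp:
            (out_dir / f"{path.stem}.{export_type}").write_bytes(resp.read())
```

Only `refine_batch` touches the network; `build_request` can be inspected on its own to compare the generated request with what Swagger sends.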
When we open it, we can check that all the text in each cell of the first column has been transformed to uppercase.
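For a batch, this check can also be automated. The sketch below assumes that the exported HTML is a plain table whose data rows use `<td>` cells inside `<tr>` rows, which should be confirmed against a real export; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FirstColumnCollector(HTMLParser):
    """Collect the text of the first <td> cell of every <tr> row."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._cell_index = -1  # index of the current <td> within its row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_index = -1
        elif tag == "td":
            self._cell_index += 1
            self._in_cell = True
            if self._cell_index == 0:
                self.values.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._cell_index == 0:
            self.values[-1] += data


def first_column_is_uppercase(html_text):
    """Return True when every first-column cell equals its uppercase form."""
    collector = FirstColumnCollector()
    collector.feed(html_text)
    return all(value == value.upper() for value in collector.values)


sample = (
    "<table><tr><th>Region</th><th>Sales</th></tr>"
    "<tr><td>NORTH</td><td>10</td></tr>"
    "<tr><td>south</td><td>7</td></tr></table>"
)
print(first_column_is_uppercase(sample))  # prints False: "south" was not transformed
```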