How to use the DataRefiner to ingest a file?
EN | ES
Introduction
This platform module can be accessed by the ADMINISTRATOR and ANALYTICS roles, from the option
Upload file to DataRefiner
Once I access the module from the main page I will select the option
which will enable me the option to load a file:
For the file upload example I will use this:
https://www.iwight.com/documentlibrary/download/transparency-data-april-2013-csv
So I'll select this file as URL and hit Next:
Transforming the loaded data
Once the file is loaded I will find a UI like this, from which I can do some transformations prior to loading the file in my project
For example I can ignore the first line if it has the headers or discard blank rows:
Or try transforming the columns to their type ... we are not going to do this because we will do it at a later stage.
We will conclude by creating the project:
Once the project is created I find this UI:
Where I can see the first records of the file / project, in this example we have 11285 records.
We also see that each column has a drop-down menu and that a box appears to the left to be able to go back and see the Facets.
If I go to the Amount column I can see the options I have: Facet, Filters, edit cells, columns, transpose, sort, reconcile.
I'll start by saying that this column is of type numeric in Edit cells> Common transforms> to number
Now I will see the data formatted as numbers, however this does not mean that they are all numbers,
now I will apply a Facet by selecting Facet> Numeric Facet
In the left part I will see that I have 2244 data that is not numeric:
If I just leave the Non-numeric check in my table I will see 10 of these records. As you can see, the problem is that there is a comma to separate the thousands and the tool cannot transform it to a number:
So I can transform it with the tool and its GREL (Google Refine Expression Language) language. I will go to the Amount column, select Edit cells> Transform and in the transformation I will indicate value.replace (',', '')
If I apply the filter and apply Edit cells> Commons Transforms> To number I can see since all the data in my column is numeric.
I can apply other more advanced transformations, such as deriving a column from another applying transformations, for that I will put myself in the Date column, I will select Edit column> Add column based on this column
In this one I will apply the date format transformation:
value.toDate ('dd.mm.yy'). toString ('yyyy-mm-dd')
And then I'll apply Edit cells> Commons Transformations> To date
I can also combine data, reconcile it, ...
Extract transformations
A very interesting concept is that I can extract the transformations to be able to apply them on other data of the same type (the Excel of the following year in this example), this is done from Undo-redo / Extract
The result is a JSON that I can save:
And that from the Apply option I can paste and apply to other files:
Export data
Finally we will see how to export the data of my project. From the Export option I can export in numerous formats: