The technology behind the DataRefiner: OpenRefine

Our DataRefiner is based on OpenRefine, to which a set of extensions has been added so that it works with the platform.

OpenRefine is an open-source tool written in Java (BSD-3 license). Through an Excel-style web interface, it lets you load data from different sources and in different formats, then understand, clean, reconcile and improve it.

First, note that OpenRefine is designed so that you run the transformations on your own computer; instead of a rich client application, you work from your browser (although, as always, there are ways to take this approach to the cloud).

OpenRefine is hosted on GitHub: https://github.com/OpenRefine

For more information, see the project wiki:

https://github.com/OpenRefine/OpenRefine/wiki/Documentation-For-Users

A little history

When Google handed the software over to the community, the project struggled to get going. The release history gives an idea of how slow the start was:

https://github.com/OpenRefine/OpenRefine/releases

| YEAR | VERSION | DETAILS |
|------|---------|---------|
| 2013 | Google Refine 2.5 | Last version with Google branding |
| 2015 | OpenRefine 2.6-RC1 | It took 2 years to produce a release candidate, from which no final version ever came out |
| 2017 | OpenRefine 2.7, OpenRefine 2.8 | Finally a release; in fact, two |
| 2018 | OpenRefine 3.0, OpenRefine 3.1 | Five years passed before there was a major version of OpenRefine |
| 2019 | OpenRefine 3.2 | |
| 2020 | OpenRefine 3.3 | |

The current version is 3.4.1 (released at the end of September 2020).

As the release history shows, the project has been active again since 2018/2019.

How to install OpenRefine

As we have said, OpenRefine is designed to run on your local machine, so all you need to do is download the distribution for your operating system.

On the releases page you will find the installers for each one: https://github.com/OpenRefine/OpenRefine/releases/tag/3.4.1
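Once the downloaded executable is running, OpenRefine serves its interface from a small local web server. If the browser does not open by itself, a script can tell you whether anything is listening. The following is a minimal sketch, not part of OpenRefine: it assumes the default address http://127.0.0.1:3333, and the helper name `is_openrefine_up` is made up for this example. It only confirms that an HTTP server answers at that address, not that the server is actually OpenRefine.

```python
import urllib.error
import urllib.request

# Assumption: OpenRefine's default local address. Change it if you start
# OpenRefine on a different host or port.
OPENREFINE_URL = "http://127.0.0.1:3333"


def is_openrefine_up(url: str = OPENREFINE_URL, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` with a 2xx status.

    Hypothetical helper for illustration only. It does not verify that the
    server is really OpenRefine, only that something responds successfully.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status, and similar failures.
        return False


if __name__ == "__main__":
    if is_openrefine_up():
        print(f"A server is responding at {OPENREFINE_URL}")
    else:
        print(f"Nothing is responding at {OPENREFINE_URL}; is OpenRefine running?")
```

If the check fails right after launching, wait a few seconds and try again, since the server may need a moment to start.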

Once the software has been downloaded and the executable launched, a browser window will open at http://127.0.0.1:3333, looking like this: