StreamSets version update in DataFlow

Available from version 3.0.0

Release 3.0.0 ships a new DataFlow image that upgrades StreamSets from version 3.10 to 3.18.1. The upgrade brings several improvements to the StreamSets library nodes, as well as bug fixes.

The changes introduced between versions 3.10 and 3.18 are listed at the following link:

https://streamsets.com/documentation/datacollector/3.18.x/help/datacollector/UserGuide/WhatsNew/WhatsNew_Title.html#concept_hz3_5fk_fy

This new version includes the orchestrator library, which lets you schedule the execution of flows in the DataFlow instances that you have deployed on the platform. The library consists of the following nodes:

  • Cron Scheduler: This origin-type node periodically generates a record according to the configured schedule. The schedule is defined by a cron expression, which can be entered manually or generated automatically in the node configuration UI.

  • Start Pipelines (origin and processor): These nodes let you start one or more flows in parallel.

  • Wait for Pipelines (processor): This processor-type node waits for the flows it receives as input to finish.
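
The way these three nodes cooperate can be pictured with a small sketch in plain Python. It is only a conceptual illustration and does not use the StreamSets API: the function names (run_pipeline, start_pipelines, wait_for_pipelines, on_schedule_tick) and the pipeline identifiers are hypothetical, and a thread pool stands in for the flows running in parallel. Each call to on_schedule_tick plays the role of one record emitted by the scheduler.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_pipeline(pipeline_id: str) -> str:
    """Hypothetical stand-in for a running flow; it only sleeps briefly."""
    time.sleep(0.01)
    return f"{pipeline_id}: FINISHED"


def start_pipelines(executor, pipeline_ids):
    """Start every flow in parallel and return a handle for each one."""
    return {pid: executor.submit(run_pipeline, pid) for pid in pipeline_ids}


def wait_for_pipelines(handles):
    """Block until every started flow has finished, then collect the results."""
    wait(handles.values())
    return {pid: future.result() for pid, future in handles.items()}


def on_schedule_tick(pipeline_ids):
    """Work triggered by one scheduler record: start the flows, then wait."""
    workers = max(1, len(pipeline_ids))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        handles = start_pipelines(executor, pipeline_ids)
        return wait_for_pipelines(handles)


if __name__ == "__main__":
    # "ingest" and "enrich" are made-up flow names used only for this demo.
    print(on_schedule_tick(["ingest", "enrich"]))
```

Keeping the start step and the wait step separate mirrors the split between the two node types: anything placed between them would run while the flows are still executing, and anything placed after the wait step runs only once all of them have finished.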

 

When upgrading DataFlow from a previous version to the new one, you may need to perform post-upgrade tasks. Pay special attention to the following: