v0.6.0 Release: Simplified Deployment And Improved Error Management
Building Infinitic, an event-driven orchestration engine providing reliable and scalable workflows, even in distributed environments.
Hi!
It has been a while since my last email, and a lot has happened in the meantime:
Infinitic v0.6.0 is out with simplified deployment and improved error management.
Infinitic had its first formal load test during the last Pulsar Hackathon.
Infinitic will be at the next Pulsar Summit North America!
Simplified deployment
You may remember that Infinitic relies on engines (workflow engines, task engines, tag engines…) that you had to deploy alongside the workers that actually process tasks and workflows. Since v0.6, deployment is much simpler: you only deploy task workers and workflow workers.
Since v0.6, the task and workflow engines are embedded by default in task workers and workflow workers, and each task and each workflow has its own engine instances. The first motivation was to optimize the flow of messages through Pulsar: a large influx of messages for one workflow should not delay another workflow.
Incidentally, this made it possible to simplify deployment by embedding everything a task or workflow needs into its worker. The Pulsar topic architecture is still the same, but a lot of this complexity is now internal to the workers.
Error management
Up to now, a task could not fail in a workflow. Of course, a task could fail, but Infinitic would automatically retry it until it completed and the workflow resumed. This implied that a workflow could be stuck forever if a task failed unrecoverably.
I recognize that there are situations where a workflow needs to continue even if a task could not complete. So, from v0.6, you can catch a task failure directly within the workflow code and react to it. This is quite a sophisticated feature, so I recommend you look at the documentation.
A pleasant consequence of this new feature is that a workflow now raises an exception when stalled due to a task failure. And this exception recursively contains the reason for the chain of failures. It means you can easily find the root cause of the issue (that could be a failed task in a child workflow, for example). Debugging an event-driven architecture is notoriously tricky, and this new feature will tell you where exactly the root failure occurred in your distributed infrastructure.
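To make the idea of a recursive chain of failures concrete, here is a minimal sketch in plain Python. It does not use the Infinitic API: the exception classes, the workflow functions, and the root_cause helper are all hypothetical names chosen for illustration. It only shows the general pattern of a failure being wrapped at each level and then unwrapped to find where things first went wrong.

```python
# Illustrative only: these names are assumptions, not Infinitic classes.
class TaskFailed(Exception):
    """A task that could not complete."""


class WorkflowFailed(Exception):
    """A workflow stalled because something it depended on failed."""


def root_cause(error):
    """Follow the chain of causes down to the first failure."""
    while error.__cause__ is not None:
        error = error.__cause__
    return error


def charge_card():
    raise TaskFailed("payment provider rejected the card")


def payment_workflow():
    # Child workflow: wraps the task failure that stalled it.
    try:
        charge_card()
    except TaskFailed as err:
        raise WorkflowFailed("child workflow 'payment' stalled") from err


def checkout_workflow():
    # Parent workflow: wraps the child workflow failure in turn.
    try:
        payment_workflow()
    except WorkflowFailed as err:
        raise WorkflowFailed("parent workflow 'checkout' stalled") from err


try:
    checkout_workflow()
except WorkflowFailed as top_level:
    cause = root_cause(top_level)
    print(f"{type(cause).__name__}: {cause}")
```

Here the caller only sees the failure of the parent workflow, yet walking the chain leads straight to the failed task inside the child workflow.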
Load testing Infinitic during the last Pulsar Hackathon
Together with Matthieu Jacquet (from marketing company Splio) and John Kinson, I took part in the last Pulsar hackathon. We decided to build a prototype of a benchmark to load-test Infinitic. During this 48h hackathon:
We have written a workflow launcher to be able to dispatch workflows according to a scenario defined in a configuration file;
We have written a workflow emulating a request to 2 different providers for a product, followed by placing an order with the quickest one to respond positively;
We have set up a local Docker configuration to run Prometheus and Grafana to get nice dashboards;
We were able to run all developments above on a hosted Pulsar provided by CloudNative;
We made a nice 10-minute video of this work that I hope to release soon.
I'm thrilled with the results: we were able to consistently reach a completion rate of 20 workflows/second (about 1.7 million workflows per day) with a single worker (a basic MacBook Pro) and a minimal Pulsar cluster.
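As a quick sanity check on the daily figure, the conversion from a per-second rate can be sketched as follows. The function name is just an illustration, and the calculation assumes the rate is sustained for a full 24 hours, which the hackathon runs did not attempt to demonstrate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def workflows_per_day(rate_per_second):
    """Daily completions, assuming the rate holds for a full day."""
    return rate_per_second * SECONDS_PER_DAY


daily = workflows_per_day(20)
print(f"{daily:,} workflows per day")  # prints "1,728,000 workflows per day"
```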
Workflow engines often have a bad reputation for being slow and a single point of failure. Event-driven workflow engines like Infinitic are pushing those limits. I really believe that this technology will be more and more common in the coming years.
Pulsar Summit
Last but not least, I'm thrilled to share that my paper "Infinitic: Building a Workflow Engine on Top of Pulsar" has been accepted for the next Pulsar Summit.
This is a great event for anyone interested in messaging and event streaming to learn the latest Pulsar project updates, use cases, and best practices! I can't recommend joining enough :)
That’s it for today. Be safe.