[tech] Workflow basics with AWS Step Functions
I’ve been working on a few systems that are modeled as workflows and wanted to capture notes on the topic. This will likely be a short series that explains the basics and then some advanced patterns for using workflows. I’ll use AWS Step Functions for the examples.
Workflows
Wikipedia defines a workflow as:
A workflow consists of an orchestrated and repeatable pattern of activity, enabled by the systematic organization of resources into processes that transform materials, provide services, or process information
This general definition can be used across various industries.
For software systems we can use a similar definition, framed around the input and output of data rather than materials: a workflow consists of an orchestrated and repeatable pattern of complex activities that transform data or integrate with other systems, synchronously or asynchronously.
Asynchronicity is a strong indicator that an operation is complex enough to need a workflow. Simpler operations, like fetching data from a database, are usually synchronous and latency sensitive, and are not workflows. Workflows usually run for longer, from a few seconds to days. Workflow services usually provide multiple integrations so that new workflows can easily be built on other services, or can even delegate to humans for input. Workflows can also integrate with other workflows.
The diagram above shows a stock trading workflow. It is a visual representation of an executable workflow in AWS Step Functions.
When the workflow is executed from the start, you can picture it transitioning from one step or activity to the next. On completion of each activity, the state of the system can be captured, so workflows also lend themselves well to orchestrating state transitions of a system.
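To make the idea of step-by-step transitions concrete, here is a minimal sketch in plain Python. It does not call AWS at all; the step names, the fixed price and the run_workflow helper are all hypothetical and only illustrate capturing the state after each activity completes.

```python
from typing import Any, Callable, Dict, List, Tuple

# Each step is a (name, function) pair; the function maps state -> new state.
Step = Tuple[str, Callable[[Dict[str, Any]], Dict[str, Any]]]


def run_workflow(steps: List[Step], initial: Dict[str, Any]):
    """Run the steps in order and record a copy of the state after each one."""
    state = dict(initial)
    history = [("start", dict(state))]
    for name, activity in steps:
        state = activity(state)
        history.append((name, dict(state)))
    return state, history


# Hypothetical activities for a toy stock trade (assumed, not from the diagram).
def check_price(state):
    return {**state, "price": 100}


def place_order(state):
    return {**state, "order_total": state["price"] * state["quantity"]}


final_state, history = run_workflow(
    [("CheckPrice", check_price), ("PlaceOrder", place_order)],
    {"symbol": "XYZ", "quantity": 3},
)
```

The history list plays the role of the captured state: one snapshot per completed activity, in execution order.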
AWS Step Functions
Let’s continue looking at workflows as implemented in AWS Step Functions. Step Functions lets you create workflows, which it calls state machines. Each state machine is made up of units of work called states. In the diagram above, a single state machine is depicted with multiple tasks. Individual states can make decisions based on their input, perform actions, and pass output to other states.
States come in several primitive types:
Choice - Similar to a switch statement with conditions. Boolean, numeric, string and date (timestamp) comparisons are possible.
Fail or Succeed - To reach a terminal state in a workflow.
Pass - To inject fixed data or a comment, or to do a basic input-to-output transformation.
Wait - To add a delay.
Parallel - To run multiple branches, each with its own states, on the same input.
Map - To iterate over the items in an array and apply a state to each item.
Task - Represents a single unit of work performed by a state machine. Work is done by:
Invoking a Lambda function
Calling another AWS service through a service integration
Performing an activity, which is a mechanism for handing work to a worker hosted on compute other than Lambda
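As a rough illustration of how a few of these types fit together, the sketch below builds a small definition as a Python dict and serializes it to JSON. The state names, the 1000 threshold and the error name are made up for this example; the field names (StartAt, States, Type, Choices, Variable, NumericGreaterThan, Default, Seconds, Error, Cause) follow Amazon States Language as I understand it. The helper only checks that every transition points at a state that exists.

```python
import json

# Hypothetical state machine: state names and values are invented for illustration.
definition = {
    "Comment": "Toy example of Choice, Wait, Succeed and Fail states",
    "StartAt": "CheckAmount",
    "States": {
        "CheckAmount": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.amount", "NumericGreaterThan": 1000, "Next": "Reject"}
            ],
            "Default": "CoolDown",
        },
        "CoolDown": {"Type": "Wait", "Seconds": 5, "Next": "Done"},
        "Done": {"Type": "Succeed"},
        "Reject": {
            "Type": "Fail",
            "Error": "AmountTooLarge",
            "Cause": "Amount exceeds 1000",
        },
    },
}


def referenced_states(defn):
    """Collect every state name that some transition points to."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return targets


# Names that are referenced but never defined; empty means the graph is consistent.
missing = referenced_states(definition) - set(definition["States"])
asl_json = json.dumps(definition, indent=2)
```

The resulting asl_json string is the kind of document a state machine is defined with, although a real definition would normally include Task states that do the actual work.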
Common fields in a state are:
Name - the key that identifies the state within the state machine
Type
Next - the name of the state to transition to (not used on terminal states)
Comment (optional)
InputPath - a path that selects the part of the input to be used for processing
OutputPath - a path that selects the part of the output to be passed to the next state
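The effect of InputPath and OutputPath can be sketched with a tiny path resolver. This is a simplification under stated assumptions: the select and run_state helpers are hypothetical, they only understand "$" and dotted keys such as "$.order.items", and they ignore other filters that the real service applies (ResultPath, for instance).

```python
from typing import Any, Callable


def select(path: str, data: Any) -> Any:
    """Resolve "$" or a dotted path like "$.order.items" against data.

    Only plain dotted keys are handled; real JSONPath supports much more.
    """
    if path == "$":
        return data
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    current = data
    for key in path[2:].split("."):
        current = current[key]
    return current


def run_state(work: Callable[[Any], Any], state_input: Any,
              input_path: str = "$", output_path: str = "$") -> Any:
    """Apply InputPath, do the state's work, then apply OutputPath."""
    effective_input = select(input_path, state_input)
    result = work(effective_input)
    return select(output_path, result)


# Hypothetical state: counts the items of an order and also emits a debug field.
output = run_state(
    work=lambda order: {"summary": {"count": len(order["items"])}, "debug": "ignored"},
    state_input={"order": {"items": ["a", "b"]}, "trace_id": "t-1"},
    input_path="$.order",
    output_path="$.summary",
)
```

Here the work function never sees trace_id, because InputPath narrowed the input to the order, and the next state never sees debug, because OutputPath kept only the summary.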
The state machine receives its data as JSON input and passes it along the state transitions. Each state may manipulate the data until the execution reaches a terminal state. For each execution of a state machine, the input and output of the complete state machine and of each intermediate state transition are also captured. This is similar to a cheap transaction log or debug view of the execution.
State Machines by example
The example above is a toy workflow that shows some basic workflow concepts. The entire code of the workflow is available on GitHub. It is written in Amazon States Language, a JSON-based language.
The first state is a “Fibonacci generator”, a Task state that calls a Lambda function synchronously. It is configured with the resource "arn:aws:states:::lambda:invoke" for Lambda invocation, and its parameters include the ARN of the function to invoke, assuming the correct IAM permissions are set up. State input becomes the Lambda input, and the Lambda output becomes the state output, with optional additional filtering. The Lambda returns an array of numbers, which becomes the input to the next state, the “Mapper”. In one execution the output is the JSON array [0, 1, 1, 2]. The full code for the workflow also includes the default retry policy for calling a Lambda.
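For readers who want to picture the pieces, here is a sketch of what such a Lambda handler and Task state might look like. This is not the code from the repository: the count input field, the placeholder function ARN, the OutputPath and the retry values are all assumptions chosen for illustration, and the state definition is written as a Python dict rather than raw JSON.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0."""
    seq = []
    a, b = 0, 1
    for _ in range(count):
        seq.append(a)
        a, b = b, a + b
    return seq


def handler(event, context):
    """Lambda entry point: the state input arrives as `event`.

    The "count" field is a hypothetical input; it defaults to 4.
    """
    return fibonacci(event.get("count", 4))


# Sketch of the Task state. Every concrete value below is illustrative.
fibonacci_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
        # Placeholder ARN, not the author's function.
        "FunctionName": "arn:aws:lambda:us-east-1:123456789012:function:fibonacci",
        # Pass the whole state input to the function as its payload.
        "Payload.$": "$",
    },
    # Keep only the function's return value from the invoke response.
    "OutputPath": "$.Payload",
    # Assumed retry settings; the real workflow's policy may differ.
    "Retry": [
        {
            "ErrorEquals": ["Lambda.ServiceException", "Lambda.SdkClientException"],
            "IntervalSeconds": 2,
            "MaxAttempts": 6,
            "BackoffRate": 2,
        }
    ],
    "Next": "Mapper",
}
```

Calling handler({"count": 4}, None) locally returns the same four numbers as the execution described above.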
The next state is the Mapper, which calls the Adder state for each item in the array, like a for loop in a typical programming language. The Adder is a Pass type state, but it transforms each number by adding 7 and sets the result at a new JSON path called added.
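The behaviour of the Mapper and Adder can be mimicked in a few lines of plain Python. This is only a local simulation under my own assumptions: the mapper and adder functions are hypothetical stand-ins, and the exact shape of the real output (one object with an added key per item) is a guess based on the description above.

```python
def adder(item):
    """Mimic the Adder state: add 7 and store the result under "added"."""
    return {"added": item + 7}


def mapper(items, item_state):
    """Mimic a Map state: apply one state to every array item, keeping order."""
    return [item_state(item) for item in items]


# Input taken from the Fibonacci generator's output in one execution.
result = mapper([0, 1, 1, 2], adder)
```

With that input, result holds one object per array item, each with the item plus 7 under added.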
In this contrived example, I showed a basic Lambda integration with a loop and a simple data transformation. It is also possible to build these workflows with a visual editor that generates the code behind the scenes. Workflows like these become the glue for complex asynchronous tasks in various application integration patterns, and reliable workflow execution becomes a valuable primitive for designing distributed systems.
Next
Continue reading on the advanced topics below to learn more.
Next in series