Workflows as functions
How a functional dataflow language can simplify the expression of workflows
Workflows have traditionally been considered distinct from functions, primarily because workflows aren't typically thought of as 'returning' values. However, this distinction may be artificial - workflows do produce outputs, by rearranging the global state of a system. Every task a person performs within an organization is meant to arrange that organization's global state in some way. When you execute a recipe, you first gather the ingredients. Why? So that you have the ingredients. When you boil the potatoes, you do so in order to have boiled potatoes, which another task will then use. And if a friend showed up with boiled potatoes right before you started the recipe, you wouldn't need to buy or boil potatoes at all.
So rather than viewing a task sequence as a list of instructions, it's better to think of it as a tree of dependencies, in which each child task modifies the context of its parent task, allowing the parent to continue execution. A subtask exists entirely to prepare the state its parent task requires in order to execute. That execution could be carried out manually by a human, or performed automatically. Modern workflow engines like Apache Airflow and Temporal have implicitly discovered this, but their approach focuses on running existing code, not on gathering human inputs.
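To make the dependency-tree view concrete, here is a minimal sketch. The names (`Task`, `runTask`, `Context`) are hypothetical, not from any existing system: each child task runs first and extends the context its parent sees, and a child whose state is already present is skipped entirely.

```haskell
import qualified Data.Map as Map

-- The global state a task tree rearranges.
type Context = Map.Map String String

data Task = Task
  { taskName :: String
  , subtasks :: [Task]              -- dependencies, not a linear script
  , run      :: Context -> Context  -- how this task arranges global state
  }

-- Evaluate children first; each one modifies the context the parent sees.
-- If the state a child would produce is already present, skip the child.
runTask :: Context -> Task -> Context
runTask ctx t = run t (foldl step ctx (subtasks t))
  where
    step c child
      | Map.member (taskName child) c = c   -- a friend brought the potatoes
      | otherwise = Map.insert (taskName child) "done" (runTask c child)

main :: IO ()
main = do
  let boil = Task "boiled potatoes" [] id
      mash = Task "mashed potatoes" [boil] id
      -- Starting state already contains boiled potatoes, so boiling is skipped.
      ctx  = Map.fromList [("boiled potatoes", "from a friend")]
  print (runTask ctx mash)
```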
Moreover, once you have a tree or directed acyclic graph (DAG) representing the data dependencies, you can store both the tasks and their results in the DAG, using data structures defined by data constructors. Now you have the basis for a dataflow language. And by also treating tasks as first-class, to be executed by agents, you get a Hierarchical Task Network (HTN), which can be hybridized with LLMs to combine the dependability of an HTN where it matters with the creativity of an LLM when exploring new domains. Both evaluation semantics and task execution are simply graph rewrites.
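As a rough illustration of "tasks and results in one graph, evaluation as rewriting", here is a toy node type with two data constructors. All names are illustrative, not an actual Fosforescent API: rewriting replaces a task whose dependencies have all become results with a result of its own.

```haskell
-- Tasks and their results live in the same graph.
data Node
  = PendingTask String [Node]   -- a task awaiting its dependencies
  | Result String               -- a value already present in the graph
  deriving Show

-- The rewrite: a task whose dependencies have all been rewritten
-- to Results is itself rewritten into a Result.
rewrite :: Node -> Node
rewrite (PendingTask name deps)
  | all isResult deps' = Result name
  | otherwise          = PendingTask name deps'
  where
    deps' = map rewrite deps
    isResult (Result _) = True
    isResult _          = False
rewrite r = r

main :: IO ()
main = print (rewrite (PendingTask "mash" [PendingTask "boil" [Result "potatoes"]]))
-- => Result "mash", after "boil" is rewritten first
```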
Tasks as functions
If a workflow is like a function, what does it mean to apply it? Applying a function creates an expression, in other words an edge on the node representing the containing scope. Likewise, instantiating a workflow creates a task: mapping the analogy, a task is an expression. Once "evaluated", the task produces a value of the function's return type. That return type would include any requirements for the task to count as complete, such as confirmations and QA checks. The DAG then essentially represents a superposition of call stacks evaluated simultaneously, allowing for parallel and distributed execution.
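Here is one hedged sketch of what such a return type might look like. The types (`Task`, `Completed`, `Check`) are assumptions made for illustration: the workflow's "return value" bundles the produced result with the requirements that must hold before the task counts as complete.

```haskell
data Check = Confirmation String | QACheck String
  deriving Show

-- The "return type" of a workflow: a value plus what must hold
-- before the task counts as complete.
data Completed a = Completed
  { value        :: a
  , requirements :: [Check]
  } deriving Show

-- Applying a workflow to its arguments yields a Task, i.e. an
-- unevaluated expression in the graph.
newtype Task a = Task { evaluate :: IO (Completed a) }

boilPotatoes :: Int -> Task String
boilPotatoes n = Task $ pure Completed
  { value        = show n ++ " boiled potatoes"
  , requirements = [QACheck "fork-tender", Confirmation "cook signs off"]
  }

main :: IO ()
main = evaluate (boilPotatoes 6) >>= print
```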
Another aspect of treating workflows as functions is resource allocation: resources would be treated as arguments to the functions. Using the recipe example from above, the function for mashing boiled potatoes is waiting for those boiled potatoes to exist in the global context. Once the potatoes are available, an expression that is going to use them has a couple of options, as sketched below. (1) It can acquire a lock on the potatoes immediately, which ensures the workflow can run all the way through once initiated. (2) It can wait to acquire the lock until the subtask that needs the potatoes is actually running, which may cause a delay if the potatoes are no longer available.
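A small sketch of the two tactics, using an `MVar` as the "potatoes" resource (the function names are illustrative): tactic (1) grabs the lock as soon as the workflow starts; tactic (2) defers acquisition until the subtask actually runs.

```haskell
import Control.Concurrent.MVar

type Potatoes = MVar String

mash :: String -> String
mash p = "mashed " ++ p

-- (1) Eager: acquire the resource when the workflow is initiated,
-- guaranteeing it can run all the way through.
eagerWorkflow :: Potatoes -> IO String
eagerWorkflow res = do
  potatoes <- takeMVar res          -- lock acquired up front
  -- ... other subtasks could run here, holding the lock the whole time ...
  pure (mash potatoes)

-- (2) Lazy: the lock is only taken when the mashing subtask runs,
-- which may block if the potatoes aren't available yet.
lazyWorkflow :: Potatoes -> IO String
lazyWorkflow res = do
  -- ... other subtasks run first, without holding the lock ...
  potatoes <- takeMVar res          -- lock acquired at point of use
  pure (mash potatoes)

main :: IO ()
main = do
  res <- newMVar "boiled potatoes"
  eagerWorkflow res >>= putStrLn
  putMVar res "boiled potatoes"     -- return the resource to the pool
  lazyWorkflow res >>= putStrLn
```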
Rewrites that carry out evaluation semantics would be defined by the programming language interpreter. Other agents performing other kinds of rewrites would be effectful. For instance, the UI agent that allows users to directly modify the graph executes something akin to an IO effect. Just as a Haskell program sequences its effects inside the IO monad, any user interaction with the graph would be mediated through an IO agent attached to the graph, which provides IO-agent rewrites.
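The distinction can be made explicit in the types. This sketch is only illustrative (the `Graph` type and both rewrite functions are assumptions): interpreter rewrites are pure, while agent rewrites live in IO.

```haskell
data Graph = Graph [String] deriving Show

-- Pure rewrites: defined by the language interpreter.
evalStep :: Graph -> Graph
evalStep (Graph ns) = Graph (filter (/= "redex") ns)

-- Effectful rewrites: every user interaction is mediated by an IO agent,
-- much as a Haskell program sequences its effects inside the IO monad.
uiAgentRewrite :: Graph -> IO Graph
uiAgentRewrite (Graph ns) = do
  putStrLn "user added a node via the UI"
  pure (Graph ("user-node" : ns))

main :: IO ()
main = do
  g <- uiAgentRewrite (Graph ["redex", "root"])
  print (evalStep g)
```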
Advantages versus existing workflow systems
While a workflow system should be able to skip sub-tasks once their required state is achieved, this becomes awkward in traditional programming language runtimes. The process would need a way to reach into the call stack, cancel evaluation, and return up to the containing scope, which might require something like delimited continuations. Alternatively, it could reset and rerun the instructions with their effects suppressed in order to get back to the correct state and continue evaluation, but this is error-prone, and in a language that doesn't track effects it may produce unintended effect behavior unless those effects are explicitly suppressed. A dataflow programming language makes this far less awkward: if the required input isn't in the DAG, simply don't evaluate; if the required state of the containing scope is achieved, simply pass its expressions up to the parent scope.
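Here is a rough sketch of how skipping falls out of dataflow evaluation for free, with illustrative names throughout: a node only fires when all of its inputs are present in the store, and a node whose output is already present never fires at all. No continuations or call-stack surgery are needed.

```haskell
import qualified Data.Map as Map

type Store = Map.Map String String          -- results stored in the DAG

data NodeSpec = NodeSpec
  { output :: String
  , inputs :: [String]
  , fire   :: [String] -> String
  }

-- One pass over the graph: evaluate exactly the nodes that are ready.
step :: [NodeSpec] -> Store -> Store
step nodes store = foldl tryFire store nodes
  where
    tryFire s n
      | Map.member (output n) s = s          -- state already achieved: skip
      | Just vs <- mapM (`Map.lookup` s) (inputs n)
                  = Map.insert (output n) (fire n vs) s
      | otherwise = s                        -- input missing: don't evaluate

main :: IO ()
main = do
  let mashNode = NodeSpec "mash" ["boil"] (\[p] -> "mashed " ++ p)
      -- No "boil" result in the DAG yet: nothing evaluates.
      s0 = step [mashNode] Map.empty
      -- Once "boil" appears in the DAG, mashing fires on the next pass.
      s1 = step [mashNode] (Map.insert "boil" "potatoes" s0)
  print s0
  print s1
```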
In traditional systems, these allocation tactics would be difficult to manage and dispatch. In Fosforescent, allocations are specified as arguments to the workflow, so the different evaluation tactics can be expressed as a basic part of the operational semantics, e.g. passing the allocation as a value versus as a future.
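One way the choice of tactic could surface in a workflow's type, sketched under the assumption that a future is modeled as a deferred IO action (all names here are hypothetical):

```haskell
type Future a = IO a   -- a deferred allocation

-- Allocation passed as a value: acquired before the workflow starts.
mashEager :: String -> IO String
mashEager potatoes = pure ("mashed " ++ potatoes)

-- Allocation passed as a future: acquired only when forced, at point of use.
mashDeferred :: Future String -> IO String
mashDeferred fut = fmap ("mashed " ++) fut

main :: IO ()
main = do
  mashEager "boiled potatoes" >>= putStrLn
  mashDeferred (pure "boiled potatoes") >>= putStrLn
```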
Ultimately, expressing both tasks and functions in a single graph enables treating human tasks and their associated resources as first-class citizens. This should allow a cleaner and more maintainable specification of business logic, as well as easier integration.