Leoflow: a GitOps workflow orchestrator built on Go containers
Leoflow splits its architecture into three Go binaries — leoflow, leoflow-server, and leoflow-agent — each with its own main.go under cmd/. That separation matters. The scheduler, the API surface, and the task executor are independent processes that can run in different containers, on different machines, communicating over RPC. If you’ve ever tried to decompose a monolithic orchestrator after the fact, you know how painful that is. Leoflow starts decomposed.
Here’s how the project uses Go’s strengths to build a container-native, GitOps-first workflow engine that stays compatible with the Airflow UI.
Three binaries, three responsibilities
The project’s cmd/ directory lays this out clearly:
- cmd/leoflow/main.go: the CLI and scheduler entry point
- cmd/leoflow-server/main.go: the HTTP API server
- cmd/leoflow-agent/main.go: the remote agent that runs tasks
This is a pattern you see in well-structured Go projects. Each binary imports from internal/, keeping shared logic private to the module. The Go compiler only includes what each binary needs, so leoflow-agent doesn’t carry the weight of the API server’s routing code.
If you want to understand why Go projects split binaries this way, the reasoning mirrors what I covered in context handling across services — each process has its own lifecycle, its own cancellation tree, and its own reason to exist.
The agent: RPC, auth, and container execution
The agent is where tasks actually run. The internal/agent/ package breaks this down into focused files:
- internal/agent/auth.go: handles authentication between the agent and server
- internal/agent/dial.go: manages the connection setup
- internal/agent/exec.go: executes tasks (container commands)
- internal/agent/runner.go: orchestrates the task lifecycle
- internal/agent/command.go: builds the commands to execute
The server-side counterpart lives in internal/agentrpc/server.go, which exposes the RPC endpoint that agents connect to.
This split between internal/agent/ (client-side) and internal/agentrpc/ (server-side) is a clean Go pattern. The agent dials into the server, authenticates, and then receives work. dial.go handles connection setup; auth.go manages token or credential exchange. Networking stays isolated from execution.
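To make the dial-then-authenticate flow concrete, here is a minimal sketch using the standard library’s net/rpc package. The source doesn’t say which RPC transport Leoflow uses, so net/rpc is a stand-in, and AgentService, RegisterArgs, RegisterReply, serve, and register are all hypothetical names invented for this example.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"net"
	"net/rpc"
)

// RegisterArgs and RegisterReply are hypothetical message types.
type RegisterArgs struct {
	AgentID string
	Token   string
}

type RegisterReply struct {
	Accepted bool
}

// AgentService is the server-side receiver; net/rpc exposes its exported
// methods under the name "AgentService".
type AgentService struct {
	token string
}

// Register accepts an agent only if it presents the expected token.
// The comparison is constant-time to avoid leaking the token via timing.
func (s *AgentService) Register(args RegisterArgs, reply *RegisterReply) error {
	if subtle.ConstantTimeCompare([]byte(args.Token), []byte(s.token)) != 1 {
		return errors.New("invalid token")
	}
	reply.Accepted = true
	return nil
}

// serve starts an RPC listener on a free local port and returns its
// address plus a function that stops accepting new connections.
func serve(token string) (string, func(), error) {
	srv := rpc.NewServer()
	if err := srv.Register(&AgentService{token: token}); err != nil {
		return "", nil, err
	}
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go srv.Accept(ln)
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// register is the agent side: dial, present credentials, read the verdict.
func register(addr, agentID, token string) (bool, error) {
	client, err := rpc.Dial("tcp", addr)
	if err != nil {
		return false, err
	}
	defer client.Close()

	var reply RegisterReply
	err = client.Call("AgentService.Register",
		RegisterArgs{AgentID: agentID, Token: token}, &reply)
	return reply.Accepted, err
}

func main() {
	addr, stop, err := serve("s3cret")
	if err != nil {
		panic(err)
	}
	defer stop()

	ok, err := register(addr, "agent-1", "s3cret")
	fmt.Println("good token:", ok, err)
	ok, err = register(addr, "agent-2", "wrong")
	fmt.Println("bad token:", ok, err)
}
```

The point of the sketch is the shape, not the transport: connection setup lives in one function, the credential check in another, and neither knows anything about running tasks.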
The exec.go and runner.go files handle the actual task execution path. In a container-native orchestrator, “running a task” means starting user code, capturing stdout/stderr, respecting cancellation, and reporting the exit code back. Go’s os/exec package and its tight integration with process management make that work readable instead of mystical.
The API server: Airflow UI compatibility
Leoflow’s API server lives in internal/api/ and it’s packed with files. The naming tells you a lot about what’s going on:
- internal/api/server.go: server setup and routing
- internal/api/middleware.go: request middleware (auth, logging, etc.)
- internal/api/ui_dags.go: serves DAG data to the Airflow UI
- internal/api/ui_dashboard.go: dashboard endpoints
- internal/api/ui_details.go: task detail views
- internal/api/ui_dagversions.go: DAG version history
- internal/api/ui_executor.go: executor status for the UI
- internal/api/ui_connections.go: connection management UI endpoints
- internal/api/ui_audit.go: audit log endpoints
The ui_* files are the interesting ones. Airflow’s web UI expects specific API shapes — endpoints for DAGs, task instances, DAG runs, connections, and so on. Leoflow implements these same endpoints so the existing Airflow UI can talk to a Leoflow backend without modification.
I think this is the smartest design decision in the project. Instead of building a new UI from scratch (a massive time sink that rarely pays off), they implement a compatible API surface. In Go, this means writing HTTP handlers that return JSON matching Airflow’s expected schema. The internal/api/dto.go file contains the data transfer objects — Go structs with JSON tags that match the Airflow API contract.
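A DTO in this style is just a struct whose JSON tags pin down the wire format. The sketch below is illustrative only: dagDTO, encodeDAG, and the field set are assumptions made for this example, not copied from Leoflow’s dto.go or from the Airflow API specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dagDTO is a hypothetical response shape. The JSON tags, not the Go
// field names, decide what the UI sees on the wire.
type dagDTO struct {
	DagID    string   `json:"dag_id"`
	IsPaused bool     `json:"is_paused"`
	Tags     []string `json:"tags"`
	Owner    *string  `json:"owner,omitempty"` // left out entirely when nil
}

// encodeDAG marshals the DTO, normalizing a nil slice to an empty one so
// the client receives [] instead of null.
func encodeDAG(d dagDTO) (string, error) {
	if d.Tags == nil {
		d.Tags = []string{}
	}
	b, err := json.Marshal(d)
	return string(b), err
}

func main() {
	out, err := encodeDAG(dagDTO{DagID: "etl"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Details like null versus [] are exactly where a compatible reimplementation tends to trip: a UI written against one backend often assumes a field is always an array.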
There’s also internal/api/api_stubs.go, which provides placeholder responses for Airflow API endpoints that Leoflow doesn’t fully implement yet. Pragmatic: return valid responses so the UI doesn’t break, then fill in real implementations over time.
The internal/api/problem.go file suggests the API uses RFC 7807 problem details (since superseded by RFC 9457) for error responses. Standardized error formats make debugging much easier when you’re running distributed services, and I wish more projects did this from the start rather than bolting it on later.
Observability and health
Two files stand out for operational concerns: internal/api/health.go and internal/api/observe.go.
Health endpoints are table stakes for anything running in Kubernetes. They let the kubelet know whether the server is alive and whether it is ready to accept traffic. In Go, this is typically a simple HTTP handler that returns 200 if the service is healthy. If you’re deploying Leoflow in Kubernetes, these endpoints feed into liveness and readiness probes.
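The liveness/readiness distinction is easy to show in code. In this sketch the /livez and /readyz paths, the health type, and the probe helper are assumptions for illustration; the source doesn’t say which paths Leoflow’s health.go exposes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health tracks whether startup work has finished.
type health struct {
	ready atomic.Bool
}

// livez answers the liveness probe: the process is up and serving HTTP.
func (h *health) livez(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyz answers the readiness probe: 503 until the server can take traffic.
func (h *health) readyz(w http.ResponseWriter, _ *http.Request) {
	if !h.ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// newMux wires both probes; the paths are assumptions for this sketch.
func newMux(h *health) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/livez", h.livez)
	mux.HandleFunc("/readyz", h.readyz)
	return mux
}

// probe issues an in-process GET and returns the status code.
func probe(handler http.Handler, path string) int {
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	h := &health{}
	mux := newMux(h)
	fmt.Println(probe(mux, "/livez"), probe(mux, "/readyz"))

	h.ready.Store(true) // e.g. once dependencies are reachable
	fmt.Println(probe(mux, "/livez"), probe(mux, "/readyz"))
}
```

Keeping the two probes separate matters: a failing liveness probe gets the container restarted, while a failing readiness probe only takes it out of rotation until it recovers.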
observe.go wires the API’s observability endpoints. The repo also carries ADRs for structured logs, Prometheus metrics, and OpenTelemetry traces. For a workflow orchestrator, you want to track task duration, queue depth, and failure rates. Without those, you’re flying blind when something stalls at 3 AM.
The internal/api/logs.go file handles log streaming. When a task runs inside a container on a remote agent, the logs need to flow back to the server and then to the UI. Go’s io.Reader and io.Writer interfaces make piping logs between services clean — you can stream without buffering the entire output in memory, which matters when tasks produce large outputs.
The monitor pattern
internal/api/monitor.go suggests a background goroutine that watches for state changes. In workflow orchestrators, you need something that periodically asks: has a task timed out? Has an agent gone offline? Should the next task in a DAG be triggered?
In Go, this is typically a goroutine running a select loop over a time.Ticker channel and the context’s Done() channel. The monitor watches for changes and triggers actions, the same concurrency pattern you’d use in any long-running Go service. For configuring components like monitors with sensible defaults, check out how functional options help.
GitOps-first: DAG versioning
The internal/api/ui_dagversions.go file reveals something about the GitOps model. In a GitOps workflow, DAG definitions live in a Git repository. When you push a change, the orchestrator picks up the new version. Leoflow tracks these versions so you can see which version of a DAG ran at what time and roll back if needed.
For Go developers, this means the server needs to watch a Git repository (or receive webhooks), parse DAG definitions, and diff them against the current state. Go’s standard library has no built-in Git support, so projects typically use libraries like go-git or shell out to the git binary.
Why Go for workflow orchestration
Go is a natural fit here, and it’s not hard to see why:
- Static binaries: Each of the three binaries compiles to a single file. No runtime dependencies. Drop it in a container image based on scratch or distroless and you get tiny images.
- Goroutines for concurrency: A workflow orchestrator manages many tasks simultaneously. Go’s goroutine model means you can run thousands of concurrent operations without thread pool tuning.
- Strong standard library: HTTP servers, JSON marshaling, process execution, and context propagation are all in the standard library. Leoflow doesn’t need heavy frameworks for its core functionality.
- Fast compilation: GitOps means frequent rebuilds. Go compiles fast, so CI pipelines stay short.
The clean separation between internal/agent/, internal/agentrpc/, and internal/api/ is worth studying if you’re building services that communicate over RPC and manage container lifecycles. It shows how to structure a Go project with multiple deployment targets without letting the packages bleed into each other.
Where this gets interesting next
Leoflow takes an opinionated approach: GitOps for DAG definitions, containers for task execution, and Airflow UI compatibility so you don’t have to retrain your team. The Go implementation is split across three binaries with clean internal package boundaries. The agent handles remote execution, the server provides both an API and Airflow-compatible UI endpoints, and the CLI ties it together.
What I’m most curious about is how well the Airflow UI compatibility holds up as both projects evolve. That’s the risk with API-compatible reimplementations — you’re always chasing a moving target. But for teams that already have Airflow muscle memory and want something lighter underneath, this is a bet worth watching. The repo has the details.