Welcome to this week’s Tech Radar. In our previous issue, we explored Kratos Clean Architecture & Dapr Pub/Sub. Today, we tackle the most complex domain of distributed systems: Stateful Orchestration. We will dissect how to implement Dapr Workflows and the Actor model within Kratos.

Before we dive into the code, let’s look at the breaking news from the past 72 hours.


1. Tech News Radar: Dapr v1.18 & KubeCon India 2026

Answer-first: The past 72 hours brought massive shifts. Dapr v1.18 dropped with WorkflowAccessPolicy for hard-gated workflow security, OpenTelemetry officially graduated from CNCF at KubeCon India, and Go 1.26.4 shipped. Meanwhile, Kubernetes 1.33 reaches End-of-Life on June 28.

Dapr v1.18: The Security Milestone

Released mid-June 2026, Dapr 1.18 closes a major workflow security gap. Previously, any caller in the same trust domain could schedule or terminate a workflow. The new WorkflowAccessPolicy Custom Resource Definition (CRD) lets you explicitly allowlist which app-ids can trigger your Kratos workflow APIs.

CNCF & Go Updates

  • KubeCon India 2026: AI-Native Scheduling dominated the conversations. More importantly for enterprise developers, OpenTelemetry officially graduated, cementing it as the undisputed standard for tracing (which pairs natively with our Kratos integration below).
  • Go 1.26.4: The latest stable patch is out. Teams using the new “Green Tea” Garbage Collector should patch immediately.
  • K8s 1.33 EOL: If your clusters are still on Kubernetes 1.33, you have until June 28, 2026, to upgrade.

2. Dapr Workflows vs. Choreography

Answer-first: Dapr Workflows provide centralized, stateful orchestration built on the durabletask-go engine, automatically persisting state at every step. This replaces fragile event-driven choreography with a single, readable Go function that survives sidecar crashes and network partitions.

The Problem with Event Choreography

When implementing a multi-step process (e.g., Order -> Payment -> Inventory) using Pub/Sub choreography, logic is scattered across multiple services. Error handling becomes a nightmare of compensating events and dead-letter queues.

The Workflow Approach

Dapr Workflows centralize this logic into a “Workflow Orchestrator” function and separate “Activity” functions. The engine replays the orchestrator function from its recorded history to recover state, so orchestrators must be fully deterministic: no network calls, random numbers, wall-clock reads, or database writes. All side effects must happen inside Activities, whose results are recorded once and reused on replay.
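
To make the replay rule concrete, here is a toy sketch in plain Go. It is not the Dapr engine: the `history` type, its `call` method, and `orchestrate` are hypothetical names invented for this illustration. It only shows why an orchestrator that issues the same calls in the same order can be re-run without repeating side effects.

```go
package main

import "fmt"

// history is a toy stand-in for the engine's persisted event log.
// It is NOT the Dapr API; it only illustrates why replay needs determinism.
type history struct {
	results []string // recorded activity results, in call order
	next    int      // index of the next call during the current run
	ran     int      // how many activities actually executed
}

// call returns the recorded result if one exists; otherwise it runs the
// activity once, records the result, and returns it.
func (h *history) call(activity func() string) string {
	if h.next < len(h.results) {
		r := h.results[h.next]
		h.next++
		return r
	}
	r := activity()
	h.ran++
	h.results = append(h.results, r)
	h.next++
	return r
}

// orchestrate is deterministic: given the same history it issues the same
// calls in the same order, so each recorded result lines up with its step.
func orchestrate(h *history) string {
	h.next = 0 // every run (or replay) starts from the top of the function
	payment := h.call(func() string { return "payment-1" })
	stock := h.call(func() string { return "stock-1" })
	return payment + "," + stock
}

func main() {
	h := &history{}
	first := orchestrate(h)  // executes both activities
	replay := orchestrate(h) // simulated restart: served from history
	fmt.Println(first, replay, h.ran)
}
```

If the second run called the activities in a different order (for example, because of a random branch), the recorded results would be handed to the wrong steps, which is the failure the determinism rule prevents.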


3. The Saga Pattern & Compensation in Go

Answer-first: To implement a Saga in Dapr Workflows, use standard Go if err != nil blocks to catch Activity failures, then explicitly call compensating Activities in reverse order. Dapr does not automatically roll back your business logic.

When a downstream activity fails, you must undo the successful upstream activities. Here is the exact pattern for a Kratos biz layer orchestrator:

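import (
    "errors"
    "fmt"

    "github.com/dapr/durabletask-go/workflow" // assumed import path; use the workflow package your SDK version provides
)

// OrderSaga reserves payment, then inventory, and releases the payment if inventory fails.
// OrderInput and the three activity functions are assumed to be defined elsewhere in this package.
func OrderSaga(ctx *workflow.WorkflowContext) (any, error) {
    var input OrderInput
    if err := ctx.GetInput(&input); err != nil {
        return nil, err
    }

    // 1. Reserve payment. Nothing has succeeded yet, so there is nothing to compensate.
    var paymentID string
    if err := ctx.CallActivity(ReservePayment, workflow.WithActivityInput(input)).Await(&paymentID); err != nil {
        return nil, fmt.Errorf("payment failed: %w", err)
    }

    // 2. Reserve inventory. On failure, compensate the payment before returning.
    if err := ctx.CallActivity(ReserveInventory, workflow.WithActivityInput(input)).Await(nil); err != nil {
        if cErr := ctx.CallActivity(ReleasePayment, workflow.WithActivityInput(paymentID)).Await(nil); cErr != nil {
            // Report both errors so a failed compensation is never silent.
            return nil, errors.Join(fmt.Errorf("inventory failed: %w", err), fmt.Errorf("release payment failed: %w", cErr))
        }
        return nil, fmt.Errorf("inventory failed: %w", err)
    }

    return "Saga Complete", nil
}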
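
As sagas grow past two steps, hand-written reverse-order compensation gets error-prone. One possible way to structure it, sketched here in plain Go with no Dapr imports, is a list of steps that each carry their own undo. The `step` type, `runSaga`, and `demo` are hypothetical names for this sketch; in a real orchestrator each `do` and `compensate` would wrap a CallActivity so the sequence stays replay-safe.

```go
package main

import (
	"errors"
	"fmt"
)

// step pairs a forward action with the action that undoes it.
type step struct {
	name       string
	do         func() error
	compensate func() error
}

// runSaga executes steps in order. On the first failure it runs the
// compensations of the already-completed steps in reverse order and
// returns the original error joined with any compensation errors.
func runSaga(steps []step) error {
	var done []step
	for _, s := range steps {
		if err := s.do(); err != nil {
			errs := []error{fmt.Errorf("%s failed: %w", s.name, err)}
			for i := len(done) - 1; i >= 0; i-- {
				if cErr := done[i].compensate(); cErr != nil {
					errs = append(errs, fmt.Errorf("compensating %s: %w", done[i].name, cErr))
				}
			}
			return errors.Join(errs...)
		}
		done = append(done, s)
	}
	return nil
}

// demo runs a three-step saga whose last step fails and returns
// the ordered log of actions that actually ran.
func demo() ([]string, error) {
	var log []string
	record := func(msg string) func() error {
		return func() error { log = append(log, msg); return nil }
	}
	steps := []step{
		{"payment", record("reserve payment"), record("release payment")},
		{"inventory", record("reserve inventory"), record("release inventory")},
		{"shipping", func() error { return errors.New("no carrier") }, record("cancel shipping")},
	}
	err := runSaga(steps)
	return log, err
}

func main() {
	log, err := demo()
	fmt.Println(log)
	fmt.Println(err)
}
```

The failed step itself is not compensated, since it never completed; only the steps before it are undone, last one first.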

4. Kratos Clean Architecture Integration

Answer-first: Do not leak the Dapr Go SDK into your Kratos biz layer. The biz layer must only contain pure Go workflow definitions and interfaces. The actual Dapr client.StartWorkflow execution must be implemented in the data layer and injected via Wire.

The Correct Layer Mapping

  • api: Defines Protobufs for triggering the workflow via gRPC/HTTP.
  • service: Maps the incoming request to the biz usecase. Registers Actor HTTP handlers using daprd.NewService().
  • biz: Contains the OrderSaga logic and the WorkflowRunner interface.
  • data: Imports github.com/dapr/go-sdk/client and implements the WorkflowRunner interface.
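
The layer split above can be sketched in plain Go. The `WorkflowRunner` method set is not spelled out in this issue, so the `Start` signature, `OrderUsecase`, `fakeRunner`, and `placeDemo` below are assumptions made for illustration. The point is only that the biz layer compiles and tests without any Dapr import, while the Dapr-backed implementation would live in the data layer and be injected via Wire.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// --- biz layer: no Dapr imports ---

// WorkflowRunner is the port the biz layer depends on.
// The method name and signature are assumptions for this sketch.
type WorkflowRunner interface {
	Start(ctx context.Context, workflowName string, input any) (instanceID string, err error)
}

// OrderUsecase holds business rules and reaches workflows only via the port.
type OrderUsecase struct {
	runner WorkflowRunner
}

// NewOrderUsecase is the constructor a Wire provider set would reference.
func NewOrderUsecase(r WorkflowRunner) *OrderUsecase {
	return &OrderUsecase{runner: r}
}

// PlaceOrder validates the input, then asks the runner to start the saga.
func (uc *OrderUsecase) PlaceOrder(ctx context.Context, orderID string) (string, error) {
	if orderID == "" {
		return "", errors.New("order id is required")
	}
	return uc.runner.Start(ctx, "OrderSaga", orderID)
}

// --- stand-in for the data layer ---

// fakeRunner replaces the Dapr-backed implementation in unit tests.
type fakeRunner struct {
	started []string
}

// Start records the workflow name and returns a predictable instance ID.
func (f *fakeRunner) Start(_ context.Context, name string, input any) (string, error) {
	f.started = append(f.started, name)
	return fmt.Sprintf("%s-%v", name, input), nil
}

// placeDemo wires the usecase to the fake and places one order.
func placeDemo(orderID string) (string, error) {
	uc := NewOrderUsecase(&fakeRunner{})
	return uc.PlaceOrder(context.Background(), orderID)
}

func main() {
	id, err := placeDemo("42")
	fmt.Println(id, err)
}
```

Swapping `fakeRunner` for a type that wraps the Dapr client changes nothing in `OrderUsecase`, which is what keeps the SDK out of the biz layer.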

AI Coverage Gap Warning: AI tools (like ChatGPT) frequently hallucinate a kratos/v2/transport/dapr module. This does not exist. Furthermore, AI will often inject dapr.SetCustomStatus(ctx) into Go code, but the Go SDK lacks native custom status fields (Issue #635). You must use the Dapr State Store directly within an Activity to persist custom progress statuses.


5. Advanced Flow: External Events & Child Workflows

Answer-first: For human-in-the-loop approvals, use ctx.WaitForExternalEvent to safely park the workflow in the State Store with zero memory footprint. For massive Sagas, decompose them using ctx.CallChildWorkflow to maintain readability and independent versioning.

Human Approvals

Instead of complex polling loops, Dapr allows a workflow to sleep until an external event arrives (typically raised through a REST API call) or an optional timeout expires.

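// Parks the workflow until "ManagerApproval" arrives or 48 hours pass.
// The instance is unloaded from memory; its state lives in the configured state store.
var approved bool
if err := ctx.WaitForExternalEvent("ManagerApproval", 48*time.Hour).Await(&approved); err != nil {
    return nil, fmt.Errorf("approval not received: %w", err) // covers the timeout case
}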

To resume this, an external system simply makes an HTTP POST to Dapr’s raiseEvent endpoint targeting this workflow instance.
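
A minimal sketch of that call from Go, using only the standard library. The sidecar address `http://localhost:3500`, the `dapr` workflow component segment, and the `/v1.0/workflows/.../raiseEvent/...` path shape are assumptions here; verify them against the workflow API reference for your Dapr version. The helper names and the `order-42` instance ID are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// raiseEventURL builds the sidecar URL for raising an event on one workflow
// instance. The path shape is an assumption to check against your Dapr version.
func raiseEventURL(sidecar, instanceID, eventName string) string {
	return fmt.Sprintf("%s/v1.0/workflows/dapr/%s/raiseEvent/%s",
		sidecar, url.PathEscape(instanceID), url.PathEscape(eventName))
}

// newRaiseEventRequest prepares the POST. The JSON body is the event payload
// that the waiting workflow decodes into its Await target.
func newRaiseEventRequest(sidecar, instanceID, eventName string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost,
		raiseEventURL(sidecar, instanceID, eventName), bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// "true" is the JSON payload for the bool the workflow is waiting on.
	req, err := newRaiseEventRequest("http://localhost:3500", "order-42", "ManagerApproval", []byte("true"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// Sending is left to the caller, for example http.DefaultClient.Do(req).
}
```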


6. Actor Concurrency, Reentrancy & Scaling

Answer-first: Dapr Actors are strictly single-threaded (turn-based access), eliminating the need for sync.Mutex in your Go code. However, this causes deadlocks if Actor A calls Actor B, which calls back to Actor A. To fix this, you must explicitly enable Reentrancy.

Enabling Reentrancy in Go

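Unlike other SDKs, the Go SDK requires you to expose a GET /dapr/config HTTP endpoint from your Kratos service that returns the actor runtime configuration as JSON, including an ActorReentrancyConfig object. Reentrancy is switched on in that response with reentrancy: { enabled: true } (optionally with a maxStackDepth limit); it belongs to the actor runtime configuration, not to a Dapr Component YAML.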
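
A minimal sketch of such an endpoint with the standard library. The Go type names, the `OrderActor` actor type, and the stack depth of 32 are illustrative assumptions; the JSON field names (`entities`, `reentrancy`, `enabled`, `maxStackDepth`) follow the actor runtime configuration shape as I understand it, so check them against the Dapr docs for your version.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// reentrancyConfig and actorConfig model the JSON returned by GET /dapr/config.
// The Go type names are this sketch's own, not SDK types.
type reentrancyConfig struct {
	Enabled       bool `json:"enabled"`
	MaxStackDepth int  `json:"maxStackDepth,omitempty"`
}

type actorConfig struct {
	Entities   []string         `json:"entities"`
	Reentrancy reentrancyConfig `json:"reentrancy"`
}

// configJSON renders the response body for the given actor types.
func configJSON(actorTypes []string, maxDepth int) ([]byte, error) {
	return json.Marshal(actorConfig{
		Entities:   actorTypes,
		Reentrancy: reentrancyConfig{Enabled: true, MaxStackDepth: maxDepth},
	})
}

// configHandler serves the configuration; mount it on the HTTP server
// your service already exposes to the sidecar.
func configHandler(actorTypes []string, maxDepth int) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := configJSON(actorTypes, maxDepth)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	// Register on the default mux; starting the listener is left to the service.
	http.Handle("/dapr/config", configHandler([]string{"OrderActor"}, 32))

	body, err := configJSON([]string{"OrderActor"}, 32)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The stack depth cap matters because reentrancy allows a call chain to loop back into the same actor; a limit keeps a buggy cycle from recursing without bound.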

Production Scaling

In Kubernetes, Dapr uses the Placement Service to hash and distribute Workflow and Actor instances across your application pods uniformly.

  • Crucial Rule: You must deploy the Dapr Placement and Scheduler services in High Availability (HA) mode (dapr_placement.ha=true).
  • State Store: Never use SQLite for distributed workflows in production; its file-locking mechanism will bottleneck. Use a networked state store that supports Dapr actors (transactions and ETags), such as Redis or PostgreSQL.

7. Q&A: Production Gotchas

How do I unit test Dapr Workflows in Go?

Do not attempt to mock the Dapr sidecar. Because Activities and Workflows are written as pure Go functions in the biz layer, you should write standard Go Unit Tests for them using the durabletask-go test framework. Use dapr run locally for full integration testing.

How does OpenTelemetry tracing work between Kratos and Dapr Actors?

Through standard trace-context propagation. Dapr uses the W3C traceparent header. Ensure your Kratos app uses the tracing.Server() middleware. Kratos extracts the trace context, and when you pass that context.Context to the Dapr SDK, the sidecar propagates the trace across workflow activities and child actors.

Can I update my Workflow code after instances have already started?

Be extremely careful. Because the orchestrator replays history, altering the sequence of CallActivity in a deployed update will crash in-flight workflows due to non-deterministic history. You must use the IsPatched SDK feature for minor changes, or use semantic naming (e.g., OrderSagaV2) for breaking changes.

What if I need Optimistic Concurrency Control outside the Actor lock?

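Use ETags. When you read state via the Dapr client, it returns an ETag. Pass that ETag back when saving (SaveStateWithETag in the Go SDK). If another process modified the state in the meantime, Dapr rejects the write (409 Conflict over the HTTP API, an error from the Go client), allowing your Go code to re-read and retry.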
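
The read, modify, save-with-ETag, retry loop can be sketched against an in-memory stand-in. `memStore`, `errETagMismatch`, `increment`, and `runConcurrent` are hypothetical names; this is not the Dapr client, only a model of the compare-and-set contract an ETag gives you.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"sync"
)

// errETagMismatch plays the role of the conflict a state store reports.
var errETagMismatch = errors.New("etag mismatch")

// memStore is an in-memory stand-in for a state store with ETag support.
type memStore struct {
	mu      sync.Mutex
	value   int
	version int
}

// get returns the value together with its current ETag.
func (s *memStore) get() (int, string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.value, strconv.Itoa(s.version)
}

// save writes only if the caller's ETag still matches the stored version.
func (s *memStore) save(v int, etag string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if etag != strconv.Itoa(s.version) {
		return errETagMismatch
	}
	s.value = v
	s.version++
	return nil
}

// increment repeats the read-modify-write cycle until the save wins
// or maxAttempts is exhausted.
func increment(s *memStore, maxAttempts int) error {
	for i := 0; i < maxAttempts; i++ {
		v, etag := s.get()
		err := s.save(v+1, etag)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errETagMismatch) {
			return err // only conflicts are worth retrying
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, errETagMismatch)
}

// runConcurrent fires n concurrent increments and returns the final value.
func runConcurrent(n int) int {
	s := &memStore{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = increment(s, 1000)
		}()
	}
	wg.Wait()
	v, _ := s.get()
	return v
}

func main() {
	fmt.Println(runConcurrent(50))
}
```

Every rejected save means some other writer succeeded in between, so no update is lost; in production code a bounded retry with backoff is usually preferable to a tight loop.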

Continue the series with our deep dives on Microservices with Dapr and the full System Design Series.

📬 Get Tech Radar every week: no spam, only signal. Subscribe here.