Active RAG & Strict Tool Calling With Real-time APIs

In Part 3: Qdrant Hybrid Search - Solving Semantic and Hard Filters, we successfully built a powerful Hybrid search engine combining Dense Semantic and Sparse Lexical Search. However, a practical e-commerce search system goes far beyond merely retrieving static documents from a vector database. For example, a user asks: “I want to buy a 400L Samsung Inverter refrigerator available at the District 1 branch that has an active promotion.” If we rely solely on a Vector Database, we face two critical errors: ...

May 22, 2026 · 8 min · Vesviet Team

Tech Radar 22/06: Dapr v1.18, Kratos Clean Architecture & Master-Level Workflows

Welcome to this week’s Tech Radar. In our previous issue, we explored Kratos Clean Architecture & Dapr Pub/Sub. Today, we tackle the most complex domain of distributed systems: Stateful Orchestration. We will dissect how to implement Dapr Workflows and the Actor model within Kratos. Before we dive into the code, let’s look at the breaking news from the past 72 hours. 1. Tech News Radar: Dapr v1.18 & KubeCon India 2026 Answer-first: The past 72 hours brought massive shifts. Dapr v1.18 dropped with WorkflowAccessPolicy for hard-gated workflow security, OpenTelemetry officially graduated from CNCF at KubeCon India, and Go 1.26.4 shipped. Meanwhile, Kubernetes 1.33 reaches End-of-Life on June 28. ...

June 22, 2026 · 6 min · Lê Tuấn Anh

Tech Radar 17/06: Kratos Clean Architecture & Dapr Pub/Sub

Welcome back to the Tech Radar bulletin. Last week we dissected how Kratos and Dapr v1.15 solve State Collisions via ETags. This week we go one layer deeper: how do you structure the entire codebase so that Kratos, Wire, and Dapr Pub/Sub compose cleanly — and how do you keep that architecture testable, resilient, and production-safe? 1. The Four Layers of Kratos Clean Architecture Answer-first: Kratos enforces a four-layer Clean Architecture — api, service, biz, and data — where business logic in biz is completely isolated from transport and infrastructure. Each layer communicates only with the layer adjacent to it, and only through interfaces. ...

June 17, 2026 · 6 min · Lê Tuấn Anh

Zero DevOps E-commerce with Cloudflare Workers & Turborepo

Tired of maintaining expensive Kubernetes clusters, fine-tuning Auto-scaling groups on AWS, or wiring together complex CI/CD pipelines just to keep an e-commerce store alive? Welcome to the Zero DevOps era. In this post, we dissect Aura Store — a production-grade Cloudflare Workers E-commerce platform built entirely on Edge infrastructure, powered by a Turborepo Monorepo. Everything you see below is drawn directly from the running codebase. 1. Turborepo Monorepo Architecture in Practice Answer-first: Turborepo splits the e-commerce system into four independent apps — storefront-ui, admin-ui, public-api, and admin-api — and two shared packages: database and contract. This separation maximises build speed via Turborepo’s task graph cache and enforces a hard security boundary between the public-facing layer and internal tooling. ...

June 17, 2026 · 6 min · Lê Tuấn Anh

Go Microservices Architecture: Production Guide

Go microservices from domain design to Kubernetes deployment — gRPC, Dapr, OpenTelemetry, and GitOps patterns from a real 21-service production migration.

June 12, 2026 · 23 min · Lê Tuấn Anh

Golang gRPC Microservices: Production Guide with Protobuf, TLS & Middleware

Why gRPC for Go Microservices? Answer-first: gRPC is the right choice for Go microservices when you need: binary-efficient serialization (Protobuf is 3–10× smaller than JSON), bidirectional streaming for real-time data, strongly-typed contracts across services, and sub-millisecond inter-service latency. Google, Uber, Netflix, and Square use gRPC as the primary inter-service communication protocol. This guide shows you how to build production-grade Go gRPC services from scratch. The key advantages over REST: gRPC REST/JSON Serialization Protobuf (binary, schema-enforced) JSON (text, schema-optional) Payload size 3–10× smaller Baseline Streaming Unary, Client, Server, Bidirectional HTTP/2 SSE (server-only), WebSocket (separate) Contract .proto file (language-agnostic codegen) OpenAPI (opt-in, often stale) Latency ~0.5ms p50 inter-service ~2–5ms p50 inter-service Browser support gRPC-Web (needs proxy) Native Best for Internal microservices, streaming Public APIs, browser clients Step 1: Define Your Service with Protobuf Create the contract first — Protobuf schema drives code generation for all languages. ...

June 11, 2026 · 12 min · Lê Tuấn Anh

Composable Banking Architecture: From Monolith to Modular Core

Answer-first: How banks replace monolithic cores (Temenos, Finacle) with composable banking using Go microservices, Saga orchestration, NewSQL ledgers, and Strangler Fig. Legacy core banking systems were designed in a different era. Temenos T24, Finacle, and Flexcube shared one defining assumption: the bank’s entire product catalogue — deposits, lending, payments, trade finance — would live inside a single, tightly coupled application and a single, shared database. That assumption held when banking moved at human speed. It breaks completely when product releases need to go from months to days, when a single fraud engine update must not risk a payments outage, and when engineers on a COBOL codebase are retiring faster than they can be replaced. ...

June 10, 2026 · 19 min · Lê Tuấn Anh

Chapter 4: Solving the Dual-Write Problem with Transactional Outbox Pattern

← Previous | Series hub | Next → Chapter 4: Eliminating the Dual-Write Nightmare When your Golang application migrates from a Monolith to Event-Driven Microservices, you will immediately face an architectural nightmare: the Dual-Write Problem. 1. What is the Dual-Write Problem? Answer-first: Dual-Write occurs when an app attempts to write to a Database and publish to a Message Broker (Kafka) simultaneously. Without a distributed transaction, network failures will cause the two systems to fall out of sync. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

The Reality of C10M: Surviving Extreme Traffic — Exec Summary

Despite the massive advancements in cloud computing, enterprise applications facing explosive traffic growth inevitably hit a brutal wall: the Database and the Network layer. The root cause lies not in the hardware, but in the Architecture. We attempt to solve the “Millions of Requests per Second” (C10M) problem by simply throwing more servers at it (Vertical/Horizontal Scaling), only to realize that stateful bottlenecks, cache stampedes, and dual-write inconsistencies bring the entire cluster to its knees. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

Go Microservices Distributed Tracing Architecture (2026)

Monitoring complex Go microservices requires more than isolated logs. When a request traverses HTTP APIs, Kafka event streams, and asynchronous worker pools, you need absolute visibility to pinpoint latency bottlenecks and failures. By 2026, OpenTelemetry (OTel) has cemented itself as the vendor-neutral standard for telemetry. This guide explores the architecture of distributed tracing in Go, from SDK context propagation to advanced Collector Gateway configurations. The 2026 Paradigm: OpenTelemetry Pipeline Answer-first: Modern Go observability relies on a decoupled OpenTelemetry pipeline. Go SDKs generate OTLP data, local DaemonSet Agents handle low-latency batching, and centralized Gateways perform tail-based sampling and PII redaction before routing to backends like Tempo or Mimir. ...

June 8, 2026 · 5 min · Lê Tuấn Anh