Data Ingestion & Atomic Chunking Product Data

In Part 1: The Paradigm Shift - Agentic Architecture & Golang Orchestration Power, we established the Orchestration Engine using Golang and Eino. However, no matter how smart a brain is, it is useless when fed misleading, unstructured, or fragmented information. In the e-commerce domain, product catalog data changes continuously: prices fluctuate, inventory is updated, new products are added. On top of that, chunking product data for a Vector Database (Qdrant) is entirely different from chunking a PDF document or a news article. ...

May 22, 2026 · 8 min · Vesviet Team

Part 4: Streaming CDC & Federated RAG - Real-Time Knowledge

1. “Yesterday’s Data” is a Disaster. Suppose a customer asks a banking chatbot about savings interest rates, and the chatbot answers from a PDF policy file that was changed 2 hours ago. What happens? In enterprise environments like Finance, Healthcare, or E-commerce, yesterday’s data is a legal liability. Legacy data pipelines (ETL batch jobs running at midnight) no longer meet the demands of 2026. If the core database changes, your Vector Database must be updated immediately. Data freshness must be measured in seconds. ...

May 17, 2026 · 4 min · Lê Tuấn Anh

Kafka Worker Pool in Go — Backpressure & Exactly-Once

Prerequisite: Part 5 of the System Design Masterclass. Read Part 4: Database Scaling to understand the storage tier that persisted events are written to. Answer-first: Event-Driven Architecture decouples services through asynchronous communication via a durable message log. In Go, goroutines and buffered channels implement natural backpressure — when consumers fall behind producers, the channel fills up and blocks the producer, throttling the ingest rate automatically. Kafka vs RabbitMQ — When to Use Each? Answer-first: Kafka is a distributed commit log — messages are retained indefinitely, consumers manage their own offsets, and replay is possible. RabbitMQ is a message broker — messages are deleted after acknowledgment, the broker handles routing complexity, push-based delivery. They solve different problems. ...

June 18, 2026 · 8 min · Tanh

Composable Banking Architecture: From Monolith to Modular Core

Answer-first: How banks replace monolithic cores (Temenos, Finacle) with composable banking using Go microservices, Saga orchestration, NewSQL ledgers, and Strangler Fig. Legacy core banking systems were designed in a different era. Temenos T24, Finacle, and Flexcube shared one defining assumption: the bank’s entire product catalogue — deposits, lending, payments, trade finance — would live inside a single, tightly coupled application and a single, shared database. That assumption held when banking moved at human speed. It breaks completely when product releases need to go from months to days, when a single fraud engine update must not risk a payments outage, and when engineers on a COBOL codebase are retiring faster than they can be replaced. ...

June 10, 2026 · 19 min · Lê Tuấn Anh

Chapter 4: Solving the Dual-Write Problem with Transactional Outbox Pattern

Chapter 4: Eliminating the Dual-Write Nightmare When your Golang application migrates from a Monolith to Event-Driven Microservices, you will immediately face an architectural nightmare: the Dual-Write Problem. 1. What is the Dual-Write Problem? Answer-first: Dual-Write occurs when an app attempts to write to a Database and publish to a Message Broker (Kafka) simultaneously. Without a distributed transaction, network failures will cause the two systems to fall out of sync. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

Real-Time Inventory Synchronization: Kafka, CDC & Redis for E-commerce

What Is Real-Time Inventory Synchronization? Real-time inventory synchronization is the process of propagating stock count changes from the system of record (database) to all sales channels (web storefront, mobile app, WMS, ERP) with sub-second latency. Instead of batch ETL jobs that run every hour, a CDC + Kafka pipeline streams every committed stock change as an event, eliminating overselling and stale stock displays. Handling this during a flash sale, where thousands of users attempt to purchase a highly contested SKU simultaneously, is one of the hardest architectural challenges. Traditional synchronous database updates collapse under lock contention. ...

June 8, 2026 · 6 min · Lê Tuấn Anh

Go Microservices Distributed Tracing Architecture (2026)

Monitoring complex Go microservices requires more than isolated logs. When a request traverses HTTP APIs, Kafka event streams, and asynchronous worker pools, you need end-to-end visibility to pinpoint latency bottlenecks and failures. By 2026, OpenTelemetry (OTel) has cemented itself as the vendor-neutral standard for telemetry. This guide explores the architecture of distributed tracing in Go, from SDK context propagation to advanced Collector Gateway configurations. The 2026 Paradigm: OpenTelemetry Pipeline Answer-first: Modern Go observability relies on a decoupled OpenTelemetry pipeline. Go SDKs generate OTLP data, local DaemonSet Agents handle low-latency batching, and centralized Gateways perform tail-based sampling and PII redaction before routing to backends like Tempo or Mimir. ...

June 8, 2026 · 5 min · Lê Tuấn Anh

PayPay Architecture: Scaling Payments to 70M Users

Answer-first: An in-depth look at PayPay’s engineering stack: handling 70M users and 7.8B transactions/year using TiDB, Kafka event sourcing, GitOps, and chaos engineering. PayPay launched in October 2018 and grew to 10 million users in just 3 months — a growth rate that no Japanese fintech had ever seen. By 2025, the platform had crossed 70 million registered users and processed 7.8 billion payments per year. Behind this growth is an engineering team that has had to scale not just their infrastructure, but their entire engineering culture: from service standardization and GitOps-driven deployments to chaos engineering and AI-powered fraud detection. ...

June 1, 2026 · 12 min · Lê Tuấn Anh

Real-Time Ride-Hailing Architecture: Uber & Grab Stack

Answer-first: How Uber and Grab handle millions of GPS pings/sec: H3 geospatial indexing, Kafka, DISCO matching engine, surge pricing, and RAMEN push notifications. The moment you open the Uber or Grab app, a cascade of real-time systems activates simultaneously: your phone begins transmitting GPS coordinates, a geospatial index updates your location, a matching engine re-evaluates nearby driver availability, a pricing model recalculates the fare based on supply-demand ratios, and a push notification pipeline prepares to deliver your match confirmation in under 3 seconds. ...

June 1, 2026 · 13 min · Lê Tuấn Anh

Mastering Event-Driven Architecture with Dapr Pub/Sub

Answer-first: Decouple a 21+ microservice ecosystem using Event-Driven Architecture. Ensure data consistency via Sagas, Dead Letter Queues, and Idempotent handlers. In my previous post, we explored how abandoning monolithic architecture in favor of strict Domain-Driven Design (DDD) bounded contexts allowed an e-commerce platform to scale beyond 10,000 orders per day. However, splitting one big database into 20+ isolated Postgres databases introduces a terrifying new problem: How do we maintain data consistency across disconnected services? ...

April 12, 2026 · 15 min · Lê Tuấn Anh