Active RAG & Strict Tool Calling With Real-time APIs

In Part 3: Qdrant Hybrid Search - Solving Semantic and Hard Filters, we built a hybrid search engine combining dense semantic search with sparse lexical search. However, a practical e-commerce search system must do far more than retrieve static documents from a vector database. For example, a user asks: “I want to buy a 400L Samsung Inverter refrigerator available at the District 1 branch that has an active promotion.” If we rely solely on a vector database, we face two critical errors: ...

May 22, 2026 · 8 min · Vesviet Team

Part 6: The Rise of AI Agents - From Reading to Autonomy

1. The Decline of Static RAG. In the previous 5 parts, we built a perfect RAG machine: real-time data (CDC), absolute security, and strict authorization. But no matter how perfect, traditional RAG suffers from a fatal flaw: it only knows how to “Read” and “Speak”, not how to “Do”. If you ask a RAG system, “Check if the server is overloaded, and if so, automatically boot up 2 more servers”, it will be completely powerless. RAG is a static pipeline running on a one-way street. ...

May 17, 2026 · 4 min · Lê Tuấn Anh

Generative UI with MCP: Architecting AI-Native Frontends

Answer-first: Architecting dynamic generative UI applications with Model Context Protocol (MCP): dynamic registries, client-agent state synchronization, security, and a11y. The first generation of AI-powered chat interfaces followed a simple pattern: the user types a message, the LLM generates text, the UI renders text. The second generation added tool calls — the LLM could invoke functions and render the results as text. The third generation — Generative UI — goes further: the LLM generates not just text responses but interactive UI components that are rendered directly in the browser, enabling experiences that feel less like chatting with a text box and more like using a responsive, intelligent application. ...

June 1, 2026 · 13 min · Lê Tuấn Anh