Context Engineering for AI Coding: AGENTS.md, Cursor Rules & RAG

In 2025, METR, an AI safety and capability research organization, ran a rigorous randomized controlled trial. Sixteen experienced open-source developers worked on 246 real-world tasks, with each task randomly assigned to either allow AI coding tools or forbid them. The result was counterintuitive: developers using AI tools were 19% slower on complex tasks. Before the study, those same developers predicted AI would make them 24% faster. After completing the experiment, they still believed they had gone faster; their subjective confidence remained unshaken. ...

May 31, 2026 · 13 min · Lê Tuấn Anh

Data Ingestion & Atomic Chunking Product Data

In Part 1: The Paradigm Shift - Agentic Architecture & Golang Orchestration Power, we established the Orchestration Engine using Golang and Eino. However, no matter how smart a brain is, it is useless when fed misleading, unstructured, or fragmented information. In the e-commerce domain, product catalog data changes constantly: prices fluctuate, inventory is updated, new products are added. And chunking product data for a vector database (Qdrant) is entirely different from chunking a PDF document or a news article. ...

May 22, 2026 · 8 min · Vesviet Team

Part 2 — State, Memory & Context Management

Prerequisite: To firmly grasp the foundational concepts of Memory Architecture in AI systems, please review Comprehensive AI-Native System Architecture. After solving the Agent communication challenge in Part 1, we must face the LLM’s greatest enemy: Context Window limits. Even the best Orchestrator is useless if Worker Agents forget the User’s initial request after just a few tool-calling turns. 2.1. The Context Window Problem and Why Agents “Forget”: Large Language Models (LLMs) are inherently stateless. Every time you send a prompt, the LLM rereads the entire text from beginning to end. ...

May 17, 2026 · 5 min · Lê Tuấn Anh

Part 6 — From Prompting to Context Engineering

The Biggest Shift in 2026: Context Over Phrasing. If you have been writing prompts by carefully choosing words and hoping the model “gets it,” you are operating on a 2024 mental model. In 2026, the industry consensus is clear: the quality of the context you assemble matters far more than the phrasing of your instructions. This shift has a name: Context Engineering. What Is Context Engineering? Context Engineering is the discipline of designing systems that assemble the right information into the model’s context window at the right time. ...

May 9, 2026 · 4 min · Lê Tuấn Anh

Fine-Tune vs Prompt-Engineer an LLM: Decision Guide

Answer-first: A clear decision framework for AI engineers: when to fine-tune (LoRA/QLoRA), when to prompt-engineer, and when RAG is the right answer instead. Three engineers on the same team are trying to build the same thing: a customer support assistant that answers questions in the company’s specific support style, using terminology from their product documentation. One engineer says “just write a better system prompt.” Another says “we need to fine-tune a model.” The third says “this is clearly a RAG problem.” ...

June 1, 2026 · 12 min · Lê Tuấn Anh

GraphRAG vs Naive RAG: Enterprise Architecture Guide

Answer-first: Compare Naive RAG with GraphRAG for enterprise AI pipelines: knowledge graphs, LlamaIndex, chunking, streaming CDC, and security controls for dynamic data. Most RAG (Retrieval-Augmented Generation) implementations look the same: chunk documents, embed them into vectors, store them in a vector database, retrieve by cosine similarity, and inject the top-K chunks into the LLM context. This works for simple document Q&A. It fails systematically for enterprise knowledge bases where the answer to a question depends not on a single document chunk, but on the relationships between dozens of interconnected entities. ...

June 1, 2026 · 12 min · Lê Tuấn Anh

Tech Radar, May 1, 2026: DigitalOcean's AI-Native Cloud - Inference Routing, Managed Retrieval, and an Integrated Stack for Agentic Systems

DigitalOcean’s April 28, 2026 launch of its AI-Native Cloud is not the largest AI infrastructure announcement of the week, but it may be one of the clearest. Instead of treating AI as a feature added onto a legacy cloud, DigitalOcean is explicitly reorganizing its platform around what production AI systems now look like: multi-model inference, retrieval, routing, state, and long-running agent workflows. That framing matters because it captures a broader industry shift. Teams are moving away from the old pattern of “call one model and return one answer” toward systems that route prompts, retrieve private context, execute tools, and optimize cost across repeated loops. In that world, the hard problem is no longer just model access. It is operating the surrounding system cleanly. ...

May 1, 2026 · 7 min · Lê Tuấn Anh