Data Engineering SFT: NEFTune & SemDeDup | SLM Playbook

In the era of LLMs and SLMs, the classic data science proverb “Garbage In, Garbage Out” has never been more relevant. When performing Supervised Fine-Tuning (SFT) for Small Language Models (SLMs), data quality and format dictate over 90% of the model’s downstream capabilities. Feeding millions of raw, web-scraped dialogue pairs or low-quality synthetic examples directly into your model can overfit it to repetitive phrasing, limit its reasoning ability, and waste thousands of GPU hours. ...

May 22, 2026 · 7 min · Lê Tuấn Anh

Practical QLoRA Fine-tuning: Axolotl & Unsloth | SLM Playbook

Full-parameter fine-tuning of a large language model is a luxury. Even for an 8B model like Llama 3, updating all weights in 16-bit precision requires multi-GPU hardware that is out of reach for many mid-sized teams and startups. To lower this hardware barrier, Parameter-Efficient Fine-Tuning (PEFT) methods were developed, with LoRA and QLoRA emerging as the dominant approaches. They allow developers to train multi-billion-parameter models on a single 24 GB GPU (such as an RTX 3090, RTX 4090, or A10G) while maintaining near-zero performance degradation compared to full fine-tuning. ...

May 23, 2026 · 7 min · Lê Tuấn Anh