Mundher's blog

Latest articles

How to Build Privacy into LLM Agents Without Breaking Their Brains
I've been spending a lot of time lately building autonomous agents with ReAct and RAG frameworks. The shift from stateless chatbots to stateful systems that can actually execute tools is incredibly us
Jul 19, 2026 · 3 min read
Stacking Agent Memory: Checkpoints, Status Boards, and Active Context
I love the idea of autonomous coding agents. But one of the quickest ways to hit a wall when setting them up is the infinite context problem. Your context window is finite, but the work you want the a
Jul 4, 2026 · 3 min read
Mundher's blog: 28 posts by Mundher Al-Shabi
Smart LLM Routing
Building LLM apps is easy, but scaling them without setting a pile of money on fire is hard. You really don't need the massive brainpower of GPT-5 for every single user query. Routing is how we fix th
Jun 26, 2026 · 4 min read
Using Simulators to Evaluate Multi-Turn AI Agents
Building a multi-turn conversational AI is surprisingly easy right now. Evaluating it is incredibly hard. For single-turn tasks, a standard static dataset works fine: you just feed in a prompt and ass
Jun 19, 2026 · 4 min read
Why Grep Won't Save Your RAG Pipeline
I’ve been reading through a recent paper titled "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search". It’s a provocative piece with a premise I normally love. The authors claim that simp
Jun 9, 2026 · 3 min read
Harnessing Conversational AI
I’ve been spending the last few weeks messing around with open-weight models to build conversational interfaces. By now, the new reality is obvious: generating natural language is no longer the bottle
May 24, 2026 · 3 min read
Vectorless RAG
If you’ve built anything with LLMs in the past couple of years, you’ve probably wired up a Retrieval-Augmented Generation (RAG) pipeline. The playbook is burned into our brains: take a PDF, smash it i
May 17, 2026 · 5 min read
Thoughts on Advanced Chunking Strategies for RAG
I’ve been thinking a lot recently about the "chunking problem" in Retrieval-Augmented Generation. If you've played around with the llm CLI tool or built anything with vector embeddings, you've probabl
May 4, 2026 · 3 min read
All You Need is a Good Chunking
If you’ve spent any time building Retrieval-Augmented Generation (RAG) prototypes, you inevitably hit the exact same wall. You wire up a great embedding model, point it at an excellent local LLM, and
May 3, 2026 · 4 min read

© 2026 Mundher's blog
