Latest deep dive
How I built a self-correcting agent that rewrites its own plan when it fails
Most agents fail silently. This post walks through the architecture of a reflective agent loop — with a critic node, dynamic re-planning, and escape hatches for when the LLM hallucinates a tool call. Full code included.
📚 Series — LangGraph Masterclass · 6 parts
01 · Why LangGraph beats LangChain for complex agents
02 · Stateful graphs — checkpoints, memory, and persistence
03 · Conditional edges and dynamic routing
04 · Human-in-the-loop interrupts and approvals
05 · Multi-agent supervisor patterns
06 · Deploying LangGraph agents to production on Azure
Tutorial · Dec 2024 · 14 min
Building a RAG pipeline that actually works in production — not just demos
Chunking strategies, embedding choices, reranking, and the 12 ways your retrieval silently fails under real load. Every lesson from shipping RAG to enterprise.
Breakdown · Dec 2024 · 11 min
I dissected OpenAI's Swarm framework — here's what's actually happening under the hood
Swarm looked like magic when it dropped. Three hours of reading source code later, here's the full breakdown — the handoff pattern, the context variable trick, and why it's both clever and limited.
Building an autonomous research agent with LangGraph, Tavily, and Claude — start to finish
Full build log: planning agent, web search tool, structured extraction, synthesis node, and a report generator. Every node, every edge, full code repo. This is the post I wish existed when I started.
LangGraph vs CrewAI in 2025 — an honest comparison after shipping both to production
Not a framework war. A real comparison from someone who's debugged both at 2am in production. Different tools for different problems — here's how to actually choose.
Tutorial · Oct 2024 · 16 min
Agent memory that actually persists — a practical guide to long-term memory patterns
Semantic memory, episodic memory, procedural memory — the three memory types your agent needs, and how to implement each using vector stores, SQL, and simple key-value patterns.
Breakdown · Oct 2024 · 9 min
The ARC-AGI-3 benchmark — what it tests and why current agents keep failing it
New reasoning benchmark, same old problem: fluid intelligence. Breaking down what it measures, how frontier models score, and what it reveals about the ceiling of current agentic approaches.
Tool calling patterns that scale — how to design agent tools without shooting yourself in the foot
Naming conventions, error-handling contracts, idempotency, streaming vs. blocking: a field guide to tool design for engineers building agents. With examples from a real production system.