Blog | Anirudh Negi

May 20, 2026 Model Reliability

Canary Model Upgrades for AI Coding Agents Without Surprise Regressions

A practical guide to canarying model upgrades for AI coding agents with eval slices, fallback lanes, budget guardrails, and promotion scorecards so model refreshes do not quietly degrade code quality.

May 19, 2026 Agent Reliability

Freshness Guards for Long-Running AI Coding Agents

A practical guide to freshness guards for long-running AI coding agents using repo fingerprints, file watchlists, context TTLs, and revalidation gates so stale context does not quietly turn into bad patches.

May 18, 2026 MCP Compatibility

Versioning MCP Servers Without Breaking Every Agent Client

A practical guide to versioning MCP servers with compatibility contracts, capability negotiation, deprecation windows, and handshake tests so MCP upgrades stay boring instead of turning into client roulette.

May 17, 2026 Tooling Architecture

Tool Capability Manifests for AI Agents That Need to Pick the Right Tool

A practical guide to tool capability manifests for AI agents using side-effect metadata, cost hints, auth lanes, and planner scoring so planners stop treating every tool as equally safe, cheap, and reversible.

May 16, 2026 Reliability

Backpressure and Concurrency Caps for AI Agents That Would Otherwise DDoS Your Own Tools

A practical guide to backpressure for AI agents using queue budgets, concurrency caps, lease-aware workers, and overload signals so tool-calling systems stop amplifying latency into self-inflicted outages.

May 15, 2026 Refactoring

AST Codemods for AI Coding Agents That Need to Rewrite Large Codebases Safely

A practical guide to AST codemods for AI coding agents using syntax-aware transforms, fixture-led verification, and rollout checkpoints so large API migrations stay reviewable instead of devolving into regex cleanup.

May 14, 2026 Dependency Management

Dependency Update Lanes for AI Coding Agents Without Surprise Regressions

A practical guide to dependency update lanes for AI coding agents using risk tiers, SBOM diffs, focused verification, and staged promotion so automated upgrades stay fast without turning every version bump into reviewer roulette.

May 13, 2026 Reproducibility

Environment Manifests for AI Coding Agents That Reproduce the Bug You Meant to Fix

A practical guide to environment manifests for AI coding agents using pinned toolchains, seeded fixtures, and verifier snapshots so patches reproduce the real bug instead of fixing a local fiction.

May 12, 2026 Tool Reliability

Tool Error Handling Patterns for AI Agents That Fail Loudly and Recover Cleanly

Design AI agent tool calls to fail loudly, classify errors, retry safely, and surface human-actionable recovery paths instead of burying broken automation behind vague exceptions.

May 11, 2026 Verification

Speculative Patch Generation for AI Coding Agents That Stays Reviewable

A practical guide to speculative patch generation for AI coding agents using parallel candidate branches, verifier scorecards, and deterministic promotion rules so teams can explore multiple fixes without turning code review into chaos.

May 10, 2026 Hybrid AI Workflows

Hybrid Local and Cloud AI Dev Workflows Without Losing Control

A practical comparison of hybrid local and cloud AI development workflows using task routing, privacy lanes, local-first defaults, and explicit escalation rules so teams get faster iteration without turning every prompt into a cloud dependency.

May 9, 2026 AI Security

Secrets Management for AI Agents Without Handing Them the Kingdom

A practical guide to secrets management for AI agents using brokered access, short-lived credentials, OIDC, scoped injection, and audit trails that reduce blast radius without killing automation.

May 8, 2026 Caching

Semantic Caching for AI Coding Agents Without Serving Stale Bugs

A practical guide to semantic caching for AI coding agents using fingerprinted prompts, repo-aware invalidation, trust boundaries, and fallback verification so cached answers stay fast without becoming quietly wrong.

May 7, 2026 Verification

Verifier Pipelines for AI Coding Agents That Catch Bad Edits Early

A practical guide to verifier pipelines for AI coding agents using staged checks, AST-aware rules, policy gates, and escalation paths that catch risky edits before they become reviewer cleanup.

May 6, 2026 AI Security

Capability Leases for AI Agents That Need Temporary Access

A practical guide to capability leases for AI agents using short-lived grants, approval-bound scopes, and audit trails so powerful tools stay available without becoming always-on risk.

May 5, 2026 Reliability

Circuit Breakers for AI Agents That Touch Real Systems

A practical guide to stopping cascading failures in AI agents with circuit breakers, budget guards, and cooldown policies around model calls and external tools.

May 4, 2026 Agent Debugging

Trace-Driven Debugging for Multi-Step AI Agent Failures

How to debug multi-step AI agent failures with trace IDs, span annotations, replay packets, and failure buckets so incidents stop feeling like prompt necromancy.

May 3, 2026 AI Infrastructure

Dynamic Model Routing for AI Coding Agents Without Burning Budget

How to route coding-agent tasks between fast and expensive models using task scoring, verification, and escalation rules that reduce cost without quietly lowering quality.

May 2, 2026 Context Engineering

Usable Context vs Advertised Context for AI Coding Agents

A practical guide to why huge advertised context windows still fail in coding workflows, with concrete patterns for bounded context packets, rolling summaries, pinned files, and retrieval refresh loops.

May 1, 2026 Production AI

Shadow Mode Rollouts for AI Features Before You Let Them Act

A practical guide to shipping AI features safely with shadow traffic, side-by-side scoring, redaction, and promotion gates that expose failures before the model gets permission to act.

April 30, 2026 Testing

AI-Generated Tests That Actually Help, and Where They Quietly Fail

A practical guide to using AI-generated tests for fast coverage gains without trusting them blindly, including golden paths, mutation checks, flaky-test traps, and review rules that expose weak assertions.

April 25, 2026 Code Retrieval

Hybrid Code Retrieval for AI Coding Agents That Beats Full-Repo Prompt Stuffing

A practical guide to hybrid code retrieval for AI coding agents using embeddings, tree-sitter structure, ripgrep fallback, and reranking so prompts stay smaller and edits stay more accurate.

April 24, 2026 Local LLMs

Quantization Tradeoffs for Coding Models on Consumer Hardware

A practical guide to choosing 4-bit, 6-bit, 8-bit, and mixed quantization for coding models on consumer hardware, with concrete runtime tradeoffs for Ollama, llama.cpp, and vLLM.

April 23, 2026 Production Reliability

Agent-Safe Database Migrations with Expand-Contract and Rollback Checkpoints

A practical guide to shipping database changes with AI coding agents using additive schema changes, dual writes, bounded backfills, and verification steps that preserve rollback paths.

April 22, 2026 Multi-Agent Systems

Planner-Executor Handoffs for AI Agents That Stay Reviewable

A practical guide to planner-executor handoffs for AI agents using task manifests, claim checks, verification gates, and rollback-friendly execution so multi-agent workflows stay understandable under real load.

April 21, 2026 Automation

Cron-Based AI Automation That Does Not Become a Mess

A practical guide to running scheduled AI workflows with clean task payloads, isolated runs, idempotency guards, and approval-aware delivery so cron automation stays useful instead of becoming a haunted queue.

April 20, 2026 Agent Architecture

Session Isolation Patterns for Long-Lived AI Agents

A practical guide to isolating long-lived AI agents with session-scoped memory, worktrees, tool lanes, and approval boundaries so one task cannot quietly contaminate another.

April 19, 2026 AI Evaluation

Eval Harnesses for AI Coding Agents That Actually Catch Bad Patches

A practical guide to building eval harnesses for AI coding agents with patch tasks, invariant checks, sandboxed verification, and scorecards that expose bad fixes before they hit code review.

April 18, 2026 AI Security

Using GitHub Actions OIDC for AI Agents Without Long-Lived Cloud Keys

A practical guide to using GitHub Actions OIDC for AI agents so CI jobs can assume short-lived cloud roles for deploys, eval runs, and controlled automation without storing long-lived cloud secrets.

April 17, 2026 AI Security

Protecting AI Agents From Prompt Injection Through Tool Outputs

A practical guide to stopping prompt injection through tool outputs by treating external content as tainted data, using trust lanes, sanitizers, execution guards, and human review boundaries.

April 16, 2026 Context Engineering

Context Packets for Engineering Agents That Actually Reduce Bad Edits

A practical guide to building context packets for engineering agents with task manifests, repo maps, bounded evidence bundles, and relevance scoring that improve edit quality.

April 15, 2026 AI Agents

Approval Workflows for AI Agents That Can Actually Write Code

A practical guide to approval workflows for AI agents that can write code, open PRs, and call tools without turning every useful action into a security incident.

April 14, 2026 Local LLMs

Ollama vs llama.cpp vs vLLM for Local AI Development

A practical comparison of Ollama, llama.cpp, and vLLM for local AI development, covering setup friction, quantization, API compatibility, and where each runtime actually fits.

April 13, 2026 Tool Calling

Schema-First Tool Contracts for AI Agents That Fail Closed

A practical guide to building schema-first tool contracts for AI agents with JSON Schema, runtime validation, adapter layers, and fail-closed execution paths that keep tool use reliable and reviewable.

April 12, 2026 Reliability

Retry and Recovery Patterns for Long-Running AI Agent Jobs

A practical guide to making long-running AI agent jobs retryable and recoverable with idempotency keys, durable checkpoints, and safer human handoff paths.

April 11, 2026 MCP

MCP Transport Choices for Real Agent Workflows, stdio vs HTTP vs SSE

A practical guide to choosing between stdio, HTTP, and SSE for MCP servers, with deployment patterns, auth tradeoffs, and failure modes that matter in real agent systems.

April 10, 2026 Local LLMs

How to Run Claude Code with Unsloth Qwen3.5-35B-A3B on a Mac mini M4 Pro

A detailed step-by-step guide to running Claude Code with Unsloth Qwen3.5-35B-A3B on a Mac mini M4 Pro, including model choice, local inference setup, KV cache fixes, tuning, and practical tradeoffs.

April 10, 2026 Git

Git Worktrees for Parallel AI Coding Agents Without Repo Chaos

A practical guide to running multiple AI coding agents safely with Git worktrees, branch-per-task isolation, runtime guardrails, and merge discipline that keeps parallel work fast and reviewable.

April 9, 2026 Gemini CLI

Gemini CLI Terminal Debugging Workflows That Actually Shorten Incident Time

A practical guide to using Gemini CLI for terminal-first debugging with plan mode, shell tooling, focused context, and repeatable fix loops that keep production bug hunts fast and reviewable.

April 8, 2026 OpenClaw

Building Reusable OpenClaw Skills for Structured Repo Automation

A detailed guide to designing reusable OpenClaw skills with a scheduled repo worker example, including project-memory vs task-engine modes, repo-local state, PR packaging runs, and reviewer-in-the-loop automation.

April 6, 2026

Sandboxing AI Coding Agents Without Killing Developer Velocity

A practical guide to isolating AI coding agents with dev containers, read-only mounts, explicit network policy, and microVMs so they can stay useful without getting a blank check on your machine.

AI Coding Security Dev Containers Operations

April 5, 2026

Repo Playbooks for AI Coding Agents That Actually Change Behavior

A practical guide to using repo-specific instruction files like AGENTS.md and CLAUDE.md to make AI coding agents safer, more consistent, and much easier to review.

AI Coding Developer Workflow Agent Guardrails Prompt Engineering

April 4, 2026

Cursor Rules for Monorepos That Keep AI Edits on the Rails

A practical guide to using Cursor effectively in large monorepos with layered rules, scoped context, repo maps, verification loops, and guardrails that stop AI edits from wandering across the codebase.

Cursor Monorepos AI Coding Developer Workflow

April 3, 2026

Reviewing AI-Generated Pull Requests Without Becoming the Bottleneck

A practical guide to reviewing AI-generated pull requests with scoped diffs, invariant checklists, CI guardrails, and fast human approval loops.

AI Coding Code Review GitHub Developer Workflow

April 2, 2026

Prompt Caching for Tool-Heavy AI Agents That Need to Stay Fast

A practical guide to reducing latency and cost in tool-heavy AI agents with stable prompt prefixes, prompt caching, context compaction, and retrieval boundaries.

AI Agents Prompt Caching Cost Optimization Developer Workflow

April 1, 2026

Tracing AI Agents with OpenTelemetry Without Drowning in Logs

A practical guide to using OpenTelemetry and OpenInference to trace agent workflows, inspect tool calls, track token spend, and debug multi-step failures without drowning in flat logs.

AI Agents OpenTelemetry Observability LLM Ops

March 31, 2026

Local LLM Dev Environments That Stay Fast, Cheap, and Private

A practical guide to building a local LLM development environment with Ollama, Open WebUI, coding tools, quantization choices, prompt/version hygiene, and smart fallbacks when local models are not enough.

Local LLMs Ollama AI Coding Developer Workflow

March 30, 2026

Practical RAG Evaluation Techniques That Catch Real Failures

A practical guide to evaluating RAG systems by separating retrieval from generation, building grounded test sets, tracking the right metrics, and running regression checks before users find the bugs.

RAG Evaluation LLM Ops Developer Workflow

March 29, 2026

Agent Memory Strategies That Don't Rot in Production

A practical guide to designing memory for AI agents with working memory, episodic summaries, semantic facts, write policies, retrieval rules, and debugging loops that stay useful over time.

AI Agents Memory Developer Workflow LLM Ops

March 28, 2026

Claude Code Workflows for Large Refactors Without Losing the Plot

A practical guide to using Claude Code for large refactors with specs, checkpoints, verification loops, and guardrails that keep big changes understandable.

Claude Code Refactoring AI Coding Developer Workflow

March 27, 2026

How to Harden OpenClaw on a VPS Without Breaking the Good Parts

A practical guide to hardening OpenClaw on a VPS with least-privilege credentials, network restrictions, approval boundaries, cron hygiene, and operations checks that actually matter.

OpenClaw VPS Security Operations

March 26, 2026

Secure MCP Server Design Patterns for Real Tool-Calling Agents

A practical guide to building MCP servers that stay useful under real agent workloads without turning into a security liability. Covers tool boundaries, approval flows, validation, session isolation, prompt-injection defenses, and observability.

MCP AI Tools Security Developer Workflow

January 15, 2024

GameHub - Built with AI Assistance

How I built a collection of 8 browser games (Snake, Pong, Tic Tac Toe, Memory, Breakout, Flappy, 2048, and Sudoku) using Claude Code and Qwen for AI pair programming. Explores game loops, collision detection, AI opponents, and the collaborative development process.

AI-Assisted Games JavaScript HTML5

January 12, 2024

AI Tutor - Built with AI Assistance

How I built an AI-powered document-to-quiz generator that converts PDFs and textbooks into chapter-level quizzes. Covers document parsing, question generation, quiz engines, and using AI for educational technology.

AI-Assisted Education Python NLP

January 10, 2024

Mortgage Atlas - Built with AI Assistance

How I built a comprehensive mortgage cost calculator that models the true cost of homeownership. Includes financial formulas, amortization schedules, data visualization, and AI-assisted TypeScript development.

AI-Assisted Finance TypeScript Data Viz

January 8, 2024

Stock Analysis - Built with AI Assistance

How I built a comprehensive stock tracking and analysis tool with technical indicators, portfolio management, and visualizations. Covers API integration, RSI, MACD, Bollinger Bands, and AI pair programming for financial tools.

AI-Assisted Finance Python Data Analysis