Blog

Context Window Engineering Implementation: A Production Guide to All Five Anthropic Techniques

An implementation guide to all five of Anthropic's context window engineering techniques: prompt caching, tool search, programmatic tool calling, compaction, and the advisor strategy. Real SDK code included.

Context Window Engineering: How Anthropic Thinks About Production AI Agents

Brad Abrams, Anthropic's Head of Product for the Claude Platform, shares three context window engineering techniques that cut agent costs by 90% and boost model intelligence. Here is what I took away.

We Cut 63% of Our LLM Costs by Sending Less Context

We moved from 128K to 32K context windows. Costs dropped 63%. Accuracy went up. Here is how metadata filtering, re-ranking, and dynamic budgets do it.

Claude Code Routines: A Deep Dive

Claude Code Routines let you chain tool calls into reusable agentic workflows. Here is how the desktop redesign, multi-session support, and 1-hour prompt caching reshape daily work.

Prompt Caching in Agno — Four Rules

Four rules for cache-aware agents in Agno. A single parameter was costing me 1.25× on every turn. Here is what I learned building with Agno and Claude.

LangGraph Prompt Caching: Patterns and Anti-Patterns

LangGraph prompt caching patterns, the 6 anti-patterns that kill cache hit rates, and how to monitor caching in production agents.

The Static-First Prompt Architecture

Prompt caching is not a feature you toggle on. It is an architectural constraint. Here is the layered structure and 4-breakpoint strategy that makes it work reliably.