AnthropicCacheAdapter
Wraps anthropic.Anthropic and anthropic.AsyncAnthropic messages.create calls with response caching. Returns cached results for identical (model, messages, system, params) combinations.
Installation
pip install 'chengeta-ai[anthropic]'
Usage
Sync
import anthropic
from chengeta_ai import CacheManager, InMemoryBackend, CacheKeyBuilder
from chengeta_ai.adapters.anthropic_adapter import AnthropicCacheAdapter
client = anthropic.Anthropic()
manager = CacheManager(
backend=InMemoryBackend(),
key_builder=CacheKeyBuilder(namespace="myapp"),
)
adapter = AnthropicCacheAdapter(client, manager)
response = adapter.messages_create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain vector embeddings"}],
)
# Second call — cache hit, no API cost
response = adapter.messages_create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain vector embeddings"}],
)
Async
client = anthropic.AsyncAnthropic()
adapter = AnthropicCacheAdapter(client, manager)
response = await adapter.amessages_create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain vector embeddings"}],
)
With system prompt
response = adapter.messages_create(
model="claude-sonnet-4-6",
max_tokens=1024,
system="You are a concise technical writer.",
messages=[{"role": "user", "content": "What is RAG?"}],
)
Combine with PromptCacheLayer
For long system prompts, use PromptCacheLayer alongside AnthropicCacheAdapter:
AnthropicCacheAdaptercaches the full response (eliminates entire API calls)PromptCacheLayerinjectscache_controlon system prompts (saves tokens within live calls)
Invalidate by model
adapter.invalidate_model("claude-sonnet-4-6")
API Reference
| Method | Description |
|---|---|
messages_create(**kwargs) | Cached client.messages.create() (sync) |
amessages_create(**kwargs) | Cached client.messages.create() (async) |
invalidate_model(model) | Invalidate all cached responses for a model |