Adaptive Semantic Cache + Chengeta AI

Auto-tune the similarity threshold from the observed hit rate, with no manual threshold tuning required.

Install

pip install 'chengeta-ai[vector-faiss]'

Example

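import zlib

import numpy as np
from chengeta_ai.backends.memory_backend import InMemoryBackend
from chengeta_ai.backends.vector_backend import FAISSBackend
from chengeta_ai.layers.adaptive_semantic_cache import AdaptiveSemanticCache

# Simple deterministic embedder for demo purposes only: the vectors carry no
# semantic meaning. zlib.crc32 gives a stable seed, whereas the built-in hash()
# of a str is randomized per process unless PYTHONHASHSEED is set.
def embed(text: str) -> np.ndarray:
    seed = zlib.crc32(text.encode("utf-8"))
    rng = np.random.default_rng(seed)
    return rng.random(128).astype(np.float32)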

cache = AdaptiveSemanticCache(
    exact_backend=InMemoryBackend(),
    vector_backend=FAISSBackend(dim=128),
    embed_fn=embed,
    threshold=0.90,         # starting threshold
    target_hit_rate=0.35,   # self-tune toward 35% hits
    adjustment_interval=10, # re-tune every 10 lookups
    threshold_min=0.70,
    threshold_max=0.99,
    max_turn_count=5,       # skip semantic for > 5-turn chats
)

# Populate
cache.set("What is the capital of France?", "Paris")
cache.set("What is semantic caching?", "Caching by meaning.")

# Exact hit
print(cache.get("What is the capital of France?"))  # "Paris"

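# Semantic hit (similar wording). Expect "Paris" only when the similarity is
# above the threshold, which in practice needs a real embedding model; the
# random demo embedder does not place similar sentences close together.
print(cache.get("Capital city of France?"))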

# Multi-turn guard: turn_count > max_turn_count skips the semantic lookup
print(cache.get("Tell me more", turn_count=6))  # None (skipped)

print(f"Current threshold: {cache.current_threshold:.3f}")
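The lookup order used in the example (exact match first, then a thresholded semantic match, with the multi-turn guard in between) can be illustrated with a minimal, dependency-free sketch. This is not Chengeta AI's implementation: the class `TinySemanticCache`, the `toy_embed` word-count embedder, the cosine-similarity metric, the linear scan in place of FAISS, and the choice to still serve exact matches when the guard triggers are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinySemanticCache:
    """Hypothetical illustration only; not the chengeta_ai class."""

    def __init__(self, embed_fn: Callable[[str], Vector],
                 threshold: float = 0.90, max_turn_count: int = 5) -> None:
        self.embed_fn = embed_fn
        self.threshold = threshold
        self.max_turn_count = max_turn_count
        self._exact: Dict[str, str] = {}
        self._vectors: List[Tuple[Vector, str]] = []

    def set(self, prompt: str, response: str) -> None:
        self._exact[prompt] = response
        self._vectors.append((self.embed_fn(prompt), response))

    def get(self, prompt: str, turn_count: int = 1) -> Optional[str]:
        # 1. Exact match on the prompt text.
        if prompt in self._exact:
            return self._exact[prompt]
        # 2. Multi-turn guard: long chats skip the semantic lookup (assumed order).
        if turn_count > self.max_turn_count:
            return None
        # 3. Semantic match: best cosine score must reach the threshold.
        query = self.embed_fn(prompt)
        best_score, best_response = -1.0, None
        for vec, response in self._vectors:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


# Toy embedder (assumption): counts a few keywords, so rephrasings overlap.
VOCAB = ["capital", "france", "semantic", "caching"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().replace("?", "").split()
    return [float(words.count(w)) for w in VOCAB]


demo = TinySemanticCache(embed_fn=toy_embed)
demo.set("What is the capital of France?", "Paris")
demo.set("What is semantic caching?", "Caching by meaning.")

print(demo.get("Capital city of France?"))                # semantic match
print(demo.get("Capital city of France?", turn_count=6))  # guard skips semantic
```

The guard sits before the embedding call, so skipped lookups also avoid the cost of computing an embedding.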

With real embeddings

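# Requires: pip install sentence-transformers
from chengeta_ai.backends.memory_backend import InMemoryBackend
from chengeta_ai.backends.vector_backend import FAISSBackend
from chengeta_ai.layers.adaptive_semantic_cache import AdaptiveSemanticCache
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dimensional vectors

cache = AdaptiveSemanticCache(
    exact_backend=InMemoryBackend(),
    vector_backend=FAISSBackend(dim=384),  # must match the embedding size
    embed_fn=lambda text: model.encode(text),
    target_hit_rate=0.35,
    max_turn_count=5,
)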

Why adaptive?

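With a fixed threshold, production semantic caches typically hit on only 20–45% of traffic. AdaptiveSemanticCache observes your actual hit rate and nudges the threshold automatically: lower when the hit rate falls below target_hit_rate (too many near-matches are being rejected), higher when it rises above the target (loose matches risk returning wrong answers). The threshold always stays within threshold_min and threshold_max.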
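One way to picture this feedback loop is a small controller that counts hits over a window and steps the threshold toward the target. The sketch below is an assumption about how such tuning could work, not the library's actual algorithm: the `ThresholdTuner` class, its `record` method, and the fixed `step` size are hypothetical, while the other parameter names mirror the constructor arguments shown in the example.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdTuner:
    """Hypothetical hit-rate controller; not the chengeta_ai implementation."""

    threshold: float = 0.90
    target_hit_rate: float = 0.35
    adjustment_interval: int = 10
    threshold_min: float = 0.70
    threshold_max: float = 0.99
    step: float = 0.01  # assumed fixed step size per adjustment
    _hits: int = field(default=0, init=False, repr=False)
    _lookups: int = field(default=0, init=False, repr=False)

    def record(self, hit: bool) -> None:
        """Record one lookup; re-tune once per adjustment_interval lookups."""
        self._lookups += 1
        self._hits += int(hit)
        if self._lookups < self.adjustment_interval:
            return
        hit_rate = self._hits / self._lookups
        if hit_rate < self.target_hit_rate:
            self.threshold -= self.step  # too few hits: accept looser matches
        elif hit_rate > self.target_hit_rate:
            self.threshold += self.step  # too many hits: demand closer matches
        # Clamp to the configured bounds, then start a fresh window.
        self.threshold = min(self.threshold_max,
                             max(self.threshold_min, self.threshold))
        self._hits = 0
        self._lookups = 0


tuner = ThresholdTuner()
for _ in range(10):
    tuner.record(hit=False)  # a window with a 0% hit rate
print(f"Threshold after a cold window: {tuner.threshold:.2f}")
```

A fixed step keeps the sketch simple; a real controller might scale the step by how far the observed hit rate is from the target.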
