8 Memory Layers
9 Storage Backends
13+ Framework Adapters
90% Cost Reduction
Why Chengeta
Agentic systems are expensive because they are forgetful. Chengeta AI is the layer that makes them remember.
Supported Frameworks
Drop in anywhere — no call signatures change.
LangChain · LangGraph · AutoGen · CrewAI · Agno · A2A · OpenAI · Anthropic · Gemini · Google ADK · LlamaIndex · OpenAI Agents · Claude Agent
Eight Memory Layers
Each layer preserves one kind of artifact, with serialization tuned to its data.
01 ResponseCache: LLM output by model + messages + params
02 EmbeddingCache: Vectors by model + text, stored as bytes
03 RetrievalCache: Documents by query + retriever + top-k
04 ContextCache: Conversation turns by session + index
05 SemanticCache: Answers for cosine-similar queries
06 AdaptiveSemanticCache: Semantic + auto-tuning threshold
07 StreamingResponseCache: Buffered stream replay as a generator
08 PromptCacheLayer: Provider cache_control + savings tracking
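To make the keying ideas above concrete, here is a minimal sketch of two of them: an exact-match key built from model + messages + params, and a cosine-similarity lookup with a fixed threshold. It is not Chengeta's implementation. The function names (`response_key`, `cosine`, `semantic_lookup`), the SHA-256 hashing, and the 0.9 default threshold are all assumptions chosen for illustration.

```python
import hashlib
import json
import math


def response_key(model, messages, params):
    """Derive a deterministic key from model + messages + params (hypothetical helper)."""
    # sort_keys makes the key independent of dict ordering in the request.
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_lookup(query_vec, entries, threshold=0.9):
    """Return the stored answer most similar to query_vec, or None below the threshold.

    `entries` is a list of (vector, answer) pairs; the threshold is an assumed default.
    """
    best_answer, best_score = None, threshold
    for vec, answer in entries:
        score = cosine(query_vec, vec)
        if score >= best_score:
            best_answer, best_score = answer, score
    return best_answer
```

An exact-match key only hits when the request is byte-for-byte equivalent after normalization, while the similarity lookup trades some precision for a higher hit rate. An adaptive variant would adjust `threshold` over time rather than fixing it.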
Performance Benchmarks
A warm cache turns network round-trips into local reads. Figures are representative of typical workloads.
Architecture
A request flows through the adapter and the middleware into the memory layers. On a hit, the stored result is returned without calling the provider. On a miss, the real call runs once, the result is preserved, and later matching requests are served from memory.
Framework Adapter (LangChain · CrewAI · OpenAI …) → Middleware (wrap any callable) → Memory Layers (8 cache layers) → Backend (KV or Vector store)
Miss → real API call → preserve in memory → return
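The hit/miss flow above can be sketched as a small decorator that wraps any callable. This is an illustration under stated assumptions, not Chengeta's API: the `remember` decorator, the plain `dict` standing in for a KV backend, and the `call_model` stub are all hypothetical, and the `repr`-based key is only suitable for simple arguments such as strings and numbers.

```python
import functools
import hashlib


def remember(store):
    """Wrap a callable so repeated calls with equal arguments are served from `store`.

    `store` is any dict-like object; here it stands in for a KV backend (assumption).
    """
    def decorator(fn):
        @functools.wraps(fn)  # keeps the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            # Build a key from the function name and its arguments.
            raw = repr((fn.__qualname__, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw.encode("utf-8")).hexdigest()
            if key in store:
                return store[key]        # hit: no real call is made
            result = fn(*args, **kwargs)  # miss: the real call runs once
            store[key] = result           # preserve in memory
            return result
        return wrapper
    return decorator


calls = []  # records how often the stand-in "API" is actually invoked


@remember(store={})
def call_model(prompt, temperature=0.0):
    # Stand-in for a real provider call (hypothetical).
    calls.append(prompt)
    return f"echo: {prompt}"


call_model("hello")  # miss: runs the stand-in call
call_model("hello")  # hit: served from the store
```

Because the wrapper accepts the same positional and keyword arguments as the function it decorates, callers do not need to change how they invoke it.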
Join the Community
Chengeta AI is built in the open. Bring your frameworks, your scale, and your ideas.