Deployment
How to run chengeta-ai in production — Docker, Docker Compose, CI/CD, and a production checklist.
Docker
A Dockerfile is included in the repo root. It uses a slim Python 3.12 image with uv.
Build and run
docker build -t chengeta-ai .
# Show CLI help
docker run --rm chengeta-ai
# Stats (reads CHENGETA_* env vars)
docker run --rm \
-e CHENGETA_BACKEND=memory \
chengeta-ai uv run python -m chengeta_ai stats
With Redis backend
docker run --rm \
-e CHENGETA_BACKEND=redis \
-e CHENGETA_REDIS_URL=redis://host.docker.internal:6379/0 \
chengeta-ai uv run python -m chengeta_ai stats
Docker Compose (Redis + your app)
# docker-compose.yml
version: "3.9"
services:
redis:
image: redis:7-alpine
ports:
- "6379:6379"
command: redis-server --maxmemory 512mb --maxmemory-policy allkeys-lru
app:
build: .
environment:
CHENGETA_BACKEND: redis
CHENGETA_REDIS_URL: redis://redis:6379/0
CHENGETA_NAMESPACE: myapp
CHENGETA_DEFAULT_TTL: "3600"
CHENGETA_TTL_RESPONSE: "600"
CHENGETA_TTL_EMBEDDING: "86400"
depends_on:
- redis
docker compose up -d
GitHub Actions CI
The repo ships with CI/CD workflows in .github/workflows/:
| Workflow | Trigger | What it does |
|---|---|---|
ci.yml | push/PR to main | Tests on 6 matrix legs (3 OS × Python 3.12/3.13), pre-commit |
publish.yml | GitHub Release published | Build → TestPyPI → PyPI (OIDC) |
release.yml | git push tag v* | Build wheel, GitHub Release, PyPI via PYPI_TOKEN |
No changes needed — CI runs automatically on every push.
Environment Variables
All settings are driven by CHENGETA_* environment variables. See Settings for the full list.
# Minimal production config
CHENGETA_BACKEND=redis
CHENGETA_REDIS_URL=redis://your-redis:6379/0
CHENGETA_NAMESPACE=myapp-prod
CHENGETA_DEFAULT_TTL=3600
Load in Python:
from chengeta_ai import CacheManager, ChengetaSettings
manager = CacheManager.from_settings(ChengetaSettings.from_env())
Production Checklist
Backend
- Use
RedisBackendorTieredBackend(notInMemoryBackend) — survives restarts, shared across workers - Set
maxmemoryandmaxmemory-policy allkeys-lruin Redis config - For high read volume:
TieredBackend(l1=InMemoryBackend(), l2=RedisBackend(), l1_ttl=300)
Serialization & Compression
- Use
JsonSerializerif Redis is shared across services (safer than pickle) - Enable
GzipCompressor(level=6)for large LLM responses stored in Redis
TTLs
- Set per-layer TTLs appropriate to your data freshness requirements:
CHENGETA_TTL_RESPONSE=600(10 min — LLM responses)CHENGETA_TTL_EMBEDDING=86400(24 h — embeddings rarely change)CHENGETA_TTL_RETRIEVAL=3600(1 h)CHENGETA_TTL_CONTEXT=1800(30 min — session data)
Observability
- Export metrics to Prometheus or OTEL — see Observability
- Monitor
hit_rate— target ≥ 0.3 in production - Alert on
evictionsspike — means cache is too small or TTLs too long
Security
- Scope
PYPI_TOKENto the specific package - Use
JsonSerializerif the Redis instance is accessible by untrusted clients - Rotate API keys; never commit
.envfiles
Cache Warming
- Run
CacheWarmer.warm_from_file("hot_queries.csv", ...)on startup for known high-traffic queries — see CacheWarmer