QdrantBackend

Qdrant-based vector similarity backend. Fastest vector DB in 2026 (22ms p95 at 10M vectors). Supports both in-memory mode (no server required) and remote Qdrant Cloud / self-hosted.

Installation

pip install 'chengeta-ai[vector-qdrant]'

Usage

In-Memory (no server needed)

from chengeta_ai.backends.vector_backend import QdrantBackend
from chengeta_ai import SemanticCache, InMemoryBackend, CacheKeyBuilder

vector_backend = QdrantBackend(dim=1536)  # default: :memory:

cache = SemanticCache(
    exact_backend=InMemoryBackend(),
    vector_backend=vector_backend,
    embed_fn=lambda text: embed_model.encode(text),
    threshold=0.92,
)

Remote (Qdrant Cloud)

vector_backend = QdrantBackend(
    url="https://your-cluster.qdrant.io",
    api_key="your-api-key",
    collection="chengeta",
    dim=1536,
)

Self-Hosted

docker run -p 6333:6333 qdrant/qdrant

vector_backend = QdrantBackend(
    url="http://localhost:6333",
    collection="chengeta",
    dim=1536,
)

Why Qdrant?

Metric	Qdrant	FAISS (in-proc)	Chroma
p95 latency @ 10M vectors	22ms	~5ms (no server)	~80ms
True deletion support	✅	❌ (workaround)	✅
Persistent by default	✅	❌	✅
Distributed / multi-node	✅	❌	❌
Hybrid search (vector + BM25)	❌	❌	❌

Use Qdrant when you need production-grade persistence, true deletion, and sub-30ms latency at scale.

API Reference

Parameter	Type	Default	Description
`url`	`str`	`":memory:"`	Qdrant URL or `:memory:` for in-process
`collection`	`str`	`"chengeta"`	Qdrant collection name
`api_key`	`str \| None`	`None`	Qdrant Cloud API key
`dim`	`int`	`1536`	Embedding dimension

Installation​

Usage​

In-Memory (no server needed)​

Remote (Qdrant Cloud)​

Self-Hosted​

Why Qdrant?​

API Reference​