
QdrantBackend

Vector similarity backend built on Qdrant, reported here as the fastest vector database in 2026 (22 ms p95 latency at 10M vectors). It supports in-memory mode (no server required) as well as remote Qdrant Cloud and self-hosted deployments.

Installation

pip install 'chengeta-ai[vector-qdrant]'

Usage

In-Memory (no server needed)

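from chengeta_ai.backends.vector_backend import QdrantBackend
from chengeta_ai import SemanticCache, InMemoryBackend, CacheKeyBuilder

# With no url given, the backend runs in-process (url defaults to ":memory:").
vector_backend = QdrantBackend(dim=1536)

# embed_model is not defined on this page: it stands for an embedding model
# you have already loaded whose encode(text) returns a 1536-dimensional vector.
cache = SemanticCache(
    exact_backend=InMemoryBackend(),
    vector_backend=vector_backend,
    embed_fn=lambda text: embed_model.encode(text),
    threshold=0.92,
)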
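The `threshold=0.92` argument is easiest to reason about as a similarity cutoff between a query embedding and a cached one. The sketch below assumes the score is cosine similarity, which this page does not confirm. `cosine_similarity` and `is_semantic_hit` are hypothetical helper names for illustration, not part of chengeta-ai.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_semantic_hit(
    query_vec: Sequence[float],
    cached_vec: Sequence[float],
    threshold: float = 0.92,  # same value as in the usage example (assumed cosine)
) -> bool:
    """True when the two embeddings are at least `threshold` similar."""
    return cosine_similarity(query_vec, cached_vec) >= threshold


query = [1.0, 0.0]
print(is_semantic_hit(query, [0.9, 0.1]))  # True (similarity is about 0.994)
print(is_semantic_hit(query, [0.0, 1.0]))  # False (orthogonal vectors score 0.0)
```

Under this assumption, raising the threshold trades cache hit rate for stricter matching, and lowering it does the opposite.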

Remote (Qdrant Cloud)

vector_backend = QdrantBackend(
    url="https://your-cluster.qdrant.io",
    api_key="your-api-key",
    collection="chengeta",
    dim=1536,
)
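Hard-coding an API key is rarely desirable outside a quick test. A minimal sketch of reading the connection settings from the environment follows. The variable names (`QDRANT_URL`, `QDRANT_API_KEY`, `QDRANT_COLLECTION`, `QDRANT_DIM`) and the helper `qdrant_kwargs_from_env` are assumptions made for this example and are not defined by chengeta-ai. The fallback values mirror the defaults listed in the API Reference.

```python
import os
from typing import Any, Dict, Mapping, Optional


def qdrant_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build QdrantBackend keyword arguments from environment variables.

    The variable names are hypothetical; adapt them to your deployment.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {
        "url": env.get("QDRANT_URL", ":memory:"),
        "collection": env.get("QDRANT_COLLECTION", "chengeta"),
        "dim": int(env.get("QDRANT_DIM", "1536")),
    }
    api_key = env.get("QDRANT_API_KEY")
    if api_key:
        # Only pass the key when one is set, so local runs stay keyless.
        kwargs["api_key"] = api_key
    return kwargs


example_env = {
    "QDRANT_URL": "https://your-cluster.qdrant.io",
    "QDRANT_API_KEY": "your-api-key",
}
print(qdrant_kwargs_from_env(example_env))
```

The resulting dictionary can then be unpacked into the constructor, for example `QdrantBackend(**qdrant_kwargs_from_env())`.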

Self-Hosted

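# Shell: start a local Qdrant server on port 6333
docker run -p 6333:6333 qdrant/qdrant

# Python: connect to the local server (no API key needed)
vector_backend = QdrantBackend(
    url="http://localhost:6333",
    collection="chengeta",
    dim=1536,
)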
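The container can take a moment to start accepting requests, so it may help to wait for it before constructing the backend. The sketch below polls Qdrant's HTTP readiness endpoint, assumed here to be `/readyz` as in recent Qdrant releases, using only the standard library. `readiness_url`, `is_qdrant_ready`, and `wait_for_qdrant` are hypothetical helper names, not part of chengeta-ai or qdrant-client.

```python
import time
import urllib.error
import urllib.request


def readiness_url(base_url: str) -> str:
    """Append the readiness path to the server's base URL."""
    return base_url.rstrip("/") + "/readyz"


def is_qdrant_ready(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(readiness_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: not ready yet.
        return False


def wait_for_qdrant(
    base_url: str = "http://localhost:6333",
    attempts: int = 10,
    delay: float = 0.5,
) -> bool:
    """Poll the readiness probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_qdrant_ready(base_url):
            return True
        time.sleep(delay)
    return False
```

A typical use would be to call `wait_for_qdrant()` right after `docker run` and only create the `QdrantBackend` once it returns `True`.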

Why Qdrant?

| Metric | Qdrant | FAISS (in-proc) | Chroma |
| --- | --- | --- | --- |
| p95 latency @ 10M vectors | 22ms | ~5ms (no server) | ~80ms |
| True deletion support | | ❌ (workaround) | |
| Persistent by default | | | |
| Distributed / multi-node | | | |
| Hybrid search (vector + BM25) | | | |

Use Qdrant when you need production-grade persistence, true deletion, and sub-30ms latency at scale.


API Reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | str | ":memory:" | Qdrant URL or :memory: for in-process |
| collection | str | "chengeta" | Qdrant collection name |
| api_key | str \| None | None | Qdrant Cloud API key |
| dim | int | 1536 | Embedding dimension |
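Because `dim` must match the length of the vectors your embedding function produces, a mismatch is a likely source of errors. One way to catch it early is to wrap the embedding function with a length check, as in this sketch. `checked_embed_fn` and `toy_embed` are hypothetical names for illustration. The page does not say whether chengeta-ai performs such a check itself.

```python
from typing import Callable, List, Sequence


def checked_embed_fn(
    embed_fn: Callable[[str], Sequence[float]],
    dim: int = 1536,  # keep in sync with the dim passed to QdrantBackend
) -> Callable[[str], List[float]]:
    """Wrap embed_fn so that vectors of the wrong length fail fast."""

    def wrapper(text: str) -> List[float]:
        vector = list(embed_fn(text))
        if len(vector) != dim:
            raise ValueError(
                f"embedding has {len(vector)} dimensions, expected {dim}"
            )
        return vector

    return wrapper


def toy_embed(text: str) -> List[float]:
    """Stand-in embedder for the demo; a real model would go here."""
    return [float(len(text)), 1.0, 0.0]


safe_embed = checked_embed_fn(toy_embed, dim=3)
print(safe_embed("hello"))  # [5.0, 1.0, 0.0]
```

The wrapped function can be passed as `embed_fn` in place of the original, so a model swap that changes the embedding size raises immediately instead of surfacing later inside the vector store.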