AI Tool

Milvus

Open-source cloud-native vector database built for billion-scale similarity search

Milvus is a high-performance vector database, documented at milvus.io/docs, for storing, indexing, and searching embedding vectors with metadata filtering and hybrid search. It ships in three deployment modes, described at milvus.io/docs/v2.6.x/install-overview: Milvus Lite (`pip install pymilvus`, for notebooks and edge devices), Milvus Standalone (a single Docker image), and Milvus Distributed on Kubernetes. Official SDKs include PyMilvus (Python), Go, Java, Node.js, and C#; Zilliz Cloud offers managed Milvus. The architecture separates access, coordinator, worker, and storage layers and uses object storage backends (MinIO, S3, Azure Blob), per milvus.io/docs/architecture_overview.

Category Developer Tools
Pricing Open-source (Apache-2.0) + Zilliz Cloud managed Milvus (see zilliz.com/cloud)
Platforms Docker / Kubernetes / Python / Cloud / Edge
vector-database, semantic-search, hybrid-search

Use cases

  • Production RAG catalogs at billion-vector scale on Kubernetes
  • Recommendation systems combining vector similarity with structured filters
  • Notebook prototyping with Milvus Lite then migrating to Standalone/Distributed
  • Agent memory layers paired with zilliztech/mcp-server-milvus
  • Multimodal embedding search when combined with external embedders
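To make "vector similarity with structured filters" concrete, here is a minimal conceptual sketch in plain Python. It is not how Milvus implements search: it scans every item exhaustively instead of using an ANN index, and the catalog data, field names, and `filtered_search` helper are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_search(items, query, predicate, limit=2):
    """Apply the structured filter first, then rank survivors by similarity."""
    candidates = [item for item in items if predicate(item["meta"])]
    scored = [(cosine_similarity(item["vector"], query), item["id"]) for item in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:limit]]


# Hypothetical product catalog: an embedding plus structured metadata per item.
CATALOG = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"category": "shoes", "in_stock": True}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"category": "shoes", "in_stock": False}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"category": "shoes", "in_stock": True}},
    {"id": "d", "vector": [1.0, 0.1], "meta": {"category": "hats", "in_stock": True}},
]

results = filtered_search(
    CATALOG,
    query=[1.0, 0.0],
    predicate=lambda meta: meta["category"] == "shoes" and meta["in_stock"],
)
print(results)
```

Items "b" and "d" are close to the query but are excluded by the filter, which is the behavior a recommendation system typically wants when stock or category constraints are hard requirements.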

Key features

  • HNSW, DiskANN, and other ANN indexes with scalar/JSON metadata filtering
  • Milvus Lite, Standalone, and Distributed deployment modes
  • Hybrid dense-sparse search and multi-vector support in recent releases
  • PyMilvus MilvusClient API for collections, insert, search, and query
  • LF AI & Data Foundation project with Zilliz as core maintainer
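As an illustration of what hybrid dense-sparse search has to do, the sketch below merges two ranked result lists with reciprocal rank fusion (RRF). RRF is one common fusion method; this page does not say which fusion strategy a given Milvus setup uses, so treat the choice of RRF, the constant `k=60`, and the document IDs as assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) search and a sparse (keyword) search.
dense_hits = ["doc1", "doc2", "doc3"]
sparse_hits = ["doc3", "doc1", "doc4"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists ("doc1", "doc3") rise above those found by only one retriever, without requiring the two score scales to be comparable.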

Who Is It For?

  • ML engineers operating large-scale vector search infrastructure
  • Platform teams evaluating open-source alternatives to single-vendor vector clouds
  • Developers prototyping locally with Milvus Lite before production rollout

Frequently Asked Questions

Is Milvus the same as Zilliz Cloud?
Milvus is the open-source project; Zilliz Cloud is the fully managed service built on Milvus.
Which Python client should I use?
The docs recommend PyMilvus with the MilvusClient API for current releases; see milvus.io/docs and the PyMilvus documentation.
How do agents connect?
Zilliz maintains mcp-server-milvus (documented at milvus.io/docs/milvus_and_mcp) for MCP clients.
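For the Python client question above, here is a minimal sketch of a MilvusClient session against Milvus Lite. It assumes a `pymilvus` installation with Milvus Lite support; the collection name, field names, file path, and sample vectors are hypothetical, and the import sits inside `run_demo` so the data helper can be used without the library installed.

```python
def build_rows(vectors, subjects):
    """Pair each vector with an integer id and a metadata field."""
    return [
        {"id": i, "vector": vec, "subject": subj}
        for i, (vec, subj) in enumerate(zip(vectors, subjects))
    ]


# Hypothetical 4-dimensional embeddings with one metadata field each.
ROWS = build_rows(
    [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.5]],
    ["history", "biology", "history"],
)


def run_demo(uri="./milvus_demo.db"):
    """Create a collection, insert ROWS, and run a filtered search."""
    # Lazy import: assumes pymilvus (with Milvus Lite) is installed.
    from pymilvus import MilvusClient

    client = MilvusClient(uri)  # a local file path is assumed to select Milvus Lite
    name = "demo_collection"  # hypothetical collection name
    if client.has_collection(collection_name=name):
        client.drop_collection(collection_name=name)
    client.create_collection(collection_name=name, dimension=4)
    client.insert(collection_name=name, data=ROWS)
    return client.search(
        collection_name=name,
        data=[[0.1, 0.2, 0.3, 0.4]],
        limit=2,
        filter='subject == "history"',
        output_fields=["subject"],
    )
```

Calling `run_demo()` should return the nearest "history" rows. Pointing `uri` at a Standalone or Distributed server address instead of a file path is the usual migration route, but check the install overview for the exact connection settings.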

Related

3 Indexed items

Weaviate

Developer Tools, Open source

Weaviate is an open-source vector database, documented at docs.weaviate.io/weaviate, for storing objects and vector embeddings with semantic, keyword, and hybrid search, RAG, reranking, and agent workflows. The ecosystem includes self-hosted Docker/Kubernetes installs, Weaviate Cloud (console.weaviate.cloud), Query Agent, and Weaviate Embeddings for managed inference. Client libraries include Python (`weaviate-client` v4, requires Weaviate 1.23.7+), TypeScript, Go, and Java, with REST, gRPC, and GraphQL APIs per the official documentation.

Qdrant

Developer Tools, Open source

Qdrant is an AI-native vector search engine, documented at qdrant.tech/documentation, for storing, indexing, and querying high-dimensional vectors with optional payloads; it supports dense, sparse, and multi-vector configurations. Official guides cover Docker/Kubernetes self-hosting, Qdrant Cloud on AWS/GCP/Azure, Hybrid Cloud, Private Cloud, and Qdrant Edge for embedded retrieval. Client libraries include Python (`qdrant-client`), JavaScript/TypeScript (`@qdrant/js-client-rest`), Rust, Go, Java, and .NET, with REST and gRPC APIs per the API reference at api.qdrant.tech.

Chroma

Developer Tools, Open source

Chroma is an open-source embedding database, documented at docs.trychroma.com, for storing and querying vectors, metadata, and full-text fields from Python and JavaScript clients. Official guides cover ephemeral in-memory collections, persistent local storage, self-hosted server deployments, and Chroma Cloud at trychroma.com with authentication tokens. The docs describe collection CRUD, the `add`/`query`/`get`/`update`/`delete` APIs, embedding functions (default and third-party), hybrid search, and multitenancy patterns for RAG and agent memory workloads, per the documentation index.