Search at Scale: Quantifying the Performance Trade-offs of ANNS on Milvus
Efficient similarity search over high-dimensional embeddings is a core requirement of modern AI systems, yet most evaluations of Approximate Nearest Neighbor Search (ANNS) methods overlook the system-level dynamics that govern real-world performance. This study presents a comprehensive benchmark of six widely used ANNS algorithms — FLAT, IVF-FLAT, IVF-SQ8, HNSW, DiskANN, and ScaNN — integrated into Milvus and evaluated across both standalone and distributed (k3s-based) deployments. To capture modality-dependent behavior, the analysis spans two medium-sized datasets, each consisting of 1.6 million embeddings: a large Turkish text corpus and a visual dataset, both projected into a unified 768-dimensional representation space. Each algorithm is assessed using recall@10, latency, throughput, and tail latency (tp50–tp99), alongside detailed profiling of CPU utilization, RAM consumption, and disk I/O during indexing and querying. The results reveal distinct operational signatures: ScaNN and IVF-SQ8 achieve the highest throughput, particularly for image embeddings; HNSW offers superior recall but imposes greater memory pressure; and DiskANN provides stable, memory-efficient performance with a bounded recall ceiling for text. Distributed experiments further show that graph-based indexes benefit from multi-node parallelism, while disk-aware methods maintain consistent performance. Overall, the findings demonstrate that ANNS selection requires a holistic evaluation that accounts for both retrieval quality and system-level resource behavior under realistic workloads. [ABSTRACT FROM AUTHOR]
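The evaluation metrics named in the abstract are standard and easy to reproduce. A minimal sketch of how they might be computed is shown below, assuming the approximate index's top-k result ids can be compared against ground-truth ids from an exact (FLAT) search, and that per-query latencies are collected in milliseconds; the function names and array shapes are illustrative, not from the paper.

```python
import numpy as np

def recall_at_k(retrieved: np.ndarray, ground_truth: np.ndarray, k: int = 10) -> float:
    """Mean fraction of true top-k neighbors recovered by the approximate index.

    retrieved:    (n_queries, k) ids returned by the ANN index
    ground_truth: (n_queries, k) ids from an exact (brute-force) search
    """
    hits = sum(len(set(r[:k]) & set(g[:k])) for r, g in zip(retrieved, ground_truth))
    return hits / (len(ground_truth) * k)

def tail_latencies(latencies_ms) -> dict:
    """tp50/tp95/tp99 percentiles over per-query latencies, as in the abstract."""
    return {f"tp{p}": float(np.percentile(latencies_ms, p)) for p in (50, 95, 99)}

# Illustrative toy data: 2 queries, k=3.
approx = np.array([[1, 2, 3], [4, 5, 6]])
exact  = np.array([[1, 2, 9], [4, 7, 8]])
print(recall_at_k(approx, exact, k=3))        # 3 of 6 true neighbors found → 0.5
print(tail_latencies([1.0, 2.0, 3.0])["tp50"])  # median latency → 2.0
```

Recall is computed per query as a set intersection so that result ordering within the top k does not matter, which matches the usual recall@10 definition used in ANNS benchmarks.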
Copyright of International Journal of Software Engineering & Knowledge Engineering is the property of World Scientific Publishing Company and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)