Kioxia's AiSAQ enhances AI inference efficiency while reducing DRAM costs
In the rapidly evolving world of Artificial Intelligence (AI), the focus is shifting from training massive, expensive foundation models to building cost-effective, scalable inference solutions. Kioxia, a leading memory solutions provider, is at the forefront of this transition: its latest open-source release of AiSAQ significantly improves the scalability and cost-effectiveness of Retrieval-Augmented Generation (RAG) AI inference systems.
Since 2017, Kioxia has been harnessing AI to improve the output of its NAND fabs, primarily through the use of machine vision. Now, the company is extending its AI capabilities to the realm of RAG, a novel approach that combines traditional information retrieval systems with large language models.
RAG solutions built on cloud vector databases, such as Azure Cosmos DB, allow Large Language Models (LLMs) to produce more accurate, contextually relevant, and up-to-date answers, especially when dealing with specific domains or real-time data. Kioxia's AiSAQ open-source project is designed to further enhance the performance of these systems.
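The retrieval step described above can be sketched in a few lines. This is a toy illustration of the RAG pattern, not Kioxia's or any vendor's implementation: the `embed` hashing embedder is a hypothetical stand-in for a real embedding model, and the retrieved documents would normally be passed to an LLM along with the question.

```python
# Minimal illustrative RAG retrieval step (toy example only).
# Documents are embedded, the query is matched against them, and the
# best hits are stitched into a prompt for an LLM.
import math

def embed(text, dim=16):
    """Toy hashing embedder standing in for a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, docs, k=2):
    """Return the k documents whose embeddings best match the query."""
    q = embed(query)
    scored = sorted(docs, key=lambda d: -sum(a * b for a, b in zip(q, embed(d))))
    return scored[:k]

docs = [
    "SSDs offer high capacity at low cost per gigabyte.",
    "DRAM is fast but expensive and limited in capacity.",
    "RAG augments LLM answers with retrieved context.",
]
context = retrieve("Why use SSDs for vector search?", docs)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: Why use SSDs?"
```

A production system would swap the hashing embedder for a trained model and the linear scan for an ANNS index of the kind AiSAQ provides.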
The key advantage of AiSAQ lies in its all-in-storage approach. Instead of relying on expensive DRAM, which is limited in capacity and can become a bottleneck in large-scale AI systems, AiSAQ's Approximate Nearest Neighbor Search (ANNS) algorithm operates directly on Solid State Drives (SSDs). This approach allows for larger vector databases, enabling the processing of vast datasets without the constraints of DRAM.
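The core idea of keeping vectors on the SSD rather than in DRAM can be sketched with a memory-mapped file: the operating system pages vector data in from storage on demand, so resident memory stays small regardless of database size. This is only an illustration of the storage placement; AiSAQ itself uses a DiskANN-style graph index with product quantization, not the brute-force scan shown here.

```python
# Sketch of "all-in-storage" vector search: vectors live in a file on
# the SSD and are memory-mapped, so DRAM holds only the pages currently
# being scanned. (Illustrative only, not AiSAQ's actual index.)
import numpy as np
import tempfile, os

DIM, N = 64, 10_000
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")

# Write a synthetic vector database to disk.
rng = np.random.default_rng(0)
rng.standard_normal((N, DIM), dtype=np.float32).tofile(path)

# Memory-map instead of loading: data is paged in from the SSD on demand.
db = np.memmap(path, dtype=np.float32, mode="r", shape=(N, DIM))

def nearest(query, k=5):
    """Exact top-k by L2 distance over the on-disk vectors."""
    d = np.linalg.norm(db - query, axis=1)
    return np.argsort(d)[:k]

ids = nearest(db[42])  # vector 42 is its own nearest neighbor
```

Real disk-resident indexes avoid even this full scan by following a search graph, touching only a few SSD pages per query.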
Moreover, AiSAQ offers flexible performance and capacity balancing. The latest update to AiSAQ includes controls that allow system architects to balance search performance (queries per second) against the number of vectors stored. This flexibility is crucial for tailoring RAG systems to specific workloads, ensuring optimal use of resources.
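The capacity side of that trade-off can be made concrete with a back-of-envelope model: the fewer bytes of index state each vector keeps resident in DRAM, the more vectors fit under a fixed DRAM budget, at the cost of more SSD reads per query. All numbers below are hypothetical illustrations, not measured AiSAQ figures.

```python
# Back-of-envelope model of the performance/capacity knob: holding more
# per-vector state in DRAM speeds up queries but caps how many vectors
# fit under a fixed DRAM budget. Numbers are hypothetical.
def vectors_that_fit(dram_gib, bytes_in_dram_per_vector):
    """Vectors that fit when each keeps this many bytes resident in DRAM."""
    return int(dram_gib * 2**30 // bytes_in_dram_per_vector)

# DiskANN-style setup: assume ~100 B of compressed codes per vector in DRAM.
fast_config = vectors_that_fit(64, 100)  # fewer vectors, faster queries
# All-in-storage setup: assume ~1 B of DRAM bookkeeping per vector.
big_config = vectors_that_fit(64, 1)     # ~100x more vectors, more SSD I/O
```

A system architect would pick a point on this curve per workload, which is exactly the kind of control the latest AiSAQ update exposes.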
In terms of cost-effectiveness, AiSAQ reduces the dependency on high-capacity DRAM required for large AI models. SSDs provide greater storage capacity at a lower cost compared to DRAM, making it more economical to scale up RAG systems. As an open-source project, AiSAQ also allows developers to access and modify the software freely, leading to community-driven improvements and customizations without the costs associated with proprietary software solutions.
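The cost argument is easy to quantify with rough media prices. The per-gibibyte figures below are assumptions for illustration only, not vendor quotes; the point is the order-of-magnitude gap between DRAM and SSD.

```python
# Rough cost comparison for hosting a large vector database in DRAM vs
# on SSD. Prices are illustrative assumptions, not quotes.
DRAM_USD_PER_GIB = 4.00   # assumed DRAM price
SSD_USD_PER_GIB = 0.08    # assumed SSD price

def media_cost(n_vectors, dim, usd_per_gib, bytes_per_dim=4):
    """Cost of storing n_vectors float32 vectors on the given medium."""
    gib = n_vectors * dim * bytes_per_dim / 2**30
    return gib * usd_per_gib

# One billion 768-dimensional float32 vectors: ~2.8 TiB of raw data.
n, dim = 10**9, 768
dram_cost = media_cost(n, dim, DRAM_USD_PER_GIB)
ssd_cost = media_cost(n, dim, SSD_USD_PER_GIB)
ratio = dram_cost / ssd_cost  # 50x cheaper on SSD under these assumptions
```

Under these assumed prices, keeping the vectors on SSD is fifty times cheaper than holding them in DRAM, before even counting the practical limits on how much DRAM a server can hold.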
The ability to fine-tune performance and capacity according to specific application needs means that resources are utilized more efficiently, reducing waste and optimizing the cost per query. Kioxia's AiSAQ project enhances the scalability and cost-effectiveness of RAG AI inference systems by leveraging SSDs, providing flexible performance tuning, and maintaining an open-source framework.
In early July, Kioxia released a new version of AiSAQ, offering further improvements and providing system architects with more flexible controls. This release marks another significant step forward in Kioxia's mission to drive innovation in AI technology.