Amazon OpenSearch Service has offered vector database capabilities to customers since 2019, enabling efficient vector similarity searches using specialized k-nearest neighbor (k-NN) indexes. This functionality has supported various use cases such as semantic search, Retrieval Augmented Generation (RAG) with large language models (LLMs), and rich media searching. With the explosion of AI capabilities and the increasing creation of generative AI applications, customers are looking for vector databases with rich feature sets.
OpenSearch Service also offers a multi-tiered storage solution to its customers in the form of UltraWarm and Cold tiers. UltraWarm provides cost-effective storage for less-active data with query capabilities, though with higher latency compared to hot storage. The Cold tier offers even lower-cost archival storage for detached indexes that can be reattached when needed. Moving data to UltraWarm makes it immutable, which aligns well with use cases where data updates are infrequent, like log analytics.
Until now, there was a limitation where the UltraWarm and Cold storage tiers couldn't store k-NN indexes. As customers adopt OpenSearch Service for vector use cases, we've observed that they're facing high costs because memory and storage become bottlenecks for their workloads.
To provide similar cost-saving economics for larger datasets, we now support k-NN indexes in both the UltraWarm and Cold tiers. This can help you save costs, especially for workloads where:
A significant portion of your vector data is accessed less frequently (for example, historical product catalogs, archived content embeddings, or older document repositories)
You need isolation between frequently and infrequently accessed workloads, minimizing the need to scale hot tier instances to help prevent interference from indexes that can be moved to the warm tier
In this post, we discuss this new capability and its use cases, and provide a cost-benefit analysis across different scenarios.
New capability: k-NN indexes in UltraWarm and Cold tiers
You can now enable the UltraWarm and Cold tiers for your k-NN indexes starting with OpenSearch Service version 2.17. This feature is available for both new domains and existing domains upgraded to version 2.17. k-NN indexes created on OpenSearch Service version 2.x or later are eligible for migration to the warm and cold tiers. k-NN indexes using any of the supported engines (FAISS, NMSLib, and Lucene) are eligible to migrate.
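To make this concrete, the following is a minimal sketch of what an eligible index and the tier-migration calls look like. The index name, field name, and HNSW parameters are illustrative assumptions; the `_ultrawarm/migration` request paths follow the OpenSearch Service UltraWarm API, but verify them against the current documentation for your version.

```python
# Sketch: a Lucene-engine k-NN index eligible for tier migration, plus the
# UltraWarm migration request paths. Index and field names are hypothetical.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {"name": "hnsw", "engine": "lucene", "space_type": "l2"},
            }
        }
    },
}

def warm_migration_request(index_name: str) -> tuple:
    """Request line that moves an index from the hot tier to UltraWarm."""
    return ("POST", f"/_ultrawarm/migration/{index_name}/_warm")

def hot_migration_request(index_name: str) -> tuple:
    """Request line that moves an UltraWarm index back to the hot tier."""
    return ("POST", f"/_ultrawarm/migration/{index_name}/_hot")

print(warm_migration_request("my-knn-index"))
```

You would send `index_body` when creating the index, then issue the migration request once the index meets your criteria for warm storage.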
Use cases
This multi-tiered approach to k-NN vector search benefits a diverse set of use cases:
Long-term semantic search – Maintain searchability on years of historical text data for legal, research, or compliance purposes
Evolving AI models – Store embeddings from multiple versions of AI models, allowing comparisons and backward compatibility without the cost of keeping all data in hot storage
Large-scale image and video similarity – Build extensive libraries of visual content that can be searched efficiently, even as the dataset grows beyond the practical limits of hot storage
Ecommerce product recommendations – Store and search through vast product catalogs, moving less popular or seasonal items to cheaper tiers while maintaining search capabilities
Let's explore real-world scenarios to illustrate the potential cost benefits of using k-NN indexes with the UltraWarm and Cold storage tiers. We use us-east-1 as the representative AWS Region for these scenarios.
Scenario 1: Balancing hot and warm storage for mixed workloads
Let's say you have 100 million vectors of 768 dimensions (around 330 GB of raw vectors) spread across 20 Lucene engine indexes of 5 million vectors each (approximately 16.5 GB), out of which 50% of the data (about 10 indexes, or 165 GB) is queried infrequently.
Domain setup without UltraWarm support
In this approach, you prioritize maximum performance by keeping all the data in hot storage, providing the fastest possible query responses for the vectors. You deploy a cluster with 6x r6gd.4xlarge instances.
The monthly cost for this setup comes to $7,550 per month, with a data instance cost of $6,700.
Although this provides top-tier performance for the queries, it can be over-provisioned given the mixed access patterns of your data.
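As a rough check of the sizing above, raw vector size is simply vector count × dimensions × 4 bytes for float32 values; the post's ~330 GB and ~16.5 GB figures additionally include index overhead on top of the raw data. A minimal sketch:

```python
# Estimate raw storage for float32 vectors: count * dimensions * 4 bytes.
def raw_vector_gb(num_vectors: int, dimensions: int, bytes_per_dim: int = 4) -> float:
    return num_vectors * dimensions * bytes_per_dim / 1e9

total_gb = raw_vector_gb(100_000_000, 768)    # all 20 indexes combined
per_index_gb = raw_vector_gb(5_000_000, 768)  # one 5M-vector index

print(round(total_gb, 1), round(per_index_gb, 1))  # → 307.2 15.4
```

The gap between ~307 GB raw and the ~330 GB quoted in the scenario is the assumed per-index structural overhead.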
Cost-saving strategy: UltraWarm domain setup
In this approach, you align your storage strategy with the observed access patterns, optimizing for both performance and cost. The hot tier continues to provide optimal performance for frequently accessed data, while less critical data moves to UltraWarm storage.
Although UltraWarm queries experience higher latency compared to hot storage, this trade-off is often acceptable for less frequently accessed data. Additionally, because UltraWarm data becomes immutable, this strategy works best for stable datasets that don't require updates.
You keep the frequently accessed 50% of the data (approximately 165 GB) in hot storage, allowing you to reduce your hot tier to 3x r6gd.4xlarge.search instances. For the less frequently accessed 50% of the data (approximately 165 GB), you introduce 2x ultrawarm1.medium.search instances as UltraWarm nodes. This tier offers a cost-effective solution for data that doesn't require the absolute fastest access times.
By tiering your data based on access patterns, you significantly reduce your hot tier footprint while introducing a small warm tier for less critical data. This strategy allows you to maintain high performance for frequent queries while optimizing costs for the entire system.
The hot tier continues to provide optimal performance for the majority of queries targeting frequently accessed data. For the warm tier, you see an increase in latency for queries on less frequently accessed data, but this is mitigated by effective caching on the UltraWarm nodes. Overall, the system maintains high availability and fault tolerance.
This balanced approach reduces your monthly cost to $5,350, with $3,350 for the hot tier and $350 for the warm tier, reducing the monthly costs by approximately 29% overall.
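The savings figure follows directly from the two monthly totals; a quick arithmetic check:

```python
# Verify the ~29% savings claimed for Scenario 1.
hot_only_monthly = 7550  # all-hot setup, USD per month
tiered_monthly = 5350    # hot + UltraWarm setup, USD per month

savings_pct = (hot_only_monthly - tiered_monthly) / hot_only_monthly * 100
print(f"{savings_pct:.0f}%")  # → 29%
```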
Scenario 2: Managing a growing vector database with access-based patterns
Imagine your system processes and indexes vast amounts of content (text, images, and videos), generating vector embeddings using the Lucene engine for advanced content recommendation and similarity search. As your content library grows, you've observed clear access patterns where newer or popular content is queried frequently, while older or less popular content sees decreased activity but still needs to be searchable.
To effectively use tiered storage in OpenSearch Service, consider organizing your data into separate indexes based on anticipated query patterns. This index-level organization is crucial because data migration between tiers happens at the index level, allowing you to move specific indexes to cost-effective storage tiers as their access patterns change.
Your current dataset consists of 150 GB of vector data, growing by 50 GB monthly as new content is added. The data access patterns show:
About 30% of your content receives 70% of the queries, typically newer or popular items
Another 30% sees moderate query volume
The remaining 40% is accessed infrequently but must remain searchable for completeness and occasional deep analysis
Given these characteristics, let's explore a single-tiered and a multi-tiered approach to managing this growing dataset efficiently.
Single-tiered configuration
For a single-tiered configuration, as the dataset expands, the vector data will grow to around 400 GB over 6 months, all stored in the hot (default) tier. With r6gd.8xlarge.search instances, the data instance count would be around 3 nodes.
The overall monthly cost for the domain under a single-tiered setup would be around $8,050, with a data instance cost of around $6,700.
Multi-tiered configuration
To optimize performance and cost, you implement a multi-tiered storage strategy using Index State Management (ISM) policies to automate the movement of indexes between tiers as access patterns evolve:
Hot tier – Stores frequently accessed indexes for fastest access
Warm tier – Houses moderately accessed indexes with higher latency
Cold tier – Archives rarely accessed indexes for cost-effective long-term retention
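An ISM policy along these lines can automate the hot-to-warm transition. The following is a minimal sketch, not a complete policy: the 30-day age threshold is an assumption for illustration, and while `warm_migration` is the ISM action OpenSearch Service provides for UltraWarm, you should verify cold-tier transitions against the ISM documentation for your version.

```python
# Sketch of an ISM policy body that moves an index to UltraWarm after 30 days.
# The age threshold and description are illustrative assumptions.
ism_policy = {
    "policy": {
        "description": "Move aging k-NN indexes from hot to warm storage",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                "transitions": [
                    # Transition to the warm state once the index is 30 days old.
                    {"state_name": "warm", "conditions": {"min_index_age": "30d"}}
                ],
            },
            {
                "name": "warm",
                # warm_migration is the ISM action for UltraWarm migration.
                "actions": [{"warm_migration": {}}],
                "transitions": [],
            },
        ],
    }
}
```

You would register this body via the ISM policies API and attach it to the indexes (or an index pattern) you want tiered automatically.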
For the data distribution, you start with a total of 150 GB and a monthly growth of 50 GB. The following is the projected data distribution when the data reaches 400 GB at around the 6-month mark:
Hot tier – Approximately 100 GB (most frequently queried content) on 1x r6gd.8xlarge
Warm tier – Approximately 100 GB (moderately accessed content) on 2x ultrawarm1.medium.search
Cold tier – Approximately 200 GB (rarely accessed content)
Under the multi-tiered setup, the cost for the vector data domain totals $3,880, including $2,330 worth of data nodes, $350 worth of UltraWarm nodes, and $5.00 of cold storage costs.
You see compute savings as the hot tier instance count is reduced by around 66% (from 3 nodes to 1). Your overall cost savings are around 50% year-over-year with multi-tiered domains.
Scenario 3: Large-scale disk-based vector search with UltraWarm
Let's consider a system managing 1 billion vectors of 768 dimensions distributed across 100 indexes of 10 million vectors each. The system predominantly uses disk-based vector search with 32x FAISS quantization for cost optimization, and about 70% of queries target 30% of the data, making it an ideal candidate for tiered storage.
Domain setup without UltraWarm support
In this approach, using disk-based vector search to handle the large-scale data, you deploy a cluster with 4x r6gd.4xlarge instances. This setup provides adequate storage capacity while optimizing memory usage through disk-based search.
The monthly cost for this setup comes to $6,500 per month, with a data instance cost of $4,470.
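Disk-based vector search with 32x compression is configured in the index mapping. A minimal sketch follows, where the field name is hypothetical; the `mode` and `compression_level` parameters are the ones OpenSearch introduced for disk-based vector search, but check your version's k-NN documentation for supported values.

```python
# Sketch: knn_vector mapping using disk-based search with 32x quantization.
# With on_disk mode, the engine defaults to FAISS, matching the scenario.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "mode": "on_disk",           # disk-based vector search
                "compression_level": "32x",  # 32x quantization
            }
        }
    },
}

print(index_body["mappings"]["properties"]["embedding"]["mode"])  # → on_disk
```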
Cost-saving strategy: UltraWarm domain setup
In this approach, you align your storage strategy with the observed query patterns, similar to Scenario 1.
You keep the frequently accessed 30% of the data in hot storage, using 1x r6gd.4xlarge instances. For the less frequently accessed 70% of the data, you use 2x ultrawarm1.medium.search instances.
You use disk-based vector search in both storage tiers to optimize memory usage. This balanced approach reduces your monthly cost to $3,270, with $1,120 for the hot tier and $400 for the warm tier, reducing the monthly costs by approximately 50% overall.
Get started with UltraWarm and Cold storage
To use k-NN indexes in the UltraWarm and Cold tiers, make sure your domain is running OpenSearch Service 2.17 or later. For instructions to migrate k-NN indexes across storage tiers, refer to UltraWarm storage for Amazon OpenSearch Service.
Consider the following best practices for multi-tiered vector search:
Analyze your query patterns to optimize data placement across tiers
Use Index State Management (ISM) policies to manage the data lifecycle across tiers transparently
Monitor cache hit rates using the k-NN stats and adjust tiering and node sizing as needed
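A cache hit rate can be derived from the k-NN stats output. The sketch below uses sample numbers rather than real API output, and the `hit_count`/`miss_count` field names should be verified against the k-NN stats response for your version:

```python
# Compute a cache hit rate from k-NN stats (sample numbers for illustration;
# fetch real values from the k-NN stats API on your domain).
sample_node_stats = {"hit_count": 9200, "miss_count": 800}

def cache_hit_rate(stats: dict) -> float:
    """Fraction of cache lookups served without a miss."""
    total = stats["hit_count"] + stats["miss_count"]
    return stats["hit_count"] / total if total else 0.0

print(f"{cache_hit_rate(sample_node_stats):.0%}")  # → 92%
```

A persistently low hit rate on UltraWarm nodes is a signal to resize the warm tier or move hot indexes back to hot storage.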
Summary
The introduction of k-NN vector search capabilities in the UltraWarm and Cold tiers for OpenSearch Service marks a significant step forward in providing cost-effective, scalable solutions for vector search workloads. This feature lets you balance performance and cost by keeping frequently accessed data in hot storage for the lowest latency, while moving less active data to UltraWarm for cost savings. While UltraWarm storage introduces some performance trade-offs and makes data immutable, these characteristics often align well with real-world access patterns, where older data sees fewer queries and updates.
We encourage you to evaluate your current vector search workloads and consider how this multi-tiered approach could benefit your use cases. As AI and machine learning continue to evolve, we remain committed to enhancing our services to meet your growing needs.
Stay tuned for future updates as we continue to innovate and expand the capabilities of vector search in OpenSearch Service.
About the Authors
Kunal Kotwani is a software engineer at Amazon Web Services, focusing on OpenSearch core and vector search technologies. His primary contributions include building storage optimization features for both local and remote storage systems that help customers run their search workloads more cost-effectively.
Navneet Verma is a senior software engineer at AWS OpenSearch. His primary interests include machine learning, search engines, and improving search relevancy. Outside of work, he enjoys playing badminton.
Sorabh Hamirwasia is a senior software engineer at AWS working on the OpenSearch Project. His primary interests include building cost-optimized and performant distributed systems.