Knn query
Finds the k nearest vectors to a query vector, as measured by a similarity metric. The knn query finds nearest vectors through approximate search on indexed dense_vector fields. The preferred way to do approximate kNN search is through the top-level knn section of a search request. The knn query is reserved for expert cases where there is a need to combine this query with other queries, or to perform a kNN search against a semantic_text field.
First, create an index with a dense_vector field:
PUT my-image-index
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 3,
        "index": true,
        "similarity": "l2_norm"
      },
      "file-type": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}
Index your data.
POST my-image-index/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "image-vector": [1, 5, -20], "file-type": "jpg", "title": "mountain lake" }
{ "index": { "_id": "2" } }
{ "image-vector": [42, 8, -15], "file-type": "png", "title": "frozen lake" }
{ "index": { "_id": "3" } }
{ "image-vector": [15, 11, 23], "file-type": "jpg", "title": "mountain lake lodge" }
Run the search using the knn query, asking for the top 10 nearest vectors from each shard, and then combining shard results to get the top 3 global results.
POST my-image-index/_search
{
  "size": 3,
  "query": {
    "knn": {
      "field": "image-vector",
      "query_vector": [-5, 9, -12],
      "k": 10
    }
  }
}
You can also provide a hex-encoded query vector string. Hex query vectors are byte-oriented (one byte per dimension, represented as two hex characters). For example, [-5, 9, -12] as signed bytes is fb09f4.
POST my-image-index/_search
{
  "size": 3,
  "query": {
    "knn": {
      "field": "image-vector",
      "query_vector": "fb09f4",
      "k": 10
    }
  }
}
You can also provide a base64-encoded query vector string. For example, [-5, 9, -12] encoded as float32 big-endian bytes is wKAAAEEQAADBQAAA.
POST my-image-index/_search
{
  "size": 3,
  "query": {
    "knn": {
      "field": "image-vector",
      "query_vector": "wKAAAEEQAADBQAAA",
      "k": 10
    }
  }
}
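The hex and base64 encodings shown above can be computed client-side. Here is a minimal Python sketch; the helper names are illustrative, not part of any Elasticsearch client:

```python
import base64
import struct

def hex_byte_vector(vec):
    # One signed byte per dimension, rendered as two hex characters each.
    return bytes(b & 0xFF for b in vec).hex()

def base64_float_vector(vec):
    # float32 values packed big-endian, then base64-encoded.
    return base64.b64encode(struct.pack(f">{len(vec)}f", *vec)).decode()

print(hex_byte_vector([-5, 9, -12]))      # fb09f4
print(base64_float_vector([-5, 9, -12]))  # wKAAAEEQAADBQAAA
```

For byte-valued vector fields the hex form is the most compact; the base64 form carries full float32 precision.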
field
- (Required, string) The name of the vector field to search against. Must be a dense_vector field with indexing enabled, or a semantic_text field with a compatible dense vector inference model.

query_vector
- (Optional, array of floats or string) Query vector. Must have the same number of dimensions as the vector field you are searching against. Must be one of:
  - An array of floats
  - A hex-encoded byte vector (one byte per dimension; for bit, one byte per 8 dimensions)
  - A base64-encoded vector string. Base64 supports float and bfloat16 (big-endian), byte, and bit encodings, depending on the target field type.

  Either this or query_vector_builder must be provided.
query_vector_builder
- (Optional, object) Query vector builder. A configuration object indicating how to build a query vector before executing the request. You must provide either query_vector_builder or query_vector, but not both. Refer to Query vector builder types for parameter details and Query vector builder examples for usage examples.

k
- (Optional, integer) The number of nearest neighbors to return from each shard. Elasticsearch collects k (or k * oversample if the conditions for rescore_vector are met) results from each shard, then merges them to find the global top k results. This value must be less than or equal to num_candidates. Defaults to the search request size.

num_candidates
- (Optional, integer) The number of nearest neighbor candidates to consider per shard while doing knn search. Cannot exceed 10,000. Increasing num_candidates tends to improve the accuracy of the final results. Defaults to 1.5 * k if k is set, or 1.5 * size if k is not set. When rescore_vector is applied, num_candidates is set to max(num_candidates, k * oversample).

visit_percentage
- (Optional, float) The percentage of vectors to explore per shard while doing knn search with bbq_disk. Must be between 0 and 100. A value of 0 defaults to deriving the percentage visited from num_candidates. Increasing visit_percentage tends to improve the accuracy of the final results. If visit_percentage is set for bbq_disk, num_candidates is ignored. Defaults to ~1% per shard for every 1 million vectors.

filter
- (Optional, query object) Query to filter the documents that can match. The kNN search will return the top documents that also match this filter. The value can be a single query or a list of queries. If filter is not provided, all documents are allowed to match.
The filter is a pre-filter, meaning that it is applied during the approximate kNN search to ensure that num_candidates matching documents are returned.
similarity
- (Optional, float) The minimum similarity required for a document to be considered a match. The calculated value relates to the raw similarity used, not the document score. The matched documents are then scored according to similarity, and the provided boost is applied.

boost
- (Optional, float) Floating point number used to multiply the scores of matched documents. This value cannot be negative. Defaults to 1.0.

_name
- (Optional, string) Name field to identify the query.
rescore_vector
- (Optional, object) Apply oversampling and rescoring to quantized vectors.

Parameters for rescore_vector:

oversample
- (Required, float) Applies the specified oversample factor to k on the approximate kNN search. The approximate kNN search will:
  - Retrieve num_candidates candidates per shard.
  - From these candidates, rescore the top k * oversample candidates per shard using the original vectors.
  - Return the top k rescored candidates.

  Must be one of the following values:
  - >= 1f to indicate the oversample factor
  - Exactly 0 to indicate that no oversampling and rescoring should occur

For more information, refer to oversampling and rescoring quantized vectors.
Note: Rescoring only makes sense for quantized vectors. The rescore_vector option will be ignored for non-quantized dense_vector fields, because the original vectors are used for scoring.
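The per-shard oversampling flow described above can be sketched in Python. This is purely illustrative of the candidate arithmetic, not Elasticsearch internals; the function and scoring callback are hypothetical:

```python
def rescored_top_k(candidates, k, oversample, exact_score):
    """Illustrative sketch: `candidates` arrive sorted by approximate score;
    rescore the top k * oversample of them exactly, then return the top k."""
    if oversample == 0:  # exactly 0 disables oversampling and rescoring
        return candidates[:k]
    shortlist = candidates[: int(k * oversample)]  # k * oversample per shard
    return sorted(shortlist, key=exact_score, reverse=True)[:k]

# Toy data: doc ids 0..9 in approximate-score order; the hypothetical
# "exact" score prefers even ids, breaking ties toward smaller ids.
top = rescored_top_k(list(range(10)), k=2, oversample=3.0,
                     exact_score=lambda d: (d % 2 == 0, -d))
print(top)  # [0, 2]
```

Note how docs 2 and 4 only get a chance at the top k because oversampling widened the rescored shortlist beyond k.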
There are two ways to filter documents that match a kNN query:
- pre-filtering – the filter is applied during the approximate kNN search to ensure that k matching documents are returned.
- post-filtering – the filter is applied after the approximate kNN search completes, which can result in fewer than k results, even when there are enough matching documents.
Pre-filtering is supported through the filter parameter of the knn query. Filters from aliases are also applied as pre-filters.
All other filters found in the Query DSL tree are applied as post-filters. For example, in the following request the knn query finds the top 3 documents with the nearest vectors (k=3), which are then combined with the term filter applied as a post-filter. The final set of documents contains only a single document that passes the post-filter.
POST my-image-index/_search
{
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "knn": {
          "field": "image-vector",
          "query_vector": [-5, 9, -12],
          "k": 3
        }
      },
      "filter": {
        "term": { "file-type": "png" }
      }
    }
  }
}
The knn query can be used as part of hybrid search, where it is combined with other lexical queries. For example, the query below finds documents with a title matching "mountain lake", and combines them with the top 10 documents that have the closest image vectors to the query_vector. The combined documents are then scored and the top 3 highest-scoring documents are returned.
POST my-image-index/_search
{
  "size": 3,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": {
              "query": "mountain lake",
              "boost": 1
            }
          }
        },
        {
          "knn": {
            "field": "image-vector",
            "query_vector": [-5, 9, -12],
            "k": 10,
            "boost": 2
          }
        }
      ]
    }
  }
}
The knn query can be used inside a nested query. The behaviour here is similar to the top-level nested kNN search:
- kNN search over nested dense_vectors diversifies the top results over the top-level document
- filter, both over the top-level document metadata and nested metadata, is supported and acts as a pre-filter

To ensure correct results, each individual filter must be over either:
- top-level metadata
- nested metadata

Note: A single knn query supports multiple filters, where some filters can be over the top-level metadata and some over nested metadata.
This query performs a basic nested knn search:
{
  "query": {
    "nested": {
      "path": "paragraph",
      "query": {
        "knn": {
          "query_vector": [0.45, 0.50],
          "field": "paragraph.vector"
        }
      }
    }
  }
}
This query filters over nested metadata. For scoring parent documents, this query only considers vectors that have "paragraph.language" set to "EN":
{
  "query": {
    "nested": {
      "path": "paragraph",
      "query": {
        "knn": {
          "query_vector": [0.45, 0.50],
          "field": "paragraph.vector",
          "filter": {
            "match": {
              "paragraph.language": "EN"
            }
          }
        }
      }
    }
  }
}
This query uses multiple filters: one over nested metadata and another over the top-level metadata. For scoring parent documents, this query only considers vectors whose parent's title contains the word "essay" and that have "paragraph.language" set to "EN":
{
  "query": {
    "nested": {
      "path": "paragraph",
      "query": {
        "knn": {
          "query_vector": [0.45, 0.50],
          "field": "paragraph.vector",
          "filter": [
            {
              "match": {
                "paragraph.language": "EN"
              }
            },
            {
              "match": {
                "title": "essay"
              }
            }
          ]
        }
      }
    }
  }
}
Note that nested knn only supports score_mode=max.
Elasticsearch supports knn queries over a
semantic_text field.
Here is an example using the query_vector_builder:
{
  "query": {
    "knn": {
      "field": "inference_field",
      "k": 10,
      "num_candidates": 100,
      "query_vector_builder": {
        "text_embedding": {
          "model_text": "test"
        }
      }
    }
  }
}
Note that for semantic_text fields, the model_id does not have to be
provided as it can be inferred from the semantic_text field mapping.
Knn search using query vectors over semantic_text fields is also supported,
with no change to the API.
Query vector builders let you generate vectors directly from inputs such as text or base64-encoded images at search time.
Elasticsearch provides three query vector builders. Each builder generates a query vector from a different type of input or source.
- text_embedding: Generates a query vector from text input. This is useful when your application sends raw text, such as a search query, and you want Elasticsearch to convert it into an embedding automatically instead of generating the vector in advance.
- embedding: Generates a query vector from multimodal input, such as text or base64-encoded images. Use this when you want to generate embeddings dynamically from different types of input without creating them in advance.
- lookup: Retrieves an existing vector from a stored document to use as the query vector. This is useful when you want to find documents similar to an existing document, without generating a new embedding at search time.
Refer to Query vector builder types for parameter details and Query vector builder examples for usage examples.
lookup
- Build the query vector by looking up an existing document's vector. For an example, refer to lookup.

  Parameters for lookup:
  - id: (Required, string) The ID of the document to look up.
  - path: (Required, string) The name of the vector field in the document to use as the query vector.
  - index: (Required, string) The name of the index containing the document to look up.
  - routing: (Optional, string) The routing value to use when looking up the document.
text_embedding
- Build the query vector by generating an embedding from input text. For an example, refer to text_embedding.

  Parameters for text_embedding:
  - model_id: (Optional, string) Identifier of the text embedding model that generates the query vector. Use the same model that produced the vectors in your index. When you query only semantic_text fields, you can omit model_id because Elasticsearch uses the inference_id from the semantic_text field mapping (for example, the search-time inference endpoint configured on the field). For dense_vector fields, or when you need a different model than the one mapped on semantic_text, set model_id explicitly.
  - model_text: (Required, string) The query text passed to the model to produce the embedding.

  For an example request, refer to text_embedding. For a broader overview of semantic kNN search, refer to Perform semantic search.
embedding
- Build the query vector by generating an embedding from text or a base64-encoded image. This enables multimodal search, where different types of input can be used to generate a vector and retrieve similar documents. For an example, refer to embedding.

  Parameters for embedding:
  - inference_id: (Required, string) The ID of the inference service used to generate the embedding. Must reference an inference service configured with the embedding task type.
  - input: (Required, string, object, or array) The input used to generate the query vector. You can provide the input in one of the following formats:
    - Single object: A single multimodal input (text or image). For an example, refer to single input object.
    - Array of objects: Multiple inputs. These are combined into a single embedding. For an example, refer to multiple inputs.
    - String: A single text input. Equivalent to value when providing an object with "type": "text". For an example, refer to string input.

  Parameters for input (when input is an object or an array of objects):
  - type: (Required, string) The type of the input. For text input, use text. For image input, use image.
  - format: (Optional, string) The format of the input value. If omitted, the default for the type is used. For text input, specify text or omit (defaults to text). For image input, specify base64 or omit (defaults to base64).
  - value: (Required, string) The value to generate the embedding from. For text input, a text string. For image input, must be a data URI for a base64-encoded image.
  - timeout: (Optional, time value) Maximum time to wait for the embedding inference request to complete. Defaults to 30s when omitted.
Use the lookup query vector builder to retrieve an existing vector from a stored document and use it as the query vector.
{
  "knn": {
    "field": "dense-vector-field",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "lookup": {
        "index": "my-index",
        "id": "document-1",
        "path": "my_vector"
      }
    }
  }
}
- The name of the index containing the document to look up.
- The ID of the document to look up.
- The name of the vector field in the document to use as the query vector.
Use the text_embedding query vector builder to generate a query vector from text input.
POST my-index/_search
{
  "knn": {
    "field": "dense-vector-field",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "my-text-embedding-model",
        "model_text": "The opposite of blue"
      }
    }
  }
}
- The ID of the text embedding model or deployment in Elasticsearch that generates the query vector. Use the same model that produced the vectors in your index. When you query only semantic_text fields, you can omit model_id because Elasticsearch uses the inference_id from the semantic_text field mapping (for example, the search-time inference endpoint configured on the field).
- The query text passed to the model to produce the embedding.
Use the embedding query vector builder to generate a query vector from multimodal input.
This builder supports both text and base64-encoded image inputs. You can also use multiple inputs to generate a query vector, enabling multimodal search scenarios such as searching with both text and an image.
POST my-index/_search
{
  "knn": {
    "field": "dense-vector-field",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "embedding": {
        "inference_id": "my-embedding-endpoint",
        "input": {
          "type": "image",
          "format": "base64",
          "value": "data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA\nAAAAFCAIAAAACDbGyAAAAHElEQVQI12P4\n//8/w38GIAXDIBKE0DHxgljNBAAO\n9TXL0Y4OHwAAAABJRU5ErkJggg=="
        }
      }
    }
  }
}
- The ID of the inference endpoint used to generate the embedding. This must reference an inference service configured with the embedding task type.
- The type of the input. Valid values are text and image.
- The format of the input value. Defaults to text for text input and base64 for image input.
- The value used to generate the embedding. For image input, this must be a data URI for a base64-encoded image. The example above is a small sample base64 string; in real usage, this would be a much longer string generated from an actual image file.
POST my-index/_search
{
  "knn": {
    "field": "dense-vector-field",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "embedding": {
        "inference_id": "my-embedding-endpoint",
        "input": [
          {
            "type": "text",
            "value": "red shoes"
          },
          {
            "type": "image",
            "format": "base64",
            "value": "data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA\nAAAAFCAIAAAACDbGyAAAAHElEQVQI12P4\n//8/w38GIAXDIBKE0DHxgljNBAAO\n9TXL0Y4OHwAAAABJRU5ErkJggg=="
          }
        ]
      }
    }
  }
}
- The ID of the inference endpoint used to generate the embedding. This must reference an inference service configured with the embedding task type.
- The format of the input value. Defaults to base64 for image input.
- The value used to generate the embedding. For image input, this must be a data URI for a base64-encoded image. The example above is a small sample base64 string; in real usage, this would be a much longer string generated from an actual image file.
POST my-index/_search
{
  "knn": {
    "field": "dense-vector-field",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "embedding": {
        "inference_id": "my-embedding-endpoint",
        "input": "red shoes"
      }
    }
  }
}
- The ID of the inference endpoint used to generate the embedding. This must reference an inference service configured with the embedding task type.
- A shorthand for a single text input. Equivalent to:
{
  "type": "text",
  "value": "red shoes"
}
The knn query calculates aggregations on the top k documents from each shard. Thus, the final results from aggregations contain k * number_of_shards documents. This is different from the top-level knn section, where aggregations are calculated on the global top k nearest documents.
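A quick illustration of the difference in how many documents feed the aggregations (the values for k and the shard count are assumed, not from a live cluster):

```python
# Assumed values for illustration.
k = 10
number_of_shards = 3

# knn query: aggregations see the top k docs from *each* shard.
docs_in_aggs_knn_query = k * number_of_shards

# top-level knn section: aggregations see only the *global* top k docs.
docs_in_aggs_top_level_knn = k

print(docs_in_aggs_knn_query, docs_in_aggs_top_level_knn)  # 30 10
```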