This version is still in development and is not yet considered stable. For the latest stable version, use Spring AI 1.1.3.

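Vector Databases

A vector database is a specialized type of database that plays an essential role in AI applications.

In vector databases, queries differ from those in traditional relational databases. Instead of exact matches, they perform similarity searches. Given a vector as a query, a vector database returns vectors that are "similar" to the query vector. For a high-level explanation of how this similarity is calculated, see Vector Similarity.

Vector databases are used to integrate your data with AI models. The first step is to load your data into a vector database. Then, when a user query is to be sent to the AI model, a set of similar documents is retrieved first. These documents serve as context for the user's question and are sent to the AI model along with the user's query. This technique is known as Retrieval-Augmented Generation (RAG).

The following sections describe the Spring AI interface for using multiple vector database implementations, along with some high-level sample usage.

The last section is intended to demystify the underlying approach of similarity searching in vector databases.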

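API Overview

This section serves as a guide to the VectorStore interface and its associated classes within the Spring AI framework.

Spring AI offers an abstracted API for interacting with vector databases through the VectorStore interface and its read-only counterpart, the VectorStoreRetriever interface.

VectorStoreRetriever Interface

Spring AI provides a read-only interface called VectorStoreRetriever that exposes only the document retrieval functionality: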

@FunctionalInterface
public interface VectorStoreRetriever {

    List<Document> similaritySearch(SearchRequest request);

    default List<Document> similaritySearch(String query) {
        return this.similaritySearch(SearchRequest.builder().query(query).build());
    }
}

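This functional interface is designed for use cases where you only need to retrieve documents from a vector store without performing any mutation operations. It follows the principle of least privilege by exposing only the functionality needed for document retrieval.

VectorStore Interface

The VectorStore interface extends VectorStoreRetriever and adds mutation capabilities: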

public interface VectorStore extends DocumentWriter, VectorStoreRetriever {

    default String getName() {
        return this.getClass().getSimpleName();
    }

    void add(List<Document> documents);

    void delete(List<String> idList);

    void delete(Filter.Expression filterExpression);

    default void delete(String filterExpression) { ... }

    default <T> Optional<T> getNativeClient() {
        return Optional.empty();
    }
}

The VectorStore interface combines both read and write operations, allowing you to add, delete, and search for documents in a vector database.

SearchRequest Builder

public class SearchRequest {

	public static final double SIMILARITY_THRESHOLD_ACCEPT_ALL = 0.0;

	public static final int DEFAULT_TOP_K = 4;

	private String query = "";

	private int topK = DEFAULT_TOP_K;

	private double similarityThreshold = SIMILARITY_THRESHOLD_ACCEPT_ALL;

	@Nullable
	private Filter.Expression filterExpression;

    public static Builder from(SearchRequest originalSearchRequest) {
		return builder().query(originalSearchRequest.getQuery())
			.topK(originalSearchRequest.getTopK())
			.similarityThreshold(originalSearchRequest.getSimilarityThreshold())
			.filterExpression(originalSearchRequest.getFilterExpression());
	}

	public static class Builder {

		private final SearchRequest searchRequest = new SearchRequest();

		public Builder query(String query) {
			Assert.notNull(query, "Query can not be null.");
			this.searchRequest.query = query;
			return this;
		}

		public Builder topK(int topK) {
			Assert.isTrue(topK >= 0, "TopK should be positive.");
			this.searchRequest.topK = topK;
			return this;
		}

		public Builder similarityThreshold(double threshold) {
			Assert.isTrue(threshold >= 0 && threshold <= 1, "Similarity threshold must be in [0,1] range.");
			this.searchRequest.similarityThreshold = threshold;
			return this;
		}

		public Builder similarityThresholdAll() {
			this.searchRequest.similarityThreshold = 0.0;
			return this;
		}

		public Builder filterExpression(@Nullable Filter.Expression expression) {
			this.searchRequest.filterExpression = expression;
			return this;
		}

		public Builder filterExpression(@Nullable String textExpression) {
			this.searchRequest.filterExpression = (textExpression != null)
					? new FilterExpressionTextParser().parse(textExpression) : null;
			return this;
		}

		public SearchRequest build() {
			return this.searchRequest;
		}

	}

	public String getQuery() {...}
	public int getTopK() {...}
	public double getSimilarityThreshold() {...}
	public Filter.Expression getFilterExpression() {...}
}

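To insert data into a vector database, wrap it in a Document object. The Document class encapsulates content from a data source, such as a PDF or Word document, and includes text represented as a String. It also contains metadata in the form of key-value pairs, including details such as the filename.

Upon insertion into the vector database, the text content is transformed by an embedding model into a numerical array, a float[], known as a vector embedding. Embedding models such as Word2Vec, GloVe, and BERT, or OpenAI's text-embedding-ada-002, convert words, sentences, or paragraphs into these vector embeddings.

The vector database's role is to store these embeddings and facilitate similarity searches over them. It does not generate the embeddings itself. To create vector embeddings, use an EmbeddingModel.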
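To make the idea of comparing embeddings concrete, here is a minimal sketch of cosine similarity between two float[] vectors, one common similarity measure. The class and method names (CosineSimilaritySketch, cosine) are illustrative and not part of Spring AI; a real vector store computes similarity inside the database, and the measure it uses depends on how the store is configured.

```java
class CosineSimilaritySketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). The result lies in [-1, 1].
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {1.0f, 0.0f};
        float[] sameDirection = {2.0f, 0.0f}; // same direction, different length
        float[] orthogonal = {0.0f, 3.0f};    // unrelated direction
        System.out.println(cosine(query, sameDirection)); // 1.0
        System.out.println(cosine(query, orthogonal));    // 0.0
    }
}
```

Because cosine similarity ignores vector length, the first pair scores 1.0 even though the magnitudes differ.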

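The similaritySearch methods in the interface retrieve documents similar to a given query string. These methods can be fine-tuned with the following parameters:

topK: An integer that specifies the maximum number of similar documents to return. This is often referred to as a "top K" search, or "K nearest neighbors" (KNN). The default is 4 (DEFAULT_TOP_K).
similarityThreshold: A double value from 0 to 1, where values closer to 1 indicate higher similarity. The default, 0.0, accepts all results. If you set a threshold of 0.75, for instance, only documents with a similarity at or above this value are returned.
Filter.Expression: A class used for passing a fluent DSL (Domain-Specific Language) expression that functions similarly to a "where" clause in SQL, but applies exclusively to the metadata key-value pairs of a Document.
filterExpression: An external DSL based on ANTLR4 that accepts filter expressions as strings. For example, with metadata keys such as country, year, and isActive, you could use an expression such as: country == 'UK' && year >= 2020 && isActive == true

For more information on Filter.Expression, see the Metadata Filters section.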
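The interaction between the top-K limit and the similarity threshold can be sketched in plain Java. This is only an illustration under assumed names (TopKSketch, Scored, select are hypothetical and not Spring AI types); actual vector stores apply these constraints inside the database query.

```java
import java.util.Comparator;
import java.util.List;

class TopKSketch {

    // Hypothetical stand-in for a document paired with its similarity score.
    record Scored(String id, double score) {}

    // Keep results with score >= threshold, best first, and at most topK of them.
    static List<Scored> select(List<Scored> candidates, int topK, double threshold) {
        return candidates.stream()
            .filter(s -> s.score() >= threshold)
            .sorted(Comparator.comparingDouble(Scored::score).reversed())
            .limit(topK)
            .toList();
    }

    public static void main(String[] args) {
        List<Scored> candidates = List.of(
            new Scored("a", 0.91), new Scored("b", 0.42),
            new Scored("c", 0.78), new Scored("d", 0.80));
        // The 0.75 threshold drops "b"; topK = 2 then keeps "a" and "d".
        System.out.println(select(candidates, 2, 0.75));
    }
}
```

The threshold removes weak matches regardless of K, while K caps how many of the remaining matches are returned.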

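Schema Initialization

Some vector stores require their backend schema to be initialized before use. It is not initialized for you by default. You must opt in, either by passing a boolean for the appropriate constructor argument or, if using Spring Boot, by setting the appropriate initialize-schema property to true in application.properties or application.yml. Check the documentation for the vector store you are using for the specific property name.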
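As an illustration, with the PgVector store's Spring Boot starter the opt-in would look like the following application.properties entry. The property prefix differs per vector store, so treat this as an example for one store rather than a general rule, and confirm the name in that store's documentation.

```properties
spring.ai.vectorstore.pgvector.initialize-schema=true
```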

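Batching Strategy

When working with vector stores, it is often necessary to embed large numbers of documents. While it might seem simple to embed all documents in a single call, this approach can cause problems. Embedding models process text as tokens and have a maximum token limit, often referred to as the context window size. This limit restricts the amount of text that a single embedding request can process. Attempting to embed too many tokens in one call can result in errors or truncated embeddings.

To address this token limit, Spring AI implements a batching strategy. This approach splits large sets of documents into smaller batches that fit within the embedding model's maximum context window. Batching not only solves the token limit problem but can also improve performance and make more efficient use of API rate limits.

Spring AI provides this functionality through the BatchingStrategy interface, which allows documents to be processed in sub-batches based on their token counts.

The core BatchingStrategy interface is defined as follows: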

public interface BatchingStrategy {
    List<List<Document>> batch(List<Document> documents);
}

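This interface defines a single method, batch, which takes a list of documents and returns a list of document batches.

Default Implementation

Spring AI provides a default implementation called TokenCountBatchingStrategy. This strategy batches documents based on their token counts, ensuring that each batch does not exceed a calculated maximum input token count.

Key features of TokenCountBatchingStrategy:

Uses OpenAI's maximum input token count (8191) as the default upper limit.
Includes a reserve percentage (default 10%) to provide a buffer for potential overhead.
Calculates the actual maximum input token count as: actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)

The strategy estimates the token count of each document, groups documents into batches that do not exceed the maximum input token count, and throws an exception if a single document exceeds this limit.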
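The arithmetic and grouping described above can be sketched in plain Java. This is not the Spring AI source: documents are reduced to their token counts, the names (TokenBatchingSketch, effectiveLimit, batch) are hypothetical, and rounding the effective limit to the nearest integer is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class TokenBatchingSketch {

    // Effective limit after the reserve is set aside (rounding is an assumption of this sketch).
    static int effectiveLimit(int maxInputTokens, double reservePercentage) {
        return (int) Math.round(maxInputTokens * (1 - reservePercentage));
    }

    // Greedily group per-document token counts so that each batch stays within the limit.
    static List<List<Integer>> batch(List<Integer> tokenCounts, int limit) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentTokens = 0;
        for (int tokens : tokenCounts) {
            if (tokens > limit) {
                // A single oversized document cannot fit in any batch.
                throw new IllegalArgumentException("Document exceeds the token limit: " + tokens);
            }
            if (currentTokens + tokens > limit) {
                // The current batch is full; start a new one.
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(tokens);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        int limit = effectiveLimit(8191, 0.1); // 8191 * 0.9 = 7371.9, rounded to 7372
        System.out.println(limit);
        System.out.println(batch(List.of(3000, 3000, 3000, 500), limit)); // [[3000, 3000], [3000, 500]]
    }
}
```

With the default values, a batch can hold at most 7372 estimated tokens, so the third 3000-token document starts a second batch.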

You can also customize TokenCountBatchingStrategy to better suit your specific requirements. To do so, create a new instance with custom parameters in a Spring Boot @Configuration class.

The following example shows how to create a custom TokenCountBatchingStrategy bean:

@Configuration
public class EmbeddingConfig {
    @Bean
    public BatchingStrategy customTokenCountBatchingStrategy() {
        return new TokenCountBatchingStrategy(
            EncodingType.CL100K_BASE,  // Specify the encoding type
            8000,                      // Set the maximum input token count
            0.1                        // Set the reserve percentage
        );
    }
}

In this configuration:

EncodingType.CL100K_BASE: Specifies the encoding type used for tokenization. This encoding type is used by the JTokkitTokenCountEstimator to estimate token counts accurately.
8000: Sets the maximum input token count. This value should be less than or equal to the maximum context window size of your embedding model.
0.1: Sets the reserve percentage, the fraction of the maximum input token count that is held back. This creates a buffer for potential token count increases during processing.

By default, this constructor uses Document.DEFAULT_CONTENT_FORMATTER for content formatting and MetadataMode.NONE for metadata handling. If you need to customize these parameters, use the full constructor with the additional parameters.

Once defined, this custom TokenCountBatchingStrategy bean is automatically used by the EmbeddingModel implementations in your application, replacing the default strategy.

The TokenCountBatchingStrategy internally uses a TokenCountEstimator (specifically, JTokkitTokenCountEstimator) to calculate token counts for efficient batching. This ensures accurate token estimation based on the specified encoding type.

Additionally, TokenCountBatchingStrategy lets you pass in your own implementation of the TokenCountEstimator interface, so you can use a token counting strategy tailored to your needs. For example:

TokenCountEstimator customEstimator = new YourCustomTokenCountEstimator();
TokenCountBatchingStrategy strategy = new TokenCountBatchingStrategy(
    customEstimator,
    8000,  // maxInputTokenCount
    0.1,   // reservePercentage
    Document.DEFAULT_CONTENT_FORMATTER,
    MetadataMode.NONE
);

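Working with Auto-Truncation

Some embedding models, such as Vertex AI text embedding, support an auto_truncate feature. When enabled, the model silently truncates text inputs that exceed the maximum size and continues processing; when disabled, it throws an explicit error for inputs that are too large.

When using auto-truncation with the batching strategy, you must configure the batching strategy with an input token count much higher than the model's actual maximum. This prevents the batching strategy from throwing an exception for large documents and lets the embedding model handle truncation internally.

Configuration for Auto-Truncation

The following example configures Vertex AI with auto-truncation enabled together with a custom BatchingStrategy, and then uses both in a PgVectorStore: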

@Configuration
public class AutoTruncationEmbeddingConfig {

    @Bean
    public VertexAiTextEmbeddingModel vertexAiEmbeddingModel(
            VertexAiEmbeddingConnectionDetails connectionDetails) {

        VertexAiTextEmbeddingOptions options = VertexAiTextEmbeddingOptions.builder()
                .model(VertexAiTextEmbeddingOptions.DEFAULT_MODEL_NAME)
                .autoTruncate(true)  // Enable auto-truncation
                .build();

        return new VertexAiTextEmbeddingModel(connectionDetails, options);
    }

    @Bean
    public BatchingStrategy batchingStrategy() {
        // Only use a high token limit if auto-truncation is enabled in your embedding model.
        // Set a much higher token count than the model actually supports
        // (e.g., 132,900 when Vertex AI supports only up to 20,000)
        return new TokenCountBatchingStrategy(
                EncodingType.CL100K_BASE,
                132900,  // Artificially high limit
                0.1      // 10% reserve
        );
    }

    @Bean
    public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel, BatchingStrategy batchingStrategy) {
        return PgVectorStore.builder(jdbcTemplate, embeddingModel)
            .batchingStrategy(batchingStrategy)  // pass the custom strategy to the store
            // other properties omitted here
            .build();
    }
}

In this configuration:

The embedding model has auto-truncation enabled, which lets it handle oversized inputs gracefully.
The batching strategy uses an artificially high token limit (132,900) that is much larger than the actual model limit (20,000).
The vector store uses the configured embedding model and the custom BatchingStrategy bean.

Why This Works

This approach works because:

The TokenCountBatchingStrategy checks whether any single document exceeds the configured maximum and throws an IllegalArgumentException if it does.
By setting a very high limit in the batching strategy, we ensure that this check does not fail for realistically sized documents.
Documents or batches that exceed the model's limit are silently truncated by the embedding model's auto-truncation feature.
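A short worked example with the numbers from the configuration above shows the effect of the raised limit. The 25,000-token document is hypothetical, the class and method names are illustrative, and rounding the effective limit to the nearest integer is an assumption of this sketch.

```java
class AutoTruncateLimitSketch {

    // Effective per-batch limit after the reserve is set aside (rounding is an assumption).
    static int effectiveLimit(int maxInputTokens, double reservePercentage) {
        return (int) Math.round(maxInputTokens * (1 - reservePercentage));
    }

    // Mirrors the single-document check described above: true means the strategy would throw.
    static boolean wouldReject(int documentTokens, int maxInputTokens, double reservePercentage) {
        return documentTokens > effectiveLimit(maxInputTokens, reservePercentage);
    }

    public static void main(String[] args) {
        int documentTokens = 25_000; // hypothetical oversized document
        // Limit set to the model's real maximum: 20,000 * 0.9 = 18,000, so the document is rejected.
        System.out.println(wouldReject(documentTokens, 20_000, 0.1));  // true
        // Artificially high limit: 132,900 * 0.9 = 119,610, so the document passes to the model.
        System.out.println(wouldReject(documentTokens, 132_900, 0.1)); // false
    }
}
```

In the second case the document reaches the embedding model, which then truncates it if auto-truncation is enabled.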

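Best Practices

When using auto-truncation:

Set the batching strategy's maximum input token count to at least 5 to 10 times the model's actual limit, so that the batching strategy does not throw exceptions prematurely.
Monitor your logs for truncation warnings from the embedding model (note: not all models log truncation events).
Consider the effect of silent truncation on the quality of your embeddings.
Test with sample documents to confirm that truncated embeddings still meet your requirements.
Document this configuration for future maintainers, because it is non-standard.

While auto-truncation prevents errors, it can result in incomplete embeddings. Important information at the end of long documents may be lost. If your application requires all content to be embedded, split documents into smaller chunks before embedding.

Spring Boot Auto-Configuration

If you are using Spring Boot auto-configuration, you must provide a custom BatchingStrategy bean to override the default one that comes with Spring AI: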

@Bean
public BatchingStrategy customBatchingStrategy() {
    // This bean will override the default BatchingStrategy
    return new TokenCountBatchingStrategy(
            EncodingType.CL100K_BASE,
            132900,  // Much higher than model's actual limit
            0.1
    );
}

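When present in your application context, this bean automatically replaces the default batching strategy used by all vector stores.

Custom Implementation

While TokenCountBatchingStrategy provides a robust default implementation, you can customize the batching strategy to fit your specific needs. This can be done through Spring Boot's auto-configuration.

To customize the batching strategy, define a BatchingStrategy bean in your Spring Boot application: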

@Configuration
public class EmbeddingConfig {
    @Bean
    public BatchingStrategy customBatchingStrategy() {
        return new CustomBatchingStrategy();
    }
}

This custom BatchingStrategy is then automatically used by the EmbeddingModel implementations in your application.
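The CustomBatchingStrategy class above is left undefined, so here is one possible shape for such a strategy: fixed-size batching. To keep the sketch compilable on its own, it uses a hypothetical generic stand-in interface (Batching) with the same method shape as BatchingStrategy.batch; a real implementation would implement BatchingStrategy over Document instead.

```java
import java.util.ArrayList;
import java.util.List;

class FixedSizeBatchingSketch {

    // Stand-in with the same shape as BatchingStrategy.batch, generic so the sketch compiles alone.
    interface Batching<T> {
        List<List<T>> batch(List<T> items);
    }

    // Hypothetical strategy: split the input into consecutive batches of at most batchSize items.
    static <T> Batching<T> fixedSize(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return items -> {
            List<List<T>> batches = new ArrayList<>();
            for (int start = 0; start < items.size(); start += batchSize) {
                int end = Math.min(start + batchSize, items.size());
                batches.add(new ArrayList<>(items.subList(start, end)));
            }
            return batches;
        };
    }

    public static void main(String[] args) {
        Batching<String> strategy = fixedSize(2);
        System.out.println(strategy.batch(List.of("d1", "d2", "d3", "d4", "d5"))); // [[d1, d2], [d3, d4], [d5]]
    }
}
```

A fixed-size strategy ignores token counts, so it is only reasonable when documents are known to be small relative to the model's context window.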

Vector stores supported by Spring AI are configured to use the default TokenCountBatchingStrategy. The SAP Hana vector store is not currently configured for batching.

Vector Store Implementations

These are the available implementations of the VectorStore interface:

Azure Vector Search - The Azure vector store.
Apache Cassandra - The Apache Cassandra vector store.
Chroma Vector Store - The Chroma vector store.
Elasticsearch Vector Store - The Elasticsearch vector store.
GemFire Vector Store - The GemFire vector store.
MariaDB Vector Store - The MariaDB vector store.
Milvus Vector Store - The Milvus vector store.
MongoDB Atlas Vector Store - The MongoDB Atlas vector store.
Neo4j Vector Store - The Neo4j vector store.
OpenSearch Vector Store - The OpenSearch vector store.
Oracle Vector Store - The Oracle Database vector store.
PgVector Store - The PostgreSQL/PGVector vector store.
Pinecone Vector Store - The Pinecone vector store.
Qdrant Vector Store - The Qdrant vector store.
Redis Vector Store - The Redis vector store.
SAP Hana Vector Store - The SAP HANA vector store.
Typesense Vector Store - The Typesense vector store.
Weaviate Vector Store - The Weaviate vector store.
S3 Vector Store - The AWS S3 vector store.
SimpleVectorStore - A simple implementation of persistent vector storage, good for educational purposes.

More implementations may be supported in future releases.

If you have a vector database that needs to be supported by Spring AI, open an issue on GitHub or, even better, submit a pull request with an implementation.

Information on each of the VectorStore implementations can be found in the subsections of this chapter.

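Example Usage

To compute the embeddings for a vector database, you need to pick an embedding model that matches the higher-level AI model being used.

For example, with OpenAI's ChatGPT, we use the OpenAiEmbeddingModel and a model named text-embedding-ada-002.

The Spring Boot starter's auto-configuration for OpenAI makes an EmbeddingModel implementation available in the Spring application context for dependency injection.

Writing to a Vector Store

Loading data into a vector store is typically done in a batch-like job: first load the data into Spring AI's Document class, then call the add method on the VectorStore interface.

Given a String reference to a source file, a JSON file containing the data we want to load into the vector database, we use Spring AI's JsonReader to load specific fields from the JSON, split them into small pieces, and pass those pieces to the vector store implementation. The VectorStore implementation computes the embeddings and stores the JSON and the embeddings in the vector database: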

@Autowired
VectorStore vectorStore;

void load(String sourceFile) {
    JsonReader jsonReader = new JsonReader(new FileSystemResource(sourceFile),
            "price", "name", "shortDescription", "description", "tags");
    List<Document> documents = jsonReader.get();
    this.vectorStore.add(documents);
}

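Reading from a Vector Store

Later, when a user question is passed to the AI model, a similarity search retrieves similar documents, which are then "stuffed" into the prompt as context for the user's question.

For read-only operations, you can use either the VectorStore interface or the more focused VectorStoreRetriever interface: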

@Autowired
VectorStoreRetriever retriever; // Could also use VectorStore here

String question = "<question from user>";
List<Document> similarDocuments = retriever.similaritySearch(question);

// Or with more specific search parameters
SearchRequest request = SearchRequest.builder()
    .query(question)
    .topK(5)                       // Return top 5 results
    .similarityThreshold(0.7)      // Only return results with similarity score >= 0.7
    .build();

List<Document> filteredDocuments = retriever.similaritySearch(request);

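Additional options can be passed to the similaritySearch method to define how many documents to retrieve and the threshold of the similarity search.

Separation of Read and Write Operations

Using separate interfaces lets you clearly define which components need write access and which only need read access: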

// Write operations in a service that needs full access
@Service
class DocumentIndexer {
    private final VectorStore vectorStore;

    DocumentIndexer(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public void indexDocuments(List<Document> documents) {
        vectorStore.add(documents);
    }
}

// Read-only operations in a service that only needs retrieval
@Service
class DocumentRetriever {
    private final VectorStoreRetriever retriever;

    DocumentRetriever(VectorStoreRetriever retriever) {
        this.retriever = retriever;
    }

    public List<Document> findSimilar(String query) {
        return retriever.similaritySearch(query);
    }
}

This separation of concerns helps create more maintainable and secure applications by limiting access to mutation operations to the components that actually need them.

Retrieval Operations with VectorStoreRetriever

The VectorStoreRetriever interface provides a read-only view of a vector store, exposing only the similarity search functionality. This follows the principle of least privilege and is particularly useful in RAG (Retrieval-Augmented Generation) applications where you only need to retrieve documents without modifying the underlying data.

Benefits of Using VectorStoreRetriever

Separation of concerns: Clearly separates read operations from write operations.
Interface segregation: Clients that only need retrieval functionality are not exposed to mutation methods.
Functional interface: Can be implemented with a lambda expression or method reference for simple use cases.
Reduced dependencies: Components that only need to perform searches do not need to depend on the full VectorStore interface.

Example Usage

You can use VectorStoreRetriever directly when you only need to perform similarity searches:

@Service
public class DocumentRetrievalService {

    private final VectorStoreRetriever retriever;

    public DocumentRetrievalService(VectorStoreRetriever retriever) {
        this.retriever = retriever;
    }

    public List<Document> findSimilarDocuments(String query) {
        return retriever.similaritySearch(query);
    }

    public List<Document> findSimilarDocumentsWithFilters(String query, String country) {
        SearchRequest request = SearchRequest.builder()
            .query(query)
            .topK(5)
            .filterExpression("country == '" + country + "'")
            .build();

        return retriever.similaritySearch(request);
    }
}

In this example, the service depends only on the VectorStoreRetriever interface, making it clear that it performs retrieval only and does not modify the vector store.

Integration with RAG Applications

The VectorStoreRetriever interface is particularly useful in RAG applications, where you need to retrieve relevant documents to provide context for an AI model:

@Service
public class RagService {

    private final VectorStoreRetriever retriever;
    private final ChatModel chatModel;

    public RagService(VectorStoreRetriever retriever, ChatModel chatModel) {
        this.retriever = retriever;
        this.chatModel = chatModel;
    }

    public String generateResponse(String userQuery) {
        // Retrieve relevant documents
        List<Document> relevantDocs = retriever.similaritySearch(userQuery);

        // Extract content from documents to use as context
        String context = relevantDocs.stream()
            .map(Document::getText)
            .collect(Collectors.joining("\n\n"));

        // Generate response using the retrieved context
        String prompt = "Context information:\n" + context + "\n\nUser query: " + userQuery;
        return chatModel.call(prompt);
    }
}

This pattern allows a clean separation between the retrieval and generation components of a RAG application.

Metadata Filters

This section describes the filters you can apply to query results.

Filter String

You can pass an SQL-like filter expression as a String through the filterExpression method of the SearchRequest builder.

Consider the following examples:

"country == 'BG'"
"genre == 'drama' && year >= 2020"
"genre in ['comedy', 'documentary', 'drama']"

Filter.Expression

You can create an instance of Filter.Expression with a FilterExpressionBuilder, which exposes a fluent API. A simple example is as follows:

FilterExpressionBuilder b = new FilterExpressionBuilder();
Expression expression = b.eq("country", "BG").build();

You can build sophisticated expressions by using the following operators:

EQUALS: '=='
MINUS : '-'
PLUS: '+'
GT: '>'
GE: '>='
LT: '<'
LE: '<='
NE: '!='

You can combine expressions by using the following operators:

AND: 'AND' | 'and' | '&&';
OR: 'OR' | 'or' | '||';

Consider the following example:

Expression exp = b.and(b.eq("genre", "drama"), b.gte("year", 2020)).build();

You can also use the following operators:

IN: 'IN' | 'in';
NIN: 'NIN' | 'nin';
NOT: 'NOT' | 'not';

Consider the following example:

Expression exp = b.and(b.in("genre", "drama", "documentary"), b.not(b.lt("year", 2020))).build();

You can also use the following operators:

IS: 'IS' | 'is';
NULL: 'NULL' | 'null';
NOT NULL: 'NOT NULL' | 'not null';

Consider the following examples:

Expression isNullExp = b.isNull("year").build();
Expression isNotNullExp = b.isNotNull("year").build();

IS NULL and IS NOT NULL are not yet implemented in all vector stores.

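Deleting Documents from a Vector Store

The VectorStore interface provides several methods for deleting documents, letting you remove data either by specific document IDs or by using filter expressions.

Delete by Document IDs

The simplest way to delete documents is to provide a list of document IDs: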

void delete(List<String> idList);

This method removes all documents whose IDs are in the provided list. Any ID in the list that does not exist in the store is ignored.

Example Usage

// Create and add document
Document document = new Document("The World is Big",
    Map.of("country", "Netherlands"));
vectorStore.add(List.of(document));

// Delete document by ID
vectorStore.delete(List.of(document.getId()));

Delete by Filter Expression

For more complex deletion criteria, you can use filter expressions:

void delete(Filter.Expression filterExpression);

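This method accepts a Filter.Expression object that defines which documents should be deleted. It is particularly useful when you need to delete documents based on their metadata properties.

Example Usage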

// Create test documents with different metadata
Document bgDocument = new Document("The World is Big",
    Map.of("country", "Bulgaria"));
Document nlDocument = new Document("The World is Big",
    Map.of("country", "Netherlands"));

// Add documents to the store
vectorStore.add(List.of(bgDocument, nlDocument));

// Delete documents from Bulgaria using filter expression
Filter.Expression filterExpression = new Filter.Expression(
    Filter.ExpressionType.EQ,
    new Filter.Key("country"),
    new Filter.Value("Bulgaria")
);
vectorStore.delete(filterExpression);

// Verify deletion with search
SearchRequest request = SearchRequest.builder()
    .query("World")
    .filterExpression("country == 'Bulgaria'")
    .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will be empty as Bulgarian document was deleted

Delete by String Filter Expression

For convenience, you can also delete documents by using a string-based filter expression:

void delete(String filterExpression);

This method converts the provided string filter into a Filter.Expression object internally. It is useful when you have filter criteria in string form.

Example Usage

// Create and add documents
Document bgDocument = new Document("The World is Big",
    Map.of("country", "Bulgaria"));
Document nlDocument = new Document("The World is Big",
    Map.of("country", "Netherlands"));
vectorStore.add(List.of(bgDocument, nlDocument));

// Delete Bulgarian documents using string filter
vectorStore.delete("country == 'Bulgaria'");

// Verify remaining documents
SearchRequest request = SearchRequest.builder()
    .query("World")
    .topK(5)
    .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will only contain the Netherlands document

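Error Handling When Calling the Delete API

All deletion methods can throw exceptions when an error occurs, for example when a string filter expression cannot be parsed.

The best practice is to wrap delete operations in a try-catch block:

Example Usage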

try {
    vectorStore.delete("country == 'Bulgaria'");
}
catch (Exception e) {
    logger.error("Invalid filter expression", e);
}

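Document Versioning Use Case

A common scenario is managing document versions, where you need to upload a new version of a document while removing the old one. Here is how to handle this with filter expressions:

Example Usage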

// Create initial document (v1) with version metadata
Document documentV1 = new Document(
    "AI and Machine Learning Best Practices",
    Map.of(
        "docId", "AIML-001",
        "version", "1.0",
        "lastUpdated", "2024-01-01"
    )
);

// Add v1 to the vector store
vectorStore.add(List.of(documentV1));

// Create updated version (v2) of the same document
Document documentV2 = new Document(
    "AI and Machine Learning Best Practices - Updated",
    Map.of(
        "docId", "AIML-001",
        "version", "2.0",
        "lastUpdated", "2024-02-01"
    )
);

// First, delete the old version using filter expression
Filter.Expression deleteOldVersion = new Filter.Expression(
    Filter.ExpressionType.AND,
    new Filter.Expression(
        Filter.ExpressionType.EQ,
        new Filter.Key("docId"),
        new Filter.Value("AIML-001")
    ),
    new Filter.Expression(
        Filter.ExpressionType.EQ,
        new Filter.Key("version"),
        new Filter.Value("1.0")
    )
);
vectorStore.delete(deleteOldVersion);

// Add the new version
vectorStore.add(List.of(documentV2));

// Verify only v2 exists
SearchRequest request = SearchRequest.builder()
    .query("AI and Machine Learning")
    .filterExpression("docId == 'AIML-001'")
    .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will contain only v2 of the document

You can also achieve the same result with a string filter expression:

Example Usage

// Delete old version using string filter
vectorStore.delete("docId == 'AIML-001' AND version == '1.0'");

// Add new version
vectorStore.add(List.of(documentV2));

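Performance Considerations When Deleting Documents

Deleting by a list of IDs is generally faster when you know exactly which documents to remove.
Filter-based deletion may need to scan the index to find matching documents; however, this depends on the specific vector store implementation.
Large deletion operations should be batched to avoid overwhelming the system.
When deleting based on document properties, consider using filter expressions instead of collecting IDs first.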
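Batching a large deletion can be sketched as splitting the ID list into chunks and issuing one delete call per chunk. The helper below is a hypothetical illustration (BatchedDeleteSketch and deleteInChunks are not Spring AI APIs); the delete action is passed in as a Consumer so the sketch runs without a vector store, and a suitable chunk size depends on the store being used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedDeleteSketch {

    // Split ids into chunks of at most chunkSize and hand each chunk to the delete action.
    // Returns the number of delete calls that were made.
    static int deleteInChunks(List<String> ids, int chunkSize, Consumer<List<String>> deleteAction) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            deleteAction.accept(ids.subList(start, end));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("id-1", "id-2", "id-3", "id-4", "id-5");
        // In an application the action could be vectorStore::delete; here it only records the chunks.
        List<List<String>> seen = new ArrayList<>();
        int calls = deleteInChunks(ids, 2, seen::add);
        System.out.println(calls + " calls: " + seen); // 3 calls: [[id-1, id-2], [id-3, id-4], [id-5]]
    }
}
```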

Understanding Vectors

See the Understanding Vectors section.