This version is still under development and is not yet considered stable. For the latest stable version, please use Spring AI 1.1.3!

Anthropic Chat

Spring AI supports Anthropic’s Claude models through the official Anthropic Java SDK, providing access to Claude through Anthropic’s API.

Prerequisites

Create an account at the Anthropic Console and generate an API key on the API Keys page.

Add Repositories and BOM

Spring AI artifacts are published in the Maven Central and Spring Snapshot repositories. Refer to the Artifact Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.
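For Maven, importing the BOM typically looks like the sketch below; the `spring-ai-bom` artifact is Spring AI's published BOM, and the version property is a placeholder you set for your project:

```xml
<dependencyManagement>
    <dependencies>
        <!-- Pins all org.springframework.ai artifact versions consistently -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

With the BOM imported, the starter and module dependencies shown below can omit their `<version>` elements.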

Auto-Configuration

Spring Boot auto-configuration is available via the spring-ai-starter-model-anthropic starter.

Add it to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-anthropic'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Configuration Properties

Use the spring.ai.anthropic.* properties to configure the Anthropic connection and chat options:

| Property | Description | Default |
|----------|-------------|---------|
| spring.ai.anthropic.api-key | Anthropic API key | - |
| spring.ai.anthropic.base-url | API base URL | api.anthropic.com |
| spring.ai.anthropic.chat.options.model | Model name | claude-haiku-4-5 |
| spring.ai.anthropic.chat.options.max-tokens | Maximum tokens | 4096 |
| spring.ai.anthropic.chat.options.temperature | Sampling temperature | - |
| spring.ai.anthropic.chat.options.top-p | Top-p sampling | - |
| spring.ai.anthropic.chat.options.top-k | Top-k sampling | - |
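In a Spring Boot application these properties go in application.properties; a minimal sketch (the key reference and option values below are illustrative, not required defaults):

```properties
# Resolve the API key from an environment variable rather than hard-coding it
spring.ai.anthropic.api-key=${ANTHROPIC_API_KEY}
spring.ai.anthropic.chat.options.model=claude-haiku-4-5
spring.ai.anthropic.chat.options.max-tokens=4096
spring.ai.anthropic.chat.options.temperature=0.7
```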

Manual Configuration

The AnthropicChatModel implements the ChatModel interface and uses the official Anthropic Java SDK to connect to Claude.

Add the spring-ai-anthropic dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-anthropic'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Authentication

Configure your API key either programmatically or via an environment variable:

var chatOptions = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .maxTokens(1024)
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))
    .build();

var chatModel = new AnthropicChatModel(chatOptions);

Or set the environment variable and let the SDK auto-detect it:

export ANTHROPIC_API_KEY=<your-api-key>
// API key will be detected from ANTHROPIC_API_KEY environment variable
var chatModel = new AnthropicChatModel(
    AnthropicChatOptions.builder()
        .model("claude-sonnet-4-20250514")
        .maxTokens(1024)
        .build());

Basic Usage

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

// Or with streaming responses
Flux<ChatResponse> stream = chatModel.stream(
    new Prompt("Generate the names of 5 famous pirates."));

Runtime Options

The AnthropicChatOptions class provides model configuration options such as the model name, temperature, and max tokens.

On start-up, configure default options with the AnthropicChatModel(options) constructor.

At run-time, you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default model and temperature for a specific request:

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .temperature(0.4)
        .build()
    ));

Chat Options

| Option | Description | Default |
|--------|-------------|---------|
| model | Name of the Claude model to use. Models include: claude-sonnet-4-20250514, claude-opus-4-20250514, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, etc. See Claude Models. | claude-sonnet-4-20250514 |
| maxTokens | The maximum number of tokens to generate in the response. | 4096 |
| temperature | Controls randomness in the response. Higher values make output more random, lower values make it more deterministic. Range: 0.0-1.0. | 1.0 |
| topP | Nucleus sampling parameter. The model considers tokens with top_p probability mass. | - |
| topK | Only sample from the top K options for each token. | - |
| stopSequences | Custom sequences that will cause the model to stop generating. | - |
| apiKey | The API key for authentication. Auto-detects from the ANTHROPIC_API_KEY environment variable if not set. | - |
| baseUrl | The base URL for the Anthropic API. | api.anthropic.com |
| timeout | Request timeout duration. | 60 seconds |
| maxRetries | Maximum number of retry attempts for failed requests. | 2 |
| proxy | Proxy settings for the HTTP client. | - |
| customHeaders | Custom HTTP headers to include on all requests (client-level). | - |
| httpHeaders | Per-request HTTP headers, added to individual API calls via MessageCreateParams.putAdditionalHeader(). Useful for request-level tracking, beta API headers, or routing. | - |
| thinking | Thinking configuration. Use the convenience builders thinkingEnabled(budgetTokens), thinkingAdaptive(), or thinkingDisabled(), or pass a raw ThinkingConfigParam. | - |
| outputConfig | Output configuration for structured output (JSON schema) and effort control. Use outputConfig(OutputConfig) for full control, or the convenience methods outputSchema(String) and effort(OutputConfig.Effort). Requires claude-sonnet-4-6 or newer. | - |

Tool Calling Options

| Option | Description | Default |
|--------|-------------|---------|
| toolChoice | Controls which tool (if any) is called by the model. Use ToolChoiceAuto, ToolChoiceAny, ToolChoiceTool, or ToolChoiceNone. | AUTO |
| toolCallbacks | List of tool callbacks to register with the model. | - |
| toolNames | Set of tool names to be resolved at runtime. | - |
| internalToolExecutionEnabled | If true, Spring AI handles tool calls internally; if false, tool calls are proxied to the client for manual handling. | true |
| disableParallelToolUse | When true, the model will use at most one tool per response. | false |

In addition to the model-specific AnthropicChatOptions, you can use a portable ChatOptions instance, created with ChatOptions#builder().

Tool Calling

You can register custom Java functions or methods with the AnthropicChatModel and have Claude intelligently choose to output a JSON object containing arguments to call one or many of the registered functions/tools. This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about Tool Calling.

Basic Tool Calling

var chatOptions = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .toolCallbacks(List.of(
        FunctionToolCallback.builder("getCurrentWeather", new WeatherService())
            .description("Get the weather in location")
            .inputType(WeatherService.Request.class)
            .build()))
    .build();

var chatModel = new AnthropicChatModel(chatOptions);

ChatResponse response = chatModel.call(
    new Prompt("What's the weather like in San Francisco?", chatOptions));
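The example above references a WeatherService without showing it. A minimal sketch of what such a service might look like follows; the class, its Request and Response records, and the canned return value are all hypothetical, and a real implementation would call an external weather API:

```java
import java.util.function.Function;

// Hypothetical weather service for use with FunctionToolCallback.
// Claude fills in Request from the user's question; the function's
// return value is sent back to the model as the tool result.
class WeatherService implements Function<WeatherService.Request, WeatherService.Response> {

    // The input type Claude sees: a single "location" field.
    public record Request(String location) {}

    public record Response(String location, double temperatureCelsius, String conditions) {}

    @Override
    public Response apply(Request request) {
        // A real implementation would query a weather API here.
        return new Response(request.location(), 18.0, "partly cloudy");
    }
}
```

The inputType(WeatherService.Request.class) call in the tool-callback builder tells Spring AI which record to derive the tool's JSON input schema from.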

Tool Choice Options

Control how Claude uses tools with the toolChoice option:

import com.anthropic.models.messages.ToolChoiceAny;
import com.anthropic.models.messages.ToolChoiceTool;
import com.anthropic.models.messages.ToolChoiceNone;

// Force Claude to use any available tool
var options = AnthropicChatOptions.builder()
    .toolChoice(ToolChoiceAny.builder().build())
    .toolCallbacks(...)
    .build();

// Force Claude to use a specific tool
var options = AnthropicChatOptions.builder()
    .toolChoice(ToolChoiceTool.builder().name("getCurrentWeather").build())
    .toolCallbacks(...)
    .build();

// Prevent tool use entirely
var options = AnthropicChatOptions.builder()
    .toolChoice(ToolChoiceNone.builder().build())
    .toolCallbacks(...)
    .build();

The Anthropic Java SDK provides convenient static factory methods for common tool choices, which can make your code more concise:

  • ToolChoice.auto() can be used instead of ToolChoice.ofAuto(…​).

  • ToolChoice.any() can be used instead of ToolChoice.ofAny(…​).

  • ToolChoice.none() can be used instead of ToolChoice.ofNone(…​).

Streaming Tool Calling

The Anthropic SDK module fully supports tool calling in streaming mode. When Claude decides to call a tool during streaming:

  1. Tool call arguments are accumulated from partial JSON deltas

  2. Tools are executed when the content block completes

  3. Results are sent back to Claude

  4. The conversation continues recursively until Claude provides a final response

Flux<ChatResponse> stream = chatModel.stream(
    new Prompt("What's the weather in Paris, Tokyo, and New York?", chatOptions));

String response = stream
    .collectList()
    .block()
    .stream()
    .map(r -> r.getResult().getOutput().getText())
    .filter(Objects::nonNull)
    .collect(Collectors.joining());

Streaming

The Anthropic SDK module supports both synchronous and streaming responses. Streaming allows Claude to return responses incrementally as they’re generated.

Flux<ChatResponse> stream = chatModel.stream(new Prompt("Tell me a story"));

stream.subscribe(response -> {
    String content = response.getResult().getOutput().getText();
    if (content != null) {
        System.out.print(content);
    }
});

Extended Thinking

Anthropic Claude models support a "thinking" feature that allows the model to show its reasoning process before providing a final answer. This is especially useful for complex questions that require step-by-step reasoning, such as math, logic, and analysis tasks.

Supported Models

The thinking feature is supported by the following Claude models:

  • Claude 4 models (claude-opus-4-20250514, claude-sonnet-4-20250514)

  • Claude 3.7 Sonnet (claude-3-7-sonnet-20250219)

Model capabilities:

  • Claude 3.7 Sonnet: Returns full thinking output.

  • Claude 4 models: Support summarized thinking and enhanced tool integration.

The API request structure is the same across all supported models, but output behavior varies.

Thinking Configuration

To enable thinking, configure the following:

  1. Set a thinking budget: The budgetTokens must be >= 1024 and less than maxTokens.

  2. Set temperature to 1.0: Required when thinking is enabled.

Convenience Builder Methods

AnthropicChatOptions.Builder provides convenience methods for the three thinking modes:

// Enable thinking with a specific token budget
var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .temperature(1.0)
    .maxTokens(16000)
    .thinkingEnabled(10000L)    // budget must be >= 1024 and < maxTokens
    .build();

// Let Claude adaptively decide whether to think
var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .thinkingAdaptive()
    .build();

// Explicitly disable thinking
var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .thinkingDisabled()
    .build();

You can also use the raw SDK ThinkingConfigParam directly:

import com.anthropic.models.messages.ThinkingConfigParam;
import com.anthropic.models.messages.ThinkingConfigEnabled;

var options = AnthropicChatOptions.builder()
    .thinking(ThinkingConfigParam.ofEnabled(
        ThinkingConfigEnabled.builder().budgetTokens(10000L).build()))
    .build();

Non-streaming Example

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .temperature(1.0)
    .maxTokens(16000)
    .thinkingEnabled(10000L)
    .build();

ChatResponse response = chatModel.call(
    new Prompt("Are there an infinite number of prime numbers such that n mod 4 == 3?", options));

// The response contains multiple generations:
// - ThinkingBlock generations (with "signature" in metadata)
// - TextBlock generations (with the final answer)
for (Generation generation : response.getResults()) {
    AssistantMessage message = generation.getOutput();
    if (message.getMetadata().containsKey("signature")) {
        // This is a thinking block - contains Claude's reasoning
        System.out.println("Thinking: " + message.getText());
        System.out.println("Signature: " + message.getMetadata().get("signature"));
    }
    else if (message.getMetadata().containsKey("data")) {
        // This is a redacted thinking block (safety-redacted reasoning)
        System.out.println("Redacted thinking data: " + message.getMetadata().get("data"));
    }
    else if (message.getText() != null && !message.getText().isBlank()) {
        // This is the final text response
        System.out.println("Answer: " + message.getText());
    }
}

Streaming Example

Thinking is fully supported in streaming mode. Thinking deltas and signature deltas are emitted as they arrive:

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .temperature(1.0)
    .maxTokens(16000)
    .thinkingEnabled(10000L)
    .build();

Flux<ChatResponse> stream = chatModel.stream(
    new Prompt("Are there an infinite number of prime numbers such that n mod 4 == 3?", options));

stream.subscribe(response -> {
    Generation generation = response.getResult();
    AssistantMessage message = generation.getOutput();

    if (message.getMetadata().containsKey("thinking")) {
        // Incremental thinking content
        System.out.print(message.getText());
    }
    else if (message.getMetadata().containsKey("signature")) {
        // Thinking block signature (emitted at end of thinking)
        System.out.println("\nSignature: " + message.getMetadata().get("signature"));
    }
    else if (message.getText() != null) {
        // Final text content
        System.out.print(message.getText());
    }
});

Response Structure

When thinking is enabled, the response contains different types of content:

| Content Type | Metadata Key | Description |
|--------------|--------------|-------------|
| Thinking Block | signature | Claude’s reasoning text with a cryptographic signature. In sync mode, the thinking text is in getText() and the signature is in getMetadata().get("signature"). |
| Redacted Thinking | data | Safety-redacted reasoning. Contains only a data marker, no visible text. |
| Signature (streaming) | signature | In streaming mode, the signature arrives as a separate delta at the end of a thinking block. |
| Thinking Delta (streaming) | thinking | Incremental thinking text chunks during streaming. The thinking metadata key is set to true. |
| Text Block | (none) | The final answer text in getText(). |

Multi-Modal Support

The Anthropic SDK module supports multi-modal inputs, allowing you to send images and PDF documents alongside text in your prompts.

Image Input

Send images to Claude for analysis using the Media class:

var imageResource = new ClassPathResource("/test-image.png");

var userMessage = UserMessage.builder()
    .text("What do you see in this image?")
    .media(List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageResource)))
    .build();

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));

Supported image formats: PNG, JPEG, GIF, WebP.

PDF Document Input

Send PDF documents for Claude to analyze:

var pdfResource = new ClassPathResource("/document.pdf");

var userMessage = UserMessage.builder()
    .text("Please summarize this document.")
    .media(List.of(new Media(new MimeType("application", "pdf"), pdfResource)))
    .build();

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));

Multiple Media Items

You can include multiple images or documents in a single message:

var userMessage = UserMessage.builder()
    .text("Compare these two images.")
    .media(List.of(
        new Media(MimeTypeUtils.IMAGE_PNG, image1Resource),
        new Media(MimeTypeUtils.IMAGE_PNG, image2Resource)))
    .build();

Citations

Anthropic’s Citations API allows Claude to reference specific parts of provided documents when generating responses. When citation documents are included in a prompt, Claude can cite the source material, and citation metadata (character ranges, page numbers, or content blocks) is returned in the response metadata.

Citations help improve:

  • Accuracy verification: Users can verify Claude’s responses against source material

  • Transparency: See exactly which parts of documents informed the response

  • Compliance: Meet requirements for source attribution in regulated industries

  • Trust: Build confidence by showing where information came from

Supported Models

Citations are supported on Claude 3.7 Sonnet and Claude 4 models (Opus and Sonnet).

Document Types

Three types of citation documents are supported:

  • Plain Text: Text content with character-level citations

  • PDF: PDF documents with page-level citations

  • Custom Content: User-defined content blocks with block-level citations

Creating Citation Documents

Use the AnthropicCitationDocument builder to create documents that can be cited:

Plain Text Documents

AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .plainText("The Eiffel Tower was completed in 1889 in Paris, France. " +
               "It stands 330 meters tall and was designed by Gustave Eiffel.")
    .title("Eiffel Tower Facts")
    .citationsEnabled(true)
    .build();

PDF Documents

// From file path
AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .pdfFile("path/to/document.pdf")
    .title("Technical Specification")
    .citationsEnabled(true)
    .build();

// From byte array
byte[] pdfBytes = loadPdfBytes();
AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .pdf(pdfBytes)
    .title("Product Manual")
    .citationsEnabled(true)
    .build();

Custom Content Blocks

For fine-grained citation control, use custom content blocks:

AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .customContent(
        "The Great Wall of China is approximately 21,196 kilometers long.",
        "It was built over many centuries, starting in the 7th century BC.",
        "The wall was constructed to protect Chinese states from invasions."
    )
    .title("Great Wall Facts")
    .citationsEnabled(true)
    .build();

Using Citations in Requests

Include citation documents in your chat options:

ChatResponse response = chatModel.call(
    new Prompt(
        "When was the Eiffel Tower built and how tall is it?",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .maxTokens(1024)
            .citationDocuments(document)
            .build()
    )
);

Multiple Documents

You can provide multiple documents for Claude to reference:

AnthropicCitationDocument parisDoc = AnthropicCitationDocument.builder()
    .plainText("Paris is the capital city of France with a population of 2.1 million.")
    .title("Paris Information")
    .citationsEnabled(true)
    .build();

AnthropicCitationDocument eiffelDoc = AnthropicCitationDocument.builder()
    .plainText("The Eiffel Tower was designed by Gustave Eiffel for the 1889 World's Fair.")
    .title("Eiffel Tower History")
    .citationsEnabled(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "What is the capital of France and who designed the Eiffel Tower?",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .citationDocuments(parisDoc, eiffelDoc)
            .build()
    )
);

Accessing Citations

Citations are returned in the response metadata:

ChatResponse response = chatModel.call(prompt);

// Get citations from metadata
List<Citation> citations = (List<Citation>) response.getMetadata().get("citations");

// Optional: Get citation count directly from metadata
Integer citationCount = (Integer) response.getMetadata().get("citationCount");
System.out.println("Total citations: " + citationCount);

// Process each citation
for (Citation citation : citations) {
    System.out.println("Document: " + citation.getDocumentTitle());
    System.out.println("Location: " + citation.getLocationDescription());
    System.out.println("Cited text: " + citation.getCitedText());
    System.out.println("Document index: " + citation.getDocumentIndex());
    System.out.println();
}

Citation Types

Citations contain different location information depending on the document type:

Character Location (Plain Text)

For plain text documents, citations include character indices:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.CHAR_LOCATION) {
    int start = citation.getStartCharIndex();
    int end = citation.getEndCharIndex();
    String text = citation.getCitedText();
    System.out.println("Characters " + start + "-" + end + ": " + text);
}

Page Location (PDF)

For PDF documents, citations include page numbers:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.PAGE_LOCATION) {
    int startPage = citation.getStartPageNumber();
    int endPage = citation.getEndPageNumber();
    System.out.println("Pages " + startPage + "-" + endPage);
}

Content Block Location (Custom Content)

For custom content, citations reference specific content blocks:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.CONTENT_BLOCK_LOCATION) {
    int startBlock = citation.getStartBlockIndex();
    int endBlock = citation.getEndBlockIndex();
    System.out.println("Content blocks " + startBlock + "-" + endBlock);
}

Complete Example

Here’s a complete example demonstrating citation usage:

// Create a citation document
AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .plainText("Spring AI is an application framework for AI engineering. " +
               "It provides a Spring-friendly API for developing AI applications. " +
               "The framework includes abstractions for chat models, embedding models, " +
               "and vector databases.")
    .title("Spring AI Overview")
    .citationsEnabled(true)
    .build();

// Call the model with the document
ChatResponse response = chatModel.call(
    new Prompt(
        "What is Spring AI?",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .maxTokens(1024)
            .citationDocuments(document)
            .build()
    )
);

// Display the response
System.out.println("Response: " + response.getResult().getOutput().getText());
System.out.println("\nCitations:");

// Process citations
List<Citation> citations = (List<Citation>) response.getMetadata().get("citations");

if (citations != null && !citations.isEmpty()) {
    for (int i = 0; i < citations.size(); i++) {
        Citation citation = citations.get(i);
        System.out.println("\n[" + (i + 1) + "] " + citation.getDocumentTitle());
        System.out.println("    Location: " + citation.getLocationDescription());
        System.out.println("    Text: " + citation.getCitedText());
    }
} else {
    System.out.println("No citations were provided in the response.");
}

Best Practices

  1. Use descriptive titles: Provide meaningful titles for citation documents to help users identify sources in the citations.

  2. Check for null citations: Not all responses will include citations, so always validate that the citations metadata exists before accessing it.

  3. Consider document size: Larger documents provide more context but consume more input tokens and may affect response time.

  4. Leverage multiple documents: When answering questions that span multiple sources, provide all relevant documents in a single request rather than making multiple calls.

  5. Use appropriate document types: Choose plain text for simple content, PDF for existing documents, and custom content blocks when you need fine-grained control over citation granularity.

Citation Document Options

Context Field

Optionally provide context about the document that won’t be cited but can guide Claude’s understanding:

AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .plainText("...")
    .title("Legal Contract")
    .context("This is a merger agreement dated January 2024 between Company A and Company B")
    .build();

Controlling Citations

By default, citations are disabled for all documents (opt-in behavior). To enable citations, explicitly set citationsEnabled(true):

AnthropicCitationDocument document = AnthropicCitationDocument.builder()
    .plainText("The Eiffel Tower was completed in 1889...")
    .title("Historical Facts")
    .citationsEnabled(true)  // Explicitly enable citations for this document
    .build();

You can also provide documents without citations for background context:

AnthropicCitationDocument backgroundDoc = AnthropicCitationDocument.builder()
    .plainText("Background information about the industry...")
    .title("Context Document")
    // citationsEnabled defaults to false - Claude will use this but not cite it
    .build();

Anthropic requires consistent citation settings across all documents in a request. You cannot mix citation-enabled and citation-disabled documents in the same request.

Prompt Caching

Anthropic’s Prompt Caching reduces costs and latency by caching repeated context across API calls. The Anthropic SDK module supports prompt caching with configurable strategies, TTL, and per-message-type settings.

Caching Strategies

Five caching strategies are available via AnthropicCacheStrategy:

| Strategy | Description |
|----------|-------------|
| NONE | No caching (default). No cache control headers are added. |
| SYSTEM_ONLY | Cache system message content. Uses 1 cache breakpoint. |
| TOOLS_ONLY | Cache tool definitions only. Uses 1 cache breakpoint. |
| SYSTEM_AND_TOOLS | Cache both system messages and tool definitions. Uses 2 cache breakpoints. |
| CONVERSATION_HISTORY | Cache system messages, tool definitions, and conversation messages. Uses up to 4 cache breakpoints. |

Anthropic allows a maximum of 4 cache breakpoints per request. The implementation tracks breakpoint usage and stops adding cache control once the limit is reached.

Basic Usage

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .maxTokens(1024)
    .cacheOptions(AnthropicCacheOptions.builder()
        .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
        .build())
    .build();

ChatResponse response = chatModel.call(
    new Prompt(List.of(
        new SystemMessage("You are an expert assistant with deep domain knowledge..."),
        new UserMessage("What is the capital of France?")),
        options));

Cache Configuration Options

AnthropicCacheOptions provides fine-grained control over caching behavior:

var cacheOptions = AnthropicCacheOptions.builder()
    .strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS)
    .messageTypeTtl(MessageType.SYSTEM, AnthropicCacheTtl.ONE_HOUR)     // 1 hour TTL
    .messageTypeMinContentLength(MessageType.SYSTEM, 100)                   // Min 100 chars
    .multiBlockSystemCaching(true)                                          // Per-block caching
    .build();

| Option | Description | Default |
|--------|-------------|---------|
| strategy | The caching strategy to use. | NONE |
| messageTypeTtl | TTL per message type. Available values: FIVE_MINUTES, ONE_HOUR. | FIVE_MINUTES for all types |
| messageTypeMinContentLength | Minimum content length required before caching a message type. | 1 |
| contentLengthFunction | Custom function to compute content length (e.g., token counting). | String::length |
| multiBlockSystemCaching | When true, each system message becomes a separate cacheable block; cache control is applied to the second-to-last block (static prefix pattern). When false, all system messages are joined into one block. | false |
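The contentLengthFunction option lets you express minimum lengths in units other than raw characters. As a sketch, the function below estimates token counts with the rough 4-characters-per-token heuristic; the class name is hypothetical and the heuristic is a stand-in for a real tokenizer, not Anthropic's actual tokenization:

```java
import java.util.function.Function;

class TokenEstimate {
    // Rough token estimate: ~4 characters per token for English text.
    // A heuristic stand-in for a real tokenizer, usable where a
    // Function<String, Integer> content-length function is expected.
    static final Function<String, Integer> APPROX_TOKENS =
            text -> Math.max(1, text.length() / 4);
}
```

You could then pass it via .contentLengthFunction(TokenEstimate.APPROX_TOKENS) and set messageTypeMinContentLength thresholds in estimated tokens rather than characters.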

Multi-Block System Caching

When you have both a static system prompt and dynamic instructions, use multi-block system caching to cache only the static portion:

var cacheOptions = AnthropicCacheOptions.builder()
    .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
    .multiBlockSystemCaching(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(List.of(
        new SystemMessage("You are an expert knowledge base assistant..."),  // Static (cached)
        new SystemMessage("Today's date is 2025-02-23. User timezone: PST"), // Dynamic
        new UserMessage("What are the latest updates?")),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .cacheOptions(cacheOptions)
            .build()));

Accessing Cache Token Usage

Cache token metrics are available through the native SDK Usage object:

ChatResponse response = chatModel.call(prompt);

com.anthropic.models.messages.Usage sdkUsage =
    (com.anthropic.models.messages.Usage) response.getMetadata().getUsage().getNativeUsage();
long cacheCreation = sdkUsage.cacheCreationInputTokens().orElse(0L);
long cacheRead = sdkUsage.cacheReadInputTokens().orElse(0L);

System.out.println("Cache creation tokens: " + cacheCreation);
System.out.println("Cache read tokens: " + cacheRead);

On the first request, cacheCreationInputTokens will be non-zero (tokens written to cache). On subsequent requests with the same cached prefix, cacheReadInputTokens will be non-zero (tokens read from cache at reduced cost).
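These counters make it easy to log whether a given request wrote to or read from the cache. A minimal sketch, using hypothetical token counts rather than values from a live call:

```java
// Hypothetical values as they might appear on a first request.
long cacheCreation = 1024L; // tokens written to the cache
long cacheRead = 0L;        // tokens served from the cache

String phase = cacheCreation > 0 ? "cache write (first request)"
        : cacheRead > 0 ? "cache hit (subsequent request)"
        : "no caching applied";
System.out.println(phase);
```

In production, the two values would come from the sdkUsage accessors shown above instead of being hard-coded.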

Conversation History Caching

The CONVERSATION_HISTORY strategy caches the entire conversation context, including system messages, tool definitions, and the last user message. This is useful for multi-turn conversations where the growing context would otherwise be re-processed on every request:

var cacheOptions = AnthropicCacheOptions.builder()
    .strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY)
    .build();

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-20250514")
    .cacheOptions(cacheOptions)
    .build();

// First turn
ChatResponse response1 = chatModel.call(
    new Prompt(List.of(
        new SystemMessage("You are a helpful assistant."),
        new UserMessage("What is machine learning?")),
        options));

// Second turn - previous context is cached
ChatResponse response2 = chatModel.call(
    new Prompt(List.of(
        new SystemMessage("You are a helpful assistant."),
        new UserMessage("What is machine learning?"),
        new AssistantMessage(response1.getResult().getOutput().getText()),
        new UserMessage("Can you give me an example?")),
        options));

Structured Output

Structured output constrains Claude to produce responses conforming to a JSON schema. The Anthropic SDK module also supports Anthropic’s effort control for tuning response quality versus speed.

Model Requirement

Structured output and effort control require claude-sonnet-4-6 or newer. Older models like claude-sonnet-4-20250514 do not support these features.

Schema Requirements

When using JSON schema output, Anthropic requires "additionalProperties": false for all object types in the schema.
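Note that "all object types" includes nested object definitions, not just the top level. A minimal schema fragment illustrating the flag at both levels:

```json
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "additionalProperties": false
        }
    },
    "additionalProperties": false
}
```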

JSON Schema Output

Constrain Claude’s responses to a specific JSON schema using the outputSchema convenience method:

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-6")
    .outputSchema("""
        {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "capital": {"type": "string"},
                "population": {"type": "integer"}
            },
            "required": ["name", "capital"],
            "additionalProperties": false
        }
        """)
    .build();

ChatResponse response = chatModel.call(new Prompt("Tell me about France.", options));
// Response text will be valid JSON conforming to the schema

Effort Control

Control how much compute Claude spends on its response. Lower effort means faster, cheaper responses; higher effort means more thorough reasoning.

Effort Level Description

LOW

Fast and concise responses with minimal reasoning

MEDIUM

Balanced trade-off between speed and thoroughness

HIGH

More thorough reasoning and detailed responses

MAX

Maximum compute for the most thorough possible responses

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-6")
    .effort(OutputConfig.Effort.LOW)
    .build();

ChatResponse response = chatModel.call(new Prompt("What is the capital of France?", options));

Combined Schema and Effort

You can combine JSON schema output with effort control:

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-6")
    .outputSchema("""
        {
            "type": "object",
            "properties": {
                "answer": {"type": "integer"},
                "explanation": {"type": "string"}
            },
            "required": ["answer", "explanation"],
            "additionalProperties": false
        }
        """)
    .effort(OutputConfig.Effort.HIGH)
    .build();

ChatResponse response = chatModel.call(
    new Prompt("What is 15 * 23? Show your reasoning.", options));

Direct OutputConfig

For full control, use the SDK’s OutputConfig directly:

import com.anthropic.models.messages.OutputConfig;
import com.anthropic.models.messages.JsonOutputFormat;
import com.anthropic.core.JsonValue;

var outputConfig = OutputConfig.builder()
    .effort(OutputConfig.Effort.HIGH)
    .format(JsonOutputFormat.builder()
        .schema(JsonOutputFormat.Schema.builder()
            .putAdditionalProperty("type", JsonValue.from("object"))
            .putAdditionalProperty("properties", JsonValue.from(Map.of(
                "name", Map.of("type", "string"))))
            .putAdditionalProperty("additionalProperties", JsonValue.from(false))
            .build())
        .build())
    .build();

var options = AnthropicChatOptions.builder()
    .model("claude-sonnet-4-6")
    .outputConfig(outputConfig)
    .build();

ChatResponse response = chatModel.call(new Prompt("Tell me about France.", options));

StructuredOutputChatOptions Interface

AnthropicChatOptions implements the StructuredOutputChatOptions interface, which provides portable getOutputSchema() and setOutputSchema(String) methods. This allows structured output to work with Spring AI’s generic structured output infrastructure.

Per-Request HTTP Headers

The Anthropic SDK module supports per-request HTTP headers, which are injected into individual API calls. This is distinct from customHeaders (which are set at the client level for all requests).

Per-request headers are useful for:

  • Request tracking: Adding correlation IDs or trace headers per request

  • Beta API access: Including beta feature headers for specific requests

  • Routing: Adding routing or priority headers for load balancing

var options = AnthropicChatOptions.builder()
    .httpHeaders(Map.of(
        "X-Request-Id", "req-12345",
        "X-Custom-Tracking", "my-tracking-value"))
    .build();

ChatResponse response = chatModel.call(new Prompt("Hello", options));
httpHeaders are per-request and set via MessageCreateParams.putAdditionalHeader(). They do not affect other requests. For headers that should apply to all requests, use customHeaders instead.
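For the request-tracking use case, a fresh correlation ID can be generated per call instead of hard-coding one. A sketch (the X-Request-Id header name is a common convention here, not something the API requires):

```java
import java.util.Map;
import java.util.UUID;

// Generate a unique correlation ID for each outgoing request.
String requestId = "req-" + UUID.randomUUID();
Map<String, String> headers = Map.of("X-Request-Id", requestId);

System.out.println("Tracking header: " + headers.get("X-Request-Id"));
```

The resulting map can then be passed to the httpHeaders(...) builder method as in the example above.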

Sample Controller

Here is an example of a simple @RestController class that uses the chat model for text generation:

@RestController
public class ChatController {

    private final AnthropicChatModel chatModel;

    public ChatController() {
        var options = AnthropicChatOptions.builder()
            .model("claude-sonnet-4-20250514")
            .maxTokens(1024)
            .apiKey(System.getenv("ANTHROPIC_API_KEY"))
            .build();
        this.chatModel = new AnthropicChatModel(options);
    }

    @GetMapping("/ai/generate")
    public Map<String, String> generate(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", chatModel.call(message));
    }

    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return chatModel.stream(prompt);
    }
}

Accessing the Raw Response

The full Anthropic SDK Message object is available in the response metadata under the "anthropic-response" key. This provides access to any fields not explicitly mapped by Spring AI’s abstraction:

ChatResponse response = chatModel.call(new Prompt("Hello"));

com.anthropic.models.messages.Message rawMessage =
    (com.anthropic.models.messages.Message) response.getMetadata().get("anthropic-response");

// Access native SDK fields
rawMessage.stopReason();    // Optional<StopReason>
rawMessage.content();       // List<ContentBlock>
rawMessage.usage();         // Usage with cache token details
The raw response is available for synchronous calls only. Streaming responses do not include it.

Skills

Anthropic’s Skills API extends Claude’s capabilities with specialized, pre-packaged abilities for document generation. Skills enable Claude to create actual downloadable files — Excel spreadsheets, PowerPoint presentations, Word documents, and PDFs — rather than just describing what these documents might contain.

Supported Models

Skills are supported on Claude Sonnet 4, Claude Sonnet 4.5, Claude Opus 4, and later models.

Requirements

  • Skills require the code execution capability (automatically enabled by Spring AI when skills are configured)

  • Maximum of 8 skills per request

  • Generated files are available for download via the Files API for 24 hours

Pre-built Anthropic Skills

Spring AI provides type-safe access to Anthropic’s pre-built skills through the AnthropicSkill enum:

Skill Description Generated File Type

XLSX

Excel spreadsheet generation and manipulation

.xlsx (Microsoft Excel)

PPTX

PowerPoint presentation creation

.pptx (Microsoft PowerPoint)

DOCX

Word document generation

.docx (Microsoft Word)

PDF

PDF document creation

.pdf (Portable Document Format)

Basic Usage

Enable skills by adding them to your AnthropicChatOptions:

ChatResponse response = chatModel.call(
    new Prompt(
        "Create an Excel spreadsheet with Q1 2025 sales data. " +
        "Include columns for Month, Revenue, and Expenses with 3 rows of sample data.",
        AnthropicChatOptions.builder()
            .model(Model.CLAUDE_SONNET_4_5)
            .maxTokens(4096)
            .skill(AnthropicSkill.XLSX)
            .build()
    )
);

// Claude will generate an actual Excel file
String responseText = response.getResult().getOutput().getText();
System.out.println(responseText);
// Output: "I've created an Excel spreadsheet with your Q1 2025 sales data..."

Multiple Skills

You can enable multiple skills in a single request (up to 8):

ChatResponse response = chatModel.call(
    new Prompt(
        "Create a sales report with both an Excel file containing the raw data " +
        "and a PowerPoint presentation summarizing the key findings.",
        AnthropicChatOptions.builder()
            .model(Model.CLAUDE_SONNET_4_5)
            .maxTokens(8192)
            .skill(AnthropicSkill.XLSX)
            .skill(AnthropicSkill.PPTX)
            .build()
    )
);

Using AnthropicSkillContainer for Advanced Configuration

For more control over skill types and versions, use AnthropicSkillContainer directly:

AnthropicSkillContainer container = AnthropicSkillContainer.builder()
    .skill(AnthropicSkill.XLSX)
    .skill(AnthropicSkill.PPTX, "20251013") // Specific version
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the quarterly report",
        AnthropicChatOptions.builder()
            .model(Model.CLAUDE_SONNET_4_5)
            .maxTokens(4096)
            .skillContainer(container)
            .build()
    )
);

Downloading Generated Files

When Claude generates files using Skills, the response contains file IDs that can be used to download the actual files via the Files API. Spring AI provides the AnthropicSkillsResponseHelper utility class for extracting file IDs and downloading files.

Extracting File IDs

import org.springframework.ai.anthropic.AnthropicSkillsResponseHelper;

ChatResponse response = chatModel.call(prompt);

// Extract all file IDs from the response
List<String> fileIds = AnthropicSkillsResponseHelper.extractFileIds(response);

for (String fileId : fileIds) {
    System.out.println("Generated file ID: " + fileId);
}

Downloading All Files

The AnthropicSkillsResponseHelper provides a convenience method to download all generated files at once. This requires the AnthropicClient instance (the same one used to create the chat model):

import com.anthropic.client.AnthropicClient;

@Autowired
private AnthropicClient anthropicClient;

// Download all files to a target directory
Path targetDir = Path.of("generated-files");
Files.createDirectories(targetDir);

List<Path> savedFiles = AnthropicSkillsResponseHelper.downloadAllFiles(
        response, anthropicClient, targetDir);

for (Path file : savedFiles) {
    System.out.println("Downloaded: " + file.getFileName() +
                       " (" + Files.size(file) + " bytes)");
}

Extracting Container ID

For multi-turn conversations with Skills, you can extract the container ID for reuse:

String containerId = AnthropicSkillsResponseHelper.extractContainerId(response);

if (containerId != null) {
    System.out.println("Container ID for reuse: " + containerId);
}

Complete Example

Here’s a complete example showing Skills usage with file download:

@Service
public class DocumentGenerationService {

    private final AnthropicChatModel chatModel;
    private final AnthropicClient anthropicClient;

    public DocumentGenerationService(AnthropicChatModel chatModel,
                                     AnthropicClient anthropicClient) {
        this.chatModel = chatModel;
        this.anthropicClient = anthropicClient;
    }

    public Path generateSalesReport(String quarter, Path outputDir) throws IOException {
        // Generate Excel report using Skills
        ChatResponse response = chatModel.call(
            new Prompt(
                "Create an Excel spreadsheet with " + quarter + " sales data. " +
                "Include Month, Revenue, Expenses, and Profit columns.",
                AnthropicChatOptions.builder()
                    .model(Model.CLAUDE_SONNET_4_5)
                    .maxTokens(4096)
                    .skill(AnthropicSkill.XLSX)
                    .build()
            )
        );

        // Extract file IDs from the response
        List<String> fileIds = AnthropicSkillsResponseHelper.extractFileIds(response);

        if (fileIds.isEmpty()) {
            throw new RuntimeException("No file was generated");
        }

        // Download all generated files
        List<Path> savedFiles = AnthropicSkillsResponseHelper.downloadAllFiles(
                response, anthropicClient, outputDir);

        return savedFiles.get(0);
    }
}

Best Practices

  1. Use appropriate models: Skills work best with Claude Sonnet 4 and later models. Ensure you’re using a supported model.

  2. Set sufficient max tokens: Document generation can require significant tokens. Use maxTokens(4096) or higher for complex documents.

  3. Be specific in prompts: Provide clear, detailed instructions about document structure, content, and formatting.

  4. Handle file downloads promptly: Generated files expire after 24 hours. Download files soon after generation.

  5. Check for file IDs: Always verify that file IDs were returned before attempting downloads. Some prompts may result in text responses without file generation.

  6. Use defensive error handling: Wrap file operations in try-catch blocks to handle network issues or expired files gracefully.

List<String> fileIds = AnthropicSkillsResponseHelper.extractFileIds(response);

if (fileIds.isEmpty()) {
    // Claude may have responded with text instead of generating a file
    String text = response.getResult().getOutput().getText();
    log.warn("No files generated. Response: {}", text);
    return;
}

try {
    List<Path> files = AnthropicSkillsResponseHelper.downloadAllFiles(
            response, anthropicClient, targetDir);
    // Process files...
} catch (IOException e) {
    log.error("Failed to download file: {}", e.getMessage());
}

Observability

The Anthropic SDK implementation supports Spring AI’s observability features through Micrometer. All chat model operations are instrumented for monitoring and tracing.

Logging

Enable SDK logging by setting the environment variable:

export ANTHROPIC_LOG=debug

Limitations

The following features are not yet supported:

These features are planned for future releases.