30 KiB
Create
$ ant messages create
post /v1/messages
Send a structured list of input messages with text and/or image content, and the model will generate the next message in the conversation.
The Messages API can be used for either single queries or stateless multi-turn conversations.
Learn more about the Messages API in our user guide
Parameters
-
--max-tokens: numberThe maximum number of tokens to generate before stopping.
Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.
Different models have different maximum values for this parameter. See models for details.
-
--message: array of MessageParamInput messages.
Our models are trained to operate on alternating
userandassistantconversational turns. When creating a newMessage, you specify the prior conversational turns with themessagesparameter, and the model then generates the nextMessagein the conversation. Consecutiveuserorassistantturns in your request will be combined into a single turn.Each input message must be an object with a
roleandcontent. You can specify a singleuser-role message, or you can include multipleuserandassistantmessages.If the final message uses the
assistantrole, the response content will continue immediately from the content in that message. This can be used to constrain part of the model's response.Example with a single
usermessage:[{"role": "user", "content": "Hello, Claude"}]Example with multiple conversational turns:
[ {"role": "user", "content": "Hello there."}, {"role": "assistant", "content": "Hi, I'm Claude. How can I help you?"}, {"role": "user", "content": "Can you explain LLMs in plain English?"}, ]Example with a partially-filled response from Claude:
[ {"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"}, {"role": "assistant", "content": "The best answer is ("}, ]Each input message
contentmay be either a singlestringor an array of content blocks, where each block has a specifictype. Using astringforcontentis shorthand for an array of one content block of type"text". The following input messages are equivalent:{"role": "user", "content": "Hello, Claude"}{"role": "user", "content": [{"type": "text", "text": "Hello, Claude"}]}See input examples.
Note that if you want to include a system prompt, you can use the top-level
systemparameter — there is no"system"role for input messages in the Messages API.There is a limit of 100,000 messages in a single request.
-
--model: "claude-opus-4-7" or "claude-mythos-preview" or "claude-opus-4-6" or 14 more or stringThe model that will complete your prompt.
See models for additional details and options.
-
--cache-control: optional object { type, ttl }Top-level cache control automatically applies a cache_control marker to the last cacheable block in the request.
-
--container: optional stringContainer identifier for reuse across requests.
-
--inference-geo: optional stringSpecifies the geographic region for inference processing. If not specified, the workspace's
default_inference_geois used. -
--metadata: optional object { user_id }An object describing metadata about the request.
-
--output-config: optional object { effort, format }Configuration options for the model's output, such as the output format.
-
--service-tier: optional "auto" or "standard_only"Determines whether to use priority capacity (if available) or standard capacity for this request.
Anthropic offers different levels of service for your API requests. See service-tiers for details.
-
--stop-sequence: optional array of stringCustom text sequences that will cause the model to stop generating.
Our models will normally stop when they have naturally completed their turn, which will result in a response
stop_reasonof"end_turn".If you want the model to stop generating when it encounters custom strings of text, you can use the
stop_sequencesparameter. If the model encounters one of the custom sequences, the responsestop_reasonvalue will be"stop_sequence"and the responsestop_sequencevalue will contain the matched stop sequence. -
--system: optional string or array of TextBlockParamSystem prompt.
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role. See our guide to system prompts.
-
--temperature: optional numberAmount of randomness injected into the response.
Defaults to
1.0. Ranges from0.0to1.0. Usetemperaturecloser to0.0for analytical / multiple choice, and closer to1.0for creative and generative tasks.Note that even with
temperatureof0.0, the results will not be fully deterministic. -
--thinking: optional ThinkingConfigEnabled or ThinkingConfigDisabled or ThinkingConfigAdaptiveConfiguration for enabling Claude's extended thinking.
When enabled, responses include
thinkingcontent blocks showing Claude's thinking process before the final answer. Requires a minimum budget of 1,024 tokens and counts towards yourmax_tokenslimit.See extended thinking for details.
-
--tool-choice: optional ToolChoiceAuto or ToolChoiceAny or ToolChoiceTool or ToolChoiceNoneHow the model should use the provided tools. The model can use a specific tool, any available tool, decide by itself, or not use tools at all.
-
--tool: optional array of ToolUnionDefinitions of tools that the model may use.
If you include
toolsin your API request, the model may returntool_usecontent blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model and then optionally return results back to the model usingtool_resultcontent blocks.There are two types of tools: client tools and server tools. The behavior described below applies to client tools. For server tools, see their individual documentation as each has its own behavior (e.g., the web search tool).
Each tool definition includes:
name: Name of the tool.description: Optional, but strongly-recommended description of the tool.input_schema: JSON schema for the toolinputshape that the model will produce intool_useoutput content blocks.
For example, if you defined
toolsas:[ { "name": "get_stock_price", "description": "Get the current stock price for a given ticker symbol.", "input_schema": { "type": "object", "properties": { "ticker": { "type": "string", "description": "The stock ticker symbol, e.g. AAPL for Apple Inc." } }, "required": ["ticker"] } } ]And then asked the model "What's the S&P 500 at today?", the model might produce
tool_usecontent blocks in the response like this:[ { "type": "tool_use", "id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV", "name": "get_stock_price", "input": { "ticker": "^GSPC" } } ]You might then run your
get_stock_pricetool with{"ticker": "^GSPC"}as an input, and return the following back to the model in a subsequentusermessage:[ { "type": "tool_result", "tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV", "content": "259.75 USD" } ]Tools can be used for workflows that include running client-side tools and functions, or more generally whenever you want the model to produce a particular JSON structure of output.
See our guide for more details.
-
--top-k: optional numberOnly sample from the top K options for each subsequent token.
Used to remove "long tail" low probability responses. Learn more technical details here.
Recommended for advanced use cases only. You usually only need to use
temperature. -
--top-p: optional numberUse nucleus sampling.
In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by
top_p. You should either altertemperatureortop_p, but not both.Recommended for advanced use cases only. You usually only need to use
temperature.
Returns
-
message: object { id, container, content, 7 more }-
id: stringUnique object identifier.
The format and length of IDs may change over time.
-
container: object { id, expires_at }Information about the container used in the request (for the code execution tool)
-
id: stringIdentifier for the container used in this request
-
expires_at: stringThe time at which the container will expire.
-
-
content: array of ContentBlockContent generated by the model.
This is an array of content blocks, each of which has a
typethat determines its shape.Example:
[{"type": "text", "text": "Hi, I'm Claude."}]If the request input
messagesended with anassistantturn, then the responsecontentwill continue directly from that last turn. You can use this to constrain the model's output.For example, if the input
messageswere:[ {"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"}, {"role": "assistant", "content": "The best answer is ("} ]Then the response
contentmight be:[{"type": "text", "text": "B)"}]-
text_block: object { citations, text, type }-
citations: array of TextCitationCitations supporting the text block.
The type of citation returned will depend on the type of document being cited. Citing a PDF results in
page_location, plain text results inchar_location, and content document results incontent_block_location.-
citation_char_location: object { cited_text, document_index, document_title, 4 more }-
cited_text: string -
document_index: number -
document_title: string -
end_char_index: number -
file_id: string -
start_char_index: number -
type: "char_location"
-
-
citation_page_location: object { cited_text, document_index, document_title, 4 more }-
cited_text: string -
document_index: number -
document_title: string -
end_page_number: number -
file_id: string -
start_page_number: number -
type: "page_location"
-
-
citation_content_block_location: object { cited_text, document_index, document_title, 4 more }-
cited_text: string -
document_index: number -
document_title: string -
end_block_index: number -
file_id: string -
start_block_index: number -
type: "content_block_location"
-
-
citations_web_search_result_location: object { cited_text, encrypted_index, title, 2 more }-
cited_text: string -
encrypted_index: string -
title: string -
type: "web_search_result_location" -
url: string
-
-
citations_search_result_location: object { cited_text, end_block_index, search_result_index, 4 more }-
cited_text: string -
end_block_index: number -
search_result_index: number -
source: string -
start_block_index: number -
title: string -
type: "search_result_location"
-
-
-
text: string -
type: "text"
-
-
thinking_block: object { signature, thinking, type }-
signature: string -
thinking: string -
type: "thinking"
-
-
redacted_thinking_block: object { data, type }-
data: string -
type: "redacted_thinking"
-
-
tool_use_block: object { id, caller, input, 2 more }-
id: string -
caller: DirectCaller or ServerToolCaller or ServerToolCaller20260120Tool invocation directly from the model.
-
direct_caller: object { type }Tool invocation directly from the model.
type: "direct"
-
server_tool_caller: object { tool_id, type }Tool invocation generated by a server-side tool.
-
tool_id: string -
type: "code_execution_20250825"
-
-
server_tool_caller_20260120: object { tool_id, type }-
tool_id: string -
type: "code_execution_20260120"
-
-
-
input: map[unknown] -
name: string -
type: "tool_use"
-
-
server_tool_use_block: object { id, caller, input, 2 more }-
id: string -
caller: DirectCaller or ServerToolCaller or ServerToolCaller20260120Tool invocation directly from the model.
-
direct_caller: object { type }Tool invocation directly from the model.
type: "direct"
-
server_tool_caller: object { tool_id, type }Tool invocation generated by a server-side tool.
-
tool_id: string -
type: "code_execution_20250825"
-
-
server_tool_caller_20260120: object { tool_id, type }-
tool_id: string -
type: "code_execution_20260120"
-
-
-
input: map[unknown] -
name: "web_search" or "web_fetch" or "code_execution" or 4 more-
"web_search" -
"web_fetch" -
"code_execution" -
"bash_code_execution" -
"text_editor_code_execution" -
"tool_search_tool_regex" -
"tool_search_tool_bm25"
-
-
type: "server_tool_use"
-
-
web_search_tool_result_block: object { caller, content, tool_use_id, type }-
caller: DirectCaller or ServerToolCaller or ServerToolCaller20260120Tool invocation directly from the model.
-
direct_caller: object { type }Tool invocation directly from the model.
type: "direct"
-
server_tool_caller: object { tool_id, type }Tool invocation generated by a server-side tool.
-
tool_id: string -
type: "code_execution_20250825"
-
-
server_tool_caller_20260120: object { tool_id, type }-
tool_id: string -
type: "code_execution_20260120"
-
-
-
content: WebSearchToolResultError or array of WebSearchResultBlock-
web_search_tool_result_error: object { error_code, type }-
error_code: "invalid_tool_input" or "unavailable" or "max_uses_exceeded" or 3 more-
"invalid_tool_input" -
"unavailable" -
"max_uses_exceeded" -
"too_many_requests" -
"query_too_long" -
"request_too_large"
-
-
type: "web_search_tool_result_error"
-
-
union_member_1: array of WebSearchResultBlock-
encrypted_content: string -
page_age: string -
title: string -
type: "web_search_result" -
url: string
-
-
-
tool_use_id: string -
type: "web_search_tool_result"
-
-
web_fetch_tool_result_block: object { caller, content, tool_use_id, type }-
caller: DirectCaller or ServerToolCaller or ServerToolCaller20260120Tool invocation directly from the model.
-
direct_caller: object { type }Tool invocation directly from the model.
type: "direct"
-
server_tool_caller: object { tool_id, type }Tool invocation generated by a server-side tool.
-
tool_id: string -
type: "code_execution_20250825"
-
-
server_tool_caller_20260120: object { tool_id, type }-
tool_id: string -
type: "code_execution_20260120"
-
-
-
content: WebFetchToolResultErrorBlock or WebFetchBlock-
web_fetch_tool_result_error_block: object { error_code, type }-
error_code: "invalid_tool_input" or "url_too_long" or "url_not_allowed" or 5 more-
"invalid_tool_input" -
"url_too_long" -
"url_not_allowed" -
"url_not_accessible" -
"unsupported_content_type" -
"too_many_requests" -
"max_uses_exceeded" -
"unavailable"
-
-
type: "web_fetch_tool_result_error"
-
-
web_fetch_block: object { content, retrieved_at, type, url }-
content: object { citations, source, title, type }-
citations: object { enabled }Citation configuration for the document
enabled: boolean
-
source: Base64PDFSource or PlainTextSource-
base64_pdf_source: object { data, media_type, type }-
data: string -
media_type: "application/pdf" -
type: "base64"
-
-
plain_text_source: object { data, media_type, type }-
data: string -
media_type: "text/plain" -
type: "text"
-
-
-
title: stringThe title of the document
-
type: "document"
-
-
retrieved_at: stringISO 8601 timestamp when the content was retrieved
-
type: "web_fetch_result" -
url: stringFetched content URL
-
-
-
tool_use_id: string -
type: "web_fetch_tool_result"
-
-
code_execution_tool_result_block: object { content, tool_use_id, type }-
content: CodeExecutionToolResultError or CodeExecutionResultBlock or EncryptedCodeExecutionResultBlockCode execution result with encrypted stdout for PFC + web_search results.
-
code_execution_tool_result_error: object { error_code, type }-
error_code: "invalid_tool_input" or "unavailable" or "too_many_requests" or "execution_time_exceeded"-
"invalid_tool_input" -
"unavailable" -
"too_many_requests" -
"execution_time_exceeded"
-
-
type: "code_execution_tool_result_error"
-
-
code_execution_result_block: object { content, return_code, stderr, 2 more }-
content: array of CodeExecutionOutputBlock-
file_id: string -
type: "code_execution_output"
-
-
return_code: number -
stderr: string -
stdout: string -
type: "code_execution_result"
-
-
encrypted_code_execution_result_block: object { content, encrypted_stdout, return_code, 2 more }Code execution result with encrypted stdout for PFC + web_search results.
-
content: array of CodeExecutionOutputBlock-
file_id: string -
type: "code_execution_output"
-
-
encrypted_stdout: string -
return_code: number -
stderr: string -
type: "encrypted_code_execution_result"
-
-
-
tool_use_id: string -
type: "code_execution_tool_result"
-
-
bash_code_execution_tool_result_block: object { content, tool_use_id, type }-
content: BashCodeExecutionToolResultError or BashCodeExecutionResultBlock-
bash_code_execution_tool_result_error: object { error_code, type }-
error_code: "invalid_tool_input" or "unavailable" or "too_many_requests" or 2 more-
"invalid_tool_input" -
"unavailable" -
"too_many_requests" -
"execution_time_exceeded" -
"output_file_too_large"
-
-
type: "bash_code_execution_tool_result_error"
-
-
bash_code_execution_result_block: object { content, return_code, stderr, 2 more }-
content: array of BashCodeExecutionOutputBlock-
file_id: string -
type: "bash_code_execution_output"
-
-
return_code: number -
stderr: string -
stdout: string -
type: "bash_code_execution_result"
-
-
-
tool_use_id: string -
type: "bash_code_execution_tool_result"
-
-
text_editor_code_execution_tool_result_block: object { content, tool_use_id, type }-
content: TextEditorCodeExecutionToolResultError or TextEditorCodeExecutionViewResultBlock or TextEditorCodeExecutionCreateResultBlock or TextEditorCodeExecutionStrReplaceResultBlock-
text_editor_code_execution_tool_result_error: object { error_code, error_message, type }-
error_code: "invalid_tool_input" or "unavailable" or "too_many_requests" or 2 more-
"invalid_tool_input" -
"unavailable" -
"too_many_requests" -
"execution_time_exceeded" -
"file_not_found"
-
-
error_message: string -
type: "text_editor_code_execution_tool_result_error"
-
-
text_editor_code_execution_view_result_block: object { content, file_type, num_lines, 3 more }-
content: string -
file_type: "text" or "image" or "pdf"-
"text" -
"image" -
"pdf"
-
-
num_lines: number -
start_line: number -
total_lines: number -
type: "text_editor_code_execution_view_result"
-
-
text_editor_code_execution_create_result_block: object { is_file_update, type }-
is_file_update: boolean -
type: "text_editor_code_execution_create_result"
-
-
text_editor_code_execution_str_replace_result_block: object { lines, new_lines, new_start, 3 more }-
lines: array of string -
new_lines: number -
new_start: number -
old_lines: number -
old_start: number -
type: "text_editor_code_execution_str_replace_result"
-
-
-
tool_use_id: string -
type: "text_editor_code_execution_tool_result"
-
-
tool_search_tool_result_block: object { content, tool_use_id, type }-
content: ToolSearchToolResultError or ToolSearchToolSearchResultBlock-
tool_search_tool_result_error: object { error_code, error_message, type }-
error_code: "invalid_tool_input" or "unavailable" or "too_many_requests" or "execution_time_exceeded"-
"invalid_tool_input" -
"unavailable" -
"too_many_requests" -
"execution_time_exceeded"
-
-
error_message: string -
type: "tool_search_tool_result_error"
-
-
tool_search_tool_search_result_block: object { tool_references, type }-
tool_references: array of ToolReferenceBlock-
tool_name: string -
type: "tool_reference"
-
-
type: "tool_search_tool_search_result"
-
-
-
tool_use_id: string -
type: "tool_search_tool_result"
-
-
container_upload_block: object { file_id, type }Response model for a file uploaded to the container.
-
file_id: string -
type: "container_upload"
-
-
-
model: "claude-opus-4-7" or "claude-mythos-preview" or "claude-opus-4-6" or 14 more or stringThe model that will complete your prompt.
See models for additional details and options.
-
"claude-opus-4-7"Frontier intelligence for long-running agents and coding
-
"claude-mythos-preview"New class of intelligence, strongest in coding and cybersecurity
-
"claude-opus-4-6"Frontier intelligence for long-running agents and coding
-
"claude-sonnet-4-6"Best combination of speed and intelligence
-
"claude-haiku-4-5"Fastest model with near-frontier intelligence
-
"claude-haiku-4-5-20251001"Fastest model with near-frontier intelligence
-
"claude-opus-4-5"Premium model combining maximum intelligence with practical performance
-
"claude-opus-4-5-20251101"Premium model combining maximum intelligence with practical performance
-
"claude-sonnet-4-5"High-performance model for agents and coding
-
"claude-sonnet-4-5-20250929"High-performance model for agents and coding
-
"claude-opus-4-1"Exceptional model for specialized complex tasks
-
"claude-opus-4-1-20250805"Exceptional model for specialized complex tasks
-
"claude-opus-4-0"Powerful model for complex tasks
-
"claude-opus-4-20250514"Powerful model for complex tasks
-
"claude-sonnet-4-0"High-performance model with extended thinking
-
"claude-sonnet-4-20250514"High-performance model with extended thinking
-
"claude-3-haiku-20240307"Fast and cost-effective model
-
-
role: "assistant"Conversational role of the generated message.
This will always be
"assistant". -
stop_details: object { category, explanation, type }Structured information about a refusal.
-
category: "cyber" or "bio"The policy category that triggered the refusal.
nullwhen the refusal doesn't map to a named category.-
"cyber" -
"bio"
-
-
explanation: stringHuman-readable explanation of the refusal.
This text is not guaranteed to be stable.
nullwhen no explanation is available for the category. -
type: "refusal"
-
-
stop_reason: "end_turn" or "max_tokens" or "stop_sequence" or 3 moreThe reason that we stopped.
This may be one the following values:
"end_turn": the model reached a natural stopping point"max_tokens": we exceeded the requestedmax_tokensor the model's maximum"stop_sequence": one of your provided customstop_sequenceswas generated"tool_use": the model invoked one or more tools"pause_turn": we paused a long-running turn. You may provide the response back as-is in a subsequent request to let the model continue."refusal": when streaming classifiers intervene to handle potential policy violations
In non-streaming mode this value is always non-null. In streaming mode, it is null in the
message_startevent and non-null otherwise.-
"end_turn" -
"max_tokens" -
"stop_sequence" -
"tool_use" -
"pause_turn" -
"refusal"
-
stop_sequence: stringWhich custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
-
type: "message"Object type.
For Messages, this is always
"message". -
usage: object { cache_creation, cache_creation_input_tokens, cache_read_input_tokens, 5 more }Billing and rate-limit usage.
Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.
Under the hood, the API transforms requests into a format suitable for the model. The model's output then goes through a parsing stage before becoming an API response. As a result, the token counts in
usagewill not match one-to-one with the exact visible content of an API request or response.For example,
output_tokenswill be non-zero, even for an empty string response from Claude.Total input tokens in a request is the summation of
input_tokens,cache_creation_input_tokens, andcache_read_input_tokens.-
cache_creation: object { ephemeral_1h_input_tokens, ephemeral_5m_input_tokens }Breakdown of cached tokens by TTL
-
ephemeral_1h_input_tokens: numberThe number of input tokens used to create the 1 hour cache entry.
-
ephemeral_5m_input_tokens: numberThe number of input tokens used to create the 5 minute cache entry.
-
-
cache_creation_input_tokens: numberThe number of input tokens used to create the cache entry.
-
cache_read_input_tokens: numberThe number of input tokens read from the cache.
-
inference_geo: stringThe geographic region where inference was performed for this request.
-
input_tokens: numberThe number of input tokens which were used.
-
output_tokens: numberThe number of output tokens which were used.
-
server_tool_use: object { web_fetch_requests, web_search_requests }The number of server tool requests.
-
web_fetch_requests: numberThe number of web fetch tool requests.
-
web_search_requests: numberThe number of web search tool requests.
-
-
service_tier: "standard" or "priority" or "batch"If the request used the priority, standard, or batch tier.
-
"standard" -
"priority" -
"batch"
-
-
-
Example
ant messages create \
--api-key my-anthropic-api-key \
--max-tokens 1024 \
--message '{content: [{text: x, type: text}], role: user}' \
--model claude-opus-4-6