Model configuration in Elastic Agent Builder

Elastic Agent Builder uses large language models (LLMs) to power agent reasoning and decision-making.

For Elastic Cloud Serverless projects and Elastic Cloud Hosted deployments, Elastic Agent Builder uses Elastic Managed LLMs running on the Elastic Inference Service (EIS). This managed service requires zero setup.

Default model configuration

You can get started with zero setup using Elastic Managed LLMs. These are built-in LLMs running on the Elastic Inference Service (EIS). This managed service requires no additional API key management.

Note

Learn more about Elastic Managed LLMs and pricing.

These deployments do not include a preconfigured connector. To use Elastic Agent Builder, you have two options:

Switch models in the UI

Use the model selector in the chat interface to switch between available models. The selector displays all configured models, including preconfigured models (on Elastic Cloud Hosted and Elastic Cloud Serverless) and any custom connectors you set up.

To learn more, refer to select a different model.

Model selector dropdown in the chat interface

Change the default model

To change which model is used by default:

Search for GenAI Settings in the global search field.
Select your preferred connector from the Default AI Connector dropdown.
Save your changes.

Use additional models

To use additional models that aren't preconfigured, create a connector for your model provider.

Create a connector in the UI

To create a new connector:

Find connectors under Alerts and Insights / Connectors in the global search bar.
Select Create Connector and select your model provider.
Configure the connector with your API credentials and preferred model.
Expand Additional settings and select chat_completion as the task type.

×

Tip

For detailed instructions on creating connectors, refer to Connectors.

To learn about preconfigured connectors, refer to preconfigured connectors.

Create connectors with the API

To create connectors programmatically, refer to the Connectors API documentation.

Connect a local LLM

You can connect a locally hosted LLM to Elastic using the OpenAI connector. This requires your local LLM to be compatible with the OpenAI API format.

For detailed setup instructions, refer to the OpenAI connector documentation.

Model requirements

Elastic Agent Builder requires models with strong reasoning and tool-calling capabilities. State-of-the-art models perform significantly better than smaller or older models.

Agent Builder relies on advanced LLM capabilities including:

Function calling: Models must accurately select appropriate tools and construct valid parameters from natural language requests.
Multi-step reasoning: Agents need to plan, execute, and adapt based on tool results across multiple iterations.
Structured output: Models must produce properly formatted responses that the agent framework can parse.

While Elastic offers LLM connectors for many different vendors and models, not all LLMs are robust enough to be used with Elastic Agent Builder.

Recommended models

The following models are known to work well with Elastic Agent Builder. These categories represent a spectrum from maximum reasoning capability to maximum throughput. Choose based on your latency, cost, and complexity requirements.

Category	Model examples	Use cases	Trade-offs
Extended reasoning	- Gemini 3 Pro - Claude 4.5 Opus	Open-ended exploration, multi-step planning, and complex analysis	Higher latency and cost; best for latency-insensitive, batch, or async workflows
Balanced performance	- GPT-5.2 - Claude 4.5 Sonnet	General-purpose agents requiring reliable tool orchestration and data retrieval and synthesis	Moderate cost; suitable for real-time and interactive use
High throughput	GPT-OSS-120B	Latency-sensitive pipelines and high-concurrency scenarios with well-scoped tasks	Lower reasoning depth; smaller context window; ideal for air-gapped deployments

Tip

For agents working with large documents or conversation histories, consider models with extended context windows. For example, Claude 4.5 Sonnet and Gemini 3 Pro support up to 1M tokens. Check your model provider's documentation for specific context limits.

Incompatible models

Smaller or less capable models may produce errors like:

Error: Invalid function call syntax

Error executing agent: No tool calls found in the response.

While any chat-completion-compatible connector can technically be configured, we strongly recommend using state-of-the-art models for reliable agent performance.

Note

Smaller or "mini" model variants are not recommended for Elastic Agent Builder as they lack the necessary capabilities for reliable agent workflows.

Limitations and known issues: Current limitations around model selection
Get started: Initial setup and configuration
Connectors: Detailed connector configuration guide