
Scale Kibana for your traffic workload

Applies to: Elastic Cloud Hosted (ECH), Elastic Cloud on Kubernetes (ECK), Elastic Cloud Enterprise (ECE), Self-Managed

Important

This guidance does not apply to scaling Kibana for task manager. If you intend to optimize Kibana for alerting capabilities, see Kibana task manager: performance and scaling guide.

Kibana's HTTP traffic is diverse and can be unpredictable. It includes serving static assets, processing large search responses from Elasticsearch, and managing CRUD operations against complex domain objects like SLOs. The load created by each kind of traffic varies with your usage patterns. While difficult to predict, there are two important aspects to consider when provisioning CPU and memory resources for your Kibana instances:

  • Concurrency: How many users you expect to be interacting with Kibana simultaneously. Concurrency performance is largely CPU-bound. Approaching this limit increases response times.
  • Request and response size: The size of requests and responses you expect Kibana to service. Performance when managing large requests and responses is largely memory-bound. Approaching this limit increases response times and may cause Kibana to crash.
Tip

On Elastic Cloud Serverless, scaling Kibana is fully managed for you.

CPU and memory boundedness often interact in important ways. If CPU-bound activity is reaching its limit, memory pressure will likely increase as Kibana has less time for activities like garbage collection. If memory-bound activity is reaching its limit, there may be more CPU work to free claimed memory, increasing CPU pressure. Tracking CPU and memory metrics over time can be very useful for understanding where your Kibana is experiencing a bottleneck.

Note

Traffic to Kibana often comes in short bursts or spikes that can overwhelm an underprovisioned Kibana instance. In production environments, an overwhelmed Kibana instance will typically return 502 or 503 error responses.

Load balancing helps to mitigate traffic spikes by horizontally scaling your Kibana deployments and improving Kibana's availability. To learn more about load balancing, refer to High Availability and load balancing in Kibana.

Elasticsearch is the search engine and backing database of Kibana. Any performance issues in Elasticsearch will manifest in Kibana. Additionally, while Elastic tries to mitigate this possibility, Kibana may be sending requests to Elasticsearch that degrade performance if Elasticsearch is underprovisioned.

Follow the production guidance for Elasticsearch.

In user interfaces like Dashboards or Discover, you can view the full query that Kibana is sending to Elasticsearch. This is a good way to get an idea of the volume of data and work a Kibana visualization or dashboard is creating for Elasticsearch. Dashboards with many visualizations will generate higher load for Elasticsearch and Kibana.

Follow this strategy if you know the maximum number of expected concurrent users.

Start Kibana on 1 vCPU and 2GB of memory. This should comfortably serve a set of 10 concurrent users performing analytics activities like browsing dashboards.

If you are experiencing performance issues, you can scale Kibana vertically by adding the following resources for every 10 additional concurrent users:

  • 1 vCPU
  • 2GB of memory

These amounts are a safe minimum to ensure that Kibana is not resource-starved for common analytics use cases.

It is recommended to scale a single Kibana instance vertically to a maximum of 4 vCPU and 8GB of memory, which preserves the 1 vCPU to 2GB ratio.
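The rule of thumb above, 1 vCPU and 2GB of memory for every 10 concurrent users, can be sketched as a small calculation. This is an illustrative helper, not part of Kibana:

```python
import math

def kibana_sizing(concurrent_users: int) -> tuple[int, int]:
    """Estimate minimum (vCPU, memory in GB) for a single Kibana instance,
    using the rule of thumb of 1 vCPU and 2GB per 10 concurrent users."""
    # Always provision at least the 1 vCPU / 2GB baseline.
    blocks = max(1, math.ceil(concurrent_users / 10))
    return blocks, 2 * blocks

# 50 users -> (5, 10), 100 users -> (10, 20), matching the sizing table below.
```

Remember that this linear rule is a safe minimum for common analytics use cases, not a guarantee; monitor actual resource usage and adjust.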

You should also combine vertical scaling with horizontal scaling to handle greater concurrency or bursty traffic. Refer to High Availability and load balancing in Kibana for guidance.

Concurrent users: 50. Minimum: 5 vCPU and 10GB of memory. ECH and ECE deployment examples:

  • Kibana size per zone of 16GB RAM and 8 vCPU in 1 availability zone (creates 2 x 8GB nodes)
  • Kibana size per zone of 8GB RAM and up to 8 vCPU across 2 availability zones
  • Kibana size per zone of 4GB RAM and up to 8 vCPU across 3 availability zones

Concurrent users: 100. Minimum: 10 vCPU and 20GB of memory. ECH and ECE deployment examples:

  • Kibana size per zone of 24GB RAM and 12 vCPU in 1 availability zone (creates 3 x 8GB nodes)
  • Kibana size per zone of 8GB RAM and up to 8 vCPU across 3 availability zones

Refer to the guidance on adjusting Kibana's allocated resources once you have determined sizing.

Building on the simple strategy outlined above, you can more precisely identify where Kibana is resource-constrained. Self-managed and Elastic Cloud on Kubernetes users manage CPU and memory allocations independently and can further tailor resources based on performance metrics.

In order to understand the impact of your usage patterns on a single Kibana instance, use the stack monitoring feature.

Using stack monitoring, you can gather the following metrics for your Kibana instance:

  • Event loop delay (ELD) in milliseconds: A Node.js concept that roughly translates to the number of milliseconds by which processing of events is delayed due to CPU-intensive activities.
  • Heap size in bytes: The number of bytes currently held in memory dedicated to Kibana's heap space.
  • HTTP connections: The number of sockets that the Kibana server has open.

Event loop delay (ELD) is an important metric for understanding whether Kibana is engaged in CPU-bound activity.

As a general target, ELD should be below ~200ms 95% of the time. Higher delays may mean Kibana is CPU-starved. Sporadic increases above 200ms may mean that Kibana is periodically processing CPU-intensive activities like large responses from Elasticsearch, whereas consistently high ELD may mean Kibana is struggling to service tasks and requests.
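A minimal sketch of how collected ELD samples could be checked against such a target. The function, threshold default, and quantile handling are illustrative, not part of Kibana or stack monitoring:

```python
def eld_within_target(samples_ms: list[float],
                      threshold_ms: float = 200.0,
                      quantile: float = 0.95) -> bool:
    """Return True if at least `quantile` of event loop delay samples
    fall below `threshold_ms`."""
    if not samples_ms:
        return True  # no data: nothing to flag
    below = sum(1 for s in samples_ms if s < threshold_ms)
    return below / len(samples_ms) >= quantile
```

Under this check, a series with occasional spikes above 200ms can still be within target, while consistently high readings will not, which mirrors the sporadic-versus-consistent distinction above.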

Before increasing CPU resources, consider the impact of ELD on user experience. If users can work in Kibana without the frustration caused by a blocked event loop, provisioning additional CPU resources will have little impact, although having spare headroom for unexpected spikes is useful.

Monitoring Kibana's ELD over time is a solid strategy for knowing when additional CPU resource is needed based on your usage patterns.

Refer to the guidance on adjusting Kibana's allocated resources once you have determined vCPU sizing.

Heap size is an important metric to track. If Kibana's heap size grows beyond the heap limit, Kibana will crash. By monitoring heap size, you can help ensure that Kibana has enough memory available.

Self-managed users must provision memory on the host that Kibana runs on, as well as configure the allocated heap. See the guidance on configuring Kibana memory.
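One way to set the heap limit on a self-managed installation is through Kibana's `config/node.options` file, which accepts Node.js command-line flags. A minimal sketch, with an illustrative 3GB (3072MB) limit that you should size for your own host:

```
## config/node.options
## Allow Kibana's Node.js heap to grow to 3072MB.
--max-old-space-size=3072
```

Kibana reads this file at startup, so a restart is required for the change to take effect.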

Refer to the guidance on adjusting Kibana's allocated resources once you have determined memory sizing.

The way that you alter the resources allocated to your Kibana instance depends on your deployment type.

Note

For Elastic Cloud on Kubernetes and self-managed deployments, a common Node.js guideline is to allocate 80% of available host memory to the heap, assuming that Kibana is the only server process running on the (virtual) host. The remaining memory is then available for other activities, for example allocating HTTP sockets.
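As an illustrative sketch of that guideline (the helper name and the 80% default mirror the note above, but are not part of any Kibana API):

```python
def suggested_heap_mb(host_memory_mb: int, fraction: float = 0.8) -> int:
    """Suggested Node.js heap limit in MB: a fraction (default 80%) of
    host memory, leaving the remainder for other needs such as HTTP
    socket buffers."""
    return int(host_memory_mb * fraction)

# For example, an 8GB (8192MB) host yields a 6553MB heap limit.
```

The result would then be applied as the heap limit, for example via Node.js's `--max-old-space-size` flag.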