IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Fleet and Elastic Agent 8.18.7

Known issues

Failed upgrades leave Elastic Agent stuck until restart

This known issue applies to Elastic Agent 8.18.7 and 9.0.7. Elastic Agent versions 8.19.x and 9.1.x are not affected.

On September 17, 2025, a known issue was discovered that can cause Elastic Agent upgrades to get stuck if an upgrade attempt fails under specific conditions. This happens because the coordinator’s overrideState remains set, leaving the agent in a state that appears to be upgrading.

Conditions

This issue is triggered if the upgrade fails during one of the early checks inside Coordinator.Upgrade, for example:

The agent is not upgradeable
Capabilities check denies the upgrade
When Elastic Agent is tamper-protected, Endpoint must validate that the upgrade action was correctly signed by Kibana to allow the upgrade. If the signature is missing, invalid, or the connection between Elastic Agent and Endpoint was interrupted, the validation fails. This causes the agent coordinator’s override state to become stuck until the agent is restarted.

Symptoms

Fleet shows the upgrade action in progress, even though the upgrade remains stuck
No further upgrade attempts succeed
Elastic Agent status shows an override state indicating upgrade

Workaround

Restart the Elastic Agent to clear the coordinator’s overrideState and allow new upgrade attempts to proceed.

Resolution

This issue was fixed in #9992, which ensures that the coordinator clears its override state whenever an early failure occurs.

The fix is included in versions 9.1.4 and 8.19.4, and planned for versions 9.0.8 and 8.18.8.
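
The failure mode and the fix can be illustrated with a minimal Python sketch. The class, method, and attribute names below (Coordinator, upgrade_buggy, upgrade_fixed, override_state) are hypothetical stand-ins modeled on the description above. They are not the actual Elastic Agent code, which is written in Go.

```python
class UpgradeError(Exception):
    """Raised when one of the early upgrade checks fails."""


class Coordinator:
    """Hypothetical, simplified stand-in for the agent coordinator."""

    def __init__(self, upgradeable=True, capability_allows=True):
        self.upgradeable = upgradeable
        self.capability_allows = capability_allows
        self.override_state = None  # None means "report the real state"

    def _early_checks(self):
        # Two of the early checks named in the Conditions list above.
        if not self.upgradeable:
            raise UpgradeError("agent is not upgradeable")
        if not self.capability_allows:
            raise UpgradeError("capabilities check denied the upgrade")

    def upgrade_buggy(self):
        # The override state is set first; an early failure leaves it set.
        self.override_state = "UPGRADING"
        self._early_checks()
        # ... the real upgrade would continue here ...

    def upgrade_fixed(self):
        self.override_state = "UPGRADING"
        try:
            self._early_checks()
        except UpgradeError:
            # Clear the override on early failure so later attempts can proceed.
            self.override_state = None
            raise
        # ... the real upgrade would continue here ...
```

In this sketch, a failed call to upgrade_buggy leaves override_state at "UPGRADING" until the object is recreated (the analogue of restarting the agent), while upgrade_fixed resets it before re-raising the error.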

fleet-agents template is missing mappings

Details

On May 2, 2025, a known issue was discovered in which the .fleet-agents index template was missing a mapping for the local_metadata.complete attribute. This may cause agent check-ins to be rejected and the agents to appear offline.

In the Fleet Server logs, this appears as:

elastic fail 400: document_parsing_exception: [1:209] object mapping for [local_metadata] tried to parse field [local_metadata] as object, but found a concrete value
Eat bulk checkin error; Keep on truckin'

And in the Elastic Agent logs it will appear as:

"log.level":"error","@timestamp":"2025-04-22:12:35:25.295Z","message":"Eat bulk checkin error; Keep on truckin'","component":{"binary":"fleet-server","dataset":"elastic_agent.fleet_server","id":"fleet-server-es-containerhost","type":"fleet-server"},"log":{"source":"fleet-server-es-containerhost"},"service.type":"fleet-server","error.message":"elastic fail 400: document_parsing_exception: [1:209] object mapping for [local_metadata] tried to parse field [local_metadata] as object, but found a concrete value","ecs.version":"1.6.0","service.name":"fleet-server","ecs.version":"1.6.0"
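
To check collected logs for this error, a small scanner along the following lines may help. This is a sketch only: the function name find_mapping_errors and the regular expression are assumptions derived from the error text quoted above, and messages for other fields or other versions may be worded differently.

```python
import re

# Pattern derived from the error text quoted above (an assumption, not an
# official log format specification).
MAPPING_ERROR = re.compile(
    r"document_parsing_exception: .*?object mapping for \[(?P<field>[^\]]+)\]"
)


def find_mapping_errors(lines):
    """Return the field names named in document_parsing_exception errors."""
    fields = []
    for line in lines:
        match = MAPPING_ERROR.search(line)
        if match:
            fields.append(match.group("field"))
    return fields
```

Passing the lines of a Fleet Server or Elastic Agent log file to find_mapping_errors returns one entry per matching line, so an empty list suggests this particular error is not present in that file.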

This attribute was added to the template in versions 8.17.11, 8.18.3, and 8.19.3.

Further investigation revealed that the .fleet-agents index template was not correctly applied due to an unchanged _meta.managed_index_mappings_version number. This also affects other attributes, such as upgrade_attempts, namespaces, unprivileged, and unhealthy_reason. If there is an error related to any of these attributes, a similar error message appears in the logs.

Impact

Updating to a version with a fixed _meta.managed_index_mappings_version will correctly apply the new index template. The fixed versions are 8.18.8, 8.19.4, 9.0.8, and 9.1.4.
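
When auditing a deployment, the fixed versions listed above can be encoded in a small helper. The sketch below is illustrative: the names FIXED_VERSIONS, parse_version, and has_template_fix are hypothetical, and the helper deliberately returns None for release lines that are not listed above rather than guessing.

```python
# Fixed versions listed above, keyed by (major, minor) release line.
FIXED_VERSIONS = {
    (8, 18): (8, 18, 8),
    (8, 19): (8, 19, 4),
    (9, 0): (9, 0, 8),
    (9, 1): (9, 1, 4),
}


def parse_version(text):
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def has_template_fix(version_text):
    """Return True or False for listed release lines, None otherwise."""
    version = parse_version(version_text)
    fixed = FIXED_VERSIONS.get(version[:2])
    if fixed is None:
        return None  # release line not covered by the list above
    return version >= fixed
```

For example, has_template_fix("8.18.7") evaluates to False and has_template_fix("8.18.8") to True.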

New features and enhancements

Elastic Agent

Bump kube-stack Helm Chart to 0.9.1 and enable the cluster collector. #9535
Enhanced loggers for easier debugging of upgrade related issues. #9536

Bug fixes

Elastic Agent

Redact secrets from pre-config, computed-config, components-expected, and components-actual files in diagnostics archive. #9560
Retry service start command upon failure with 30-second delay. #9313
Fix reporting of scheduled upgrade details across restarts and cancels. #9562 #8778
Enable root user to re-enroll unprivileged agent on macOS and Linux. #9603 #8544
Fix missing liveness healthcheck during container enrollment. #9612 #9611
Enable admin user to re-enroll unprivileged agent on Windows. #9623 #8544
Treat exit code 284 from Endpoint binary as non-fatal. #9687
Ensure failed upgrade actions are removed from queue and details are set. #9634 #9629

Fleet Server

Restore connection limiter. #5372

Restore the connection-level limiter to prevent OOM incidents. This limiter is used in addition to the request-level throttle: once in-flight requests reach max_connections, a 429 is returned, but if the total number of connections the server uses exceeds max_connections*1.1, the server drops the connection before the TLS handshake.
Build fleet-server as fully static binary to restore OS matrix compatibility. #5392 #5262
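
The two thresholds described for the restored connection limiter can be sketched as a single decision function. This is an illustration, not Fleet Server code: the function name classify and the exact boundary handling (whether the counts include the incoming connection, and the use of >= versus >) are assumptions.

```python
def classify(in_flight_requests, total_connections, max_connections):
    """Illustrative decision for an incoming connection or request.

    Returns "drop" when the connection-level limit is exceeded (before the
    TLS handshake), "429" when the request-level throttle applies, and
    "accept" otherwise.
    """
    # Connection-level limiter: total connections over max_connections * 1.1.
    if total_connections > max_connections * 1.1:
        return "drop"
    # Request-level throttle: in-flight requests have reached max_connections.
    if in_flight_requests >= max_connections:
        return "429"
    return "accept"
```

With max_connections set to 100, the sketch accepts a request at 50 in-flight requests, answers 429 at 100 in-flight requests and 105 connections, and drops the connection once 120 connections are open.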
