Missing or incomplete traces due to Collector sampling
If traces or spans are missing in Kibana, the issue might be related to the Collector’s sampling configuration.
Tail-based sampling (TBS) allows the Collector to evaluate entire traces before deciding whether to keep them. If TBS policies are too strict or not aligned with your workloads, traces you expect to see may be dropped.
Both Collector-based and SDK-level sampling can lead to gaps in telemetry if not configured correctly. See Missing or incomplete traces due to SDK sampling for more information.
When Collector-based tail sampling is misconfigured or too restrictive, you might observe the following:
- Only a small subset of traces reaches Elasticsearch/Kibana, even though SDKs are exporting spans.
- Error traces are missing because they’re not explicitly included in the `sampling_policy`.
- Collector logs show dropped spans.
The following conditions can lead to missing or incomplete traces when using tail-based sampling in the Collector:
- Tail sampling policies in the Collector are too narrow or restrictive.
- The default rule set excludes key transaction types (for example, long-running requests or non-error transactions).
- Head sampling in the SDK drops traces before they reach the Collector, which leaves fewer traces available for tail sampling to evaluate.
- Conflicting or overlapping `sampling_policy` rules might result in unexpected drops.
- High load: the Collector might drop traces if it can’t evaluate policies fast enough.
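For the high-load case, the settings that bound how many traces the Collector holds while waiting for a decision can be sketched as follows. This is a minimal fragment that assumes the upstream OpenTelemetry Collector Contrib `tail_sampling` processor. The values are illustrative rather than recommendations, and the option names should be checked against the Collector distribution you run.

```yaml
processors:
  tail_sampling:
    # How long to wait after the first span of a trace arrives before
    # making a sampling decision (upstream default is 30s).
    decision_wait: 30s
    # Upper bound on traces kept in memory while awaiting a decision.
    # If traffic exceeds this, the oldest traces can be evicted before
    # they are evaluated (illustrative value).
    num_traces: 100000
    # Sizing hint for internal data structures (illustrative value).
    expected_new_traces_per_sec: 1000
```

If traces are being evicted under load, raising `num_traces` or shortening `decision_wait` trades memory for completeness, so it helps to watch Collector memory usage while changing either one.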
Follow these steps to resolve sampling configuration issues:
- Review `sampling_policy` configuration
  - Check the `processor/tailsampling` section of your Collector configuration
  - Ensure policies are broad enough to capture the traces you need
- Add explicit rules for critical traces
  - Create specific rules for important trace types
  - Example: keep all error traces, 100% of login requests, and 10% of everything else
  - Use attributes like `status_code`, `operation`, or `service.name` to fine-tune rules
- Validate Collector logs
  - Review Collector logs for messages about dropped spans, and determine whether drops are due to sampling policy outcomes or resource limits
- Differentiate head and tail sampling
  - Check whether the SDKs already apply head sampling, which reduces the traces available for tail sampling in the Collector
  - Consider setting SDKs to `always_on` and managing sampling centrally in the Collector for more flexibility
- Test in staging
  - Adjust sampling policies incrementally in a staging environment
  - Monitor trace volume before and after changes
  - Validate that critical traces are captured as expected
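The example rule set from the steps above (all error traces, 100% of login requests, and 10% of everything else) could look roughly like the following. This is a sketch, not a definitive configuration. It assumes the upstream OpenTelemetry Collector Contrib `tail_sampling` processor, where rules are listed under `policies`. The policy names, the `http.route` attribute key, and the `/login` value are hypothetical and need to match what your services actually emit. The processor must also be added to your traces pipeline, which is omitted here.

```yaml
processors:
  tail_sampling:
    # Wait time before deciding; illustrative value.
    decision_wait: 10s
    policies:
      # Keep every trace that contains a span with an error status.
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Keep every trace for the login route.
      # The attribute key and value are assumptions for illustration.
      - name: keep-login
        type: string_attribute
        string_attribute:
          key: http.route
          values: ["/login"]
      # Keep roughly 10% of all traces, which covers everything
      # not already matched by the rules above.
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

service:
  telemetry:
    logs:
      # Temporarily raise verbosity while validating sampling behavior.
      level: debug
```

In the upstream processor, a trace is generally kept when at least one policy decides to sample it, so the broad probabilistic rule does not override the two explicit rules. Return the log level to its previous value once validation in staging is complete, because debug logging adds overhead.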