Commit ddc5592

[Internal][Security Solution][Detection Engine]: auto inject metadata _id in rule ES|QL query (#5800)
## Summary

Fixes #5462 by documenting automatic `METADATA _id` handling for non-aggregating ES|QL detection rules.

### Previews

- [Alert deduplication and _id metadata](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/5800/solutions/security/detect-and-alert/esql#esql-alert-deduplication): Added that in 9.4+, users don’t need `METADATA _id` in the editor for deduplication, that the saved query stays as typed (including pastes from Discover or AI tools), and that a missing `_id` in the results surfaces a non-blocking editor warning, with **Save with errors** and possible duplicate alerts until the query is fixed.
- [Non-aggregating example](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/5800/solutions/security/detect-and-alert/esql#esql-example-non-aggregating): Updated so the sample rule query omits `METADATA _id` in `FROM`.
- [Troubleshoot detection rules | ESQL rules](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/5800/troubleshoot/security/detection-rules#esql-rules-ts): Updated table description to describe the auto-injection of `METADATA _id`.

## Generative AI disclosure

1. Did you use a generative AI (GenAI) tool to assist in creating this contribution?
   - [x] Yes
   - [ ] No
2. Tool(s) and model(s) used: Cursor + Composer
1 parent a26e571 commit ddc5592

5 files changed

Lines changed: 44 additions & 5 deletions

solutions/security/detect-and-alert/custom-query.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ description: Create rules using KQL or Lucene queries to detect known field valu
 
 # Custom query rules [custom-query-rule-type]
 
+## Overview
+
 Custom query rules search your {{es}} indices using a KQL or Lucene query and generate an alert whenever one or more documents match.
 
 ### When to use a custom query rule

solutions/security/detect-and-alert/eql.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ description: Create detection rules using Event Query Language (EQL) to detect e
 
 # Event correlation (EQL) rules [eql-rule-type]
 
+## Overview
+
 Event correlation rules use [Event Query Language (EQL)](elasticsearch://reference/query-languages/eql/eql-syntax.md) to detect ordered sequences of events, single events with complex conditions, or the absence of expected events. EQL is purpose-built for event-based data and excels at expressing time-ordered relationships that other query languages cannot.
 
 ### When to use an EQL rule

solutions/security/detect-and-alert/esql.md

Lines changed: 27 additions & 5 deletions
@@ -11,6 +11,8 @@ description: Create detection rules using Elasticsearch Query Language (ESQL) wi
 
 # {{esql}} rules [esql-rule-type]
 
+## Overview
+
 {{esql}} rules use [{{es}} Query Language ({{esql}})](elasticsearch://reference/query-languages/esql.md) to query source events and aggregate or transform data using a pipeline syntax. Query results are returned as a table where each row becomes an alert. {{esql}} rules combine the flexibility of a full query pipeline with the detection capabilities of {{elastic-sec}}.
 
 ### When to use an {{esql}} rule
@@ -31,6 +33,26 @@ description: Create detection rules using Elasticsearch Query Language (ESQL) wi
 
 {{esql}} rules query {{es}} indices directly using the `FROM` command. The indices must be accessible to the user who creates or last edits the rule.
 
+### Alert deduplication and `_id` metadata [esql-alert-deduplication]
+
+For **non-aggregating** queries (queries that do not use `STATS...BY`), the detection engine relies on the document `_id` metadata field to avoid creating duplicate alerts for the same source event across rule executions.
+
+::::{tab-set}
+:::{tab-item} {{stack}} 9.4+
+You don't need to add `METADATA _id` in the rule query for deduplication. The query in the editor is saved exactly as you enter it. You can paste from Discover or from AI-assisted tools without adding metadata clauses.
+
+If `_id` will be missing from the query results (for example, because of `DROP _id`, `RENAME _id AS …`, or `EVAL _id = …`), the query editor shows a non-blocking warning that you might get duplicate alerts. You can still save the rule after confirming the **Save with errors** dialog, but might get duplicate alerts until you adjust the query.
+
+:::
+:::{tab-item} {{stack}} 9.0-9.3
+
+You must add `METADATA _id` to the `FROM` command yourself for non-aggregating queries if you want deduplication across executions. Without it, the same source event can generate duplicate alerts.
+
+:::
+::::
+
+You can still include `METADATA` explicitly—for example `METADATA _id, _index, _version`—when you need additional [metadata fields](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md) in the query or for clarity.
+
 <!-- CRAFT LAYER - COMMENTED OUT FOR REVIEW
 ## Writing effective {{esql}} queries [craft-esql]
 
@@ -134,15 +156,15 @@ This rule counts failed login attempts per user and alerts when any user exceeds
 
 ### Non-aggregating query with deduplication [esql-example-non-aggregating]
 
-This rule detects process-start events with suspicious encoded arguments and uses `METADATA` to enable alert deduplication across rule executions.
+This rule detects process-start events with suspicious encoded arguments. The query omits `METADATA _id` in the `FROM` clause; in {{stack}} 9.4 and later, the detection engine injects `METADATA _id` at execution time for alert deduplication. On {{stack}} 9.3 and earlier, add `METADATA _id` (and optionally `_index`, `_version`) to the `FROM` command yourself.
 
 ```json
 {
   "type": "esql",
   "language": "esql",
   "name": "Process execution with encoded arguments",
   "description": "Detects process start events where the command line contains encoded content.",
-  "query": "FROM logs-endpoint.events.* METADATA _id, _index, _version | WHERE event.category == \"process\" AND event.type == \"start\" AND process.command_line LIKE \"*-encoded*\" | LIMIT 100",
+  "query": "FROM logs-endpoint.events.* | WHERE event.category == \"process\" AND event.type == \"start\" AND process.command_line LIKE \"*-encoded*\" | LIMIT 100",
   "severity": "medium",
   "risk_score": 47,
   "interval": "5m",
@@ -152,15 +174,15 @@ This rule detects process-start events with suspicious encoded arguments and use
 
 | Field | Value | Purpose |
 |---|---|---|
-| `query` | `FROM ... METADATA _id, _index, _version \| WHERE ... \| LIMIT 100` | A non-aggregating query. `METADATA _id, _index, _version` after `FROM` enables alert deduplication so the same source event does not generate duplicate alerts across rule executions. Without it, repeated matches produce repeated alerts. |
-| `LIMIT` | `100` | Caps the number of results per execution. Interacts with the **Max alerts per run** setting, and the rule uses the lower of the two values. |
+| `query` | `FROM ... \| WHERE ... \| LIMIT 100` | A non-aggregating query. Each matching row becomes an alert. <br><br> {applies_to}`stack: ga 9.4+` For deduplication across executions, `METADATA _id` is automatically added if missing, but you must ensure that `_id` appears in the execution results. Commands that restrict or remove fields (such as `DROP _id` or `KEEP agent.*` which retains only `agent.*` fields) will exclude `_id` from results and prevent deduplication. <br><br> {applies_to}`stack: ga 9.0-9.3` In earlier versions, include `METADATA _id` (and optionally other metadata fields) after `FROM`. |
+| `LIMIT` | `100` | Limits the number of results per execution. Interacts with the **Max alerts per run** setting, and the rule uses the lower of the two values. |
 
 ## {{esql}} rule field reference [esql-fields]
 
 The following settings appear in the **Define rule** section when creating an {{esql}} rule. For settings shared across all rule types, refer to [Rule settings reference](/solutions/security/detect-and-alert/common-rule-settings.md).
 
 **{{esql}} query**
-: The [{{esql}} query](elasticsearch://reference/query-languages/esql.md) that defines the detection logic. Can be aggregating (with `STATS...BY`) or non-aggregating. Each row in the query result becomes an alert.
+: The [{{esql}} query](elasticsearch://reference/query-languages/esql.md) that defines the detection logic. Can be aggregating (with `STATS...BY`) or non-aggregating. Each row in the query result becomes an alert. For non-aggregating queries, validation may show a non-blocking warning if `_id` is absent from the results (for example after `DROP _id`); you can still save the rule after confirming **Save with errors**.
 
 **Suppress alerts by** (optional)
 : Reduce repeated or duplicate alerts by grouping them on one or more fields. For details, refer to [Alert suppression](/solutions/security/detect-and-alert/alert-suppression.md).
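
Reviewer note: the esql.md changes describe a warning that appears when `_id` will be missing from the results of a non-aggregating query (the cases named are `DROP _id`, `RENAME _id AS …`, `EVAL _id = …`, and a restrictive `KEEP` such as `KEEP agent.*`). The sketch below is only a rough illustration of that kind of check. It is not the detection engine's implementation: the function name `id_likely_dropped` and the regex heuristics are hypothetical, and the pipeline is split naively on `|` rather than parsed.

```python
import re


def id_likely_dropped(query: str) -> bool:
    """Hypothetical heuristic: guess whether an ES|QL pipeline removes `_id`.

    Naive by design: splits on "|" (so pipes inside string literals confuse it)
    and only looks at the four command shapes named in the docs change.
    """
    commands = [part.strip() for part in query.split("|")]
    # Skip the source command (FROM ...) and inspect each processing command.
    for command in commands[1:]:
        parts = command.split(None, 1)
        if not parts:
            continue
        keyword = parts[0].upper()
        args = parts[1] if len(parts) > 1 else ""
        fields = [field.strip() for field in args.split(",")]
        if keyword == "DROP" and "_id" in fields:
            return True
        # KEEP retains only the listed fields, so `_id` must be listed.
        if keyword == "KEEP" and "_id" not in fields and "*" not in fields:
            return True
        if keyword == "RENAME" and re.search(r"(^|,)\s*_id\s+AS\b", args, re.IGNORECASE):
            return True
        # `_id = ...` is an assignment; `(?!=)` avoids matching `_id == ...`.
        if keyword == "EVAL" and re.search(r"(^|,)\s*_id\s*=(?!=)", args):
            return True
    return False
```

For example, `id_likely_dropped("FROM logs-* | KEEP agent.*")` flags the query, while the sample rule query in the diff above (`WHERE ... | LIMIT 100`) does not.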

solutions/security/detect-and-alert/threshold.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@ description: Create threshold rules to alert when the number of matching events
1111

1212
# Threshold rules [threshold-rule-type]
1313

14+
## Overview
15+
1416
Threshold rules search your {{es}} indices and generate an alert when the number of events matching a query meets or exceeds a specified threshold within a single rule execution. Optionally, events can be grouped by one or more fields so that each unique combination is evaluated independently.
1517

1618
### When to use a threshold rule

troubleshoot/security/detection-rules.md

Lines changed: 11 additions & 0 deletions
@@ -156,6 +156,17 @@ In the following example, the selected field is unmapped across two indices.
 :::::
 
 
+## {{esql}} rules [esql-rules-ts]
+
+::::{dropdown} Non-aggregating {{esql}} rule: validation warning about `_id` or duplicate alerts
+:name: esql-missing-id-ts
+
+For **non-aggregating** {{esql}} detection rules, the engine uses document `_id` to avoid generating duplicate alerts for the same source event.
+
+{applies_to}`stack: ga 9.4+` The detection engine automatically adds `METADATA _id` to the `FROM` command when your query omits it, and makes a best-effort attempt to keep `_id` available in the pipeline (for example by appending `_id` to restrictive `KEEP` commands). Refer to [](../../solutions/security/detect-and-alert/esql.md#esql-alert-deduplication) to learn more.
+::::
+
+
 ## Rule executions
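
Reviewer note: the troubleshooting text says the engine adds `METADATA _id` to the `FROM` command when the query omits it. To make that behavior concrete, here is a minimal sketch of such a rewrite, assuming the query starts with a single `FROM` command and contains no `|` inside it. The function name `inject_metadata_id` and the string-based approach are hypothetical; the actual engine logic is not shown in this commit.

```python
import re


def inject_metadata_id(query: str) -> str:
    """Hypothetical sketch: ensure the leading FROM command requests `_id`."""
    # Naive split: everything before the first "|" is treated as the FROM command.
    from_command, sep, rest = query.partition("|")
    if not from_command.strip().upper().startswith("FROM "):
        return query  # not a FROM-based query; leave it untouched

    # Look for an existing METADATA clause (whitespace-delimited keyword).
    match = re.search(r"\sMETADATA\s+(.*)$", from_command, re.IGNORECASE | re.DOTALL)
    if match is None:
        new_from = from_command.rstrip() + " METADATA _id "
    else:
        fields = [field.strip() for field in match.group(1).split(",")]
        if "_id" in fields:
            return query  # `_id` is already requested
        new_from = from_command.rstrip() + ", _id "

    # Reattach the rest of the pipeline, if any.
    return new_from + sep + rest if sep else new_from.rstrip()
```

Under these assumptions, a query that already lists `_id` in `METADATA` is returned unchanged, which mirrors the documented statement that the query is saved exactly as typed and that explicit `METADATA` clauses remain valid.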

161172
