Rules engine
The rules engine evaluates a CEL expression against every validated telemetry point and dispatches actions when it matches. It is the lightweight, per-point reactive layer: simple condition-to-action logic that runs in under 5ms. For multi-step graph processing — enrichment, branching, connectors, dead-letter handling — see Rule chains.
A rule is a condition (a CEL expression), a device group it applies to, and one or more actions to fire when the condition is true.
How evaluation works
The engine consumes the same enriched, validated telemetry stream your devices produce when sending telemetry, and evaluates matching rules against each point. Evaluation must complete within 5ms per point.
flowchart TD
point([Validated point<br/>telemetry.validated.T1]) --> scope{Device in the<br/>rule's group?}
scope -->|no| skip([Rule not evaluated])
scope -->|yes| eval{CEL condition true?}
eval -->|no| none([No action])
eval -->|yes| act["Dispatch actions"]
act --> event["Publish rule.triggered.T1.{rule}"]
act --> metric["Increment rules_triggered_total"]
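The flow above can be sketched in a few lines of Python. This is an illustration only, not the engine's implementation: the `Rule` dataclass, the `evaluate_point` function, and the `publish` and `metrics` parameters are hypothetical names, and a plain Python callable stands in for a compiled CEL condition.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    """Hypothetical in-memory rule; `condition` stands in for a compiled CEL program."""
    rule_id: str
    group: str
    condition: Callable[[dict], bool]
    actions: list = field(default_factory=list)


def evaluate_point(rule, point, device_groups, publish, metrics):
    """Walk one point through the scope check, the condition, and dispatch."""
    # Scope check first: a rule for another group is never evaluated.
    if rule.group not in device_groups:
        return "skipped"
    # Condition false: no action, no event, no metric.
    if not rule.condition(point):
        return "no_action"
    # Condition true: dispatch every configured action.
    for action in rule.actions:
        action(point)
    # Then publish the trigger event and bump the counter.
    publish(f"rule.triggered.{point['tenant_id']}.{rule.rule_id}", point)
    metrics["rules_triggered_total"] = metrics.get("rules_triggered_total", 0) + 1
    return "triggered"
```

Returning a status string is a choice made for this sketch so that the three outcomes in the flowchart are easy to tell apart.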
CEL expression variable scope
Conditions are written in CEL (Common Expression Language). Only the variables below are in scope; referencing anything else is a compile-time error, so a broken rule can never be saved.
| Variable | Type | Description |
|---|---|---|
| data | map(string, double) | Telemetry numeric values, e.g. data.temperature |
| data_str | map(string, string) | Telemetry string values, e.g. data_str.firmware_status |
| device.id | string | Device ID |
| device.name | string | Display name |
| device.groups | list(string) | Group membership |
| device.tags | map(string, string) | Tag key-value pairs |
| device.firmware | string | Firmware version |
| device.status | string | online / offline |
| device.last_seen | int | Unix timestamp (seconds) of last telemetry |
| timestamp | int | Unix timestamp (seconds) of the current point |
| tenant_id | string | Tenant ID |
Compile-time validation
When a rule is created or updated, the CEL expression is compiled and validated before storage. Two classes of error are caught up front:
- Syntax errors. A malformed expression like data.temperature >> is rejected with INVALID_ARGUMENT and a human-readable parse error.
- Undeclared references. Using a variable that isn’t in scope, e.g. data.temperature > threshold, fails with “undeclared reference to ‘threshold’” and INVALID_ARGUMENT.
This is why rules are fast at runtime: by the time a rule is stored it is already a validated, compiled artifact, not a string to be parsed per point.
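The undeclared-reference check can be approximated with a small sketch. A real CEL implementation does this during type-checking; the regex-based `undeclared_references` function below is a hypothetical stand-in that only looks at root identifiers, ignores CEL functions and macros, and does not detect syntax errors.

```python
import re

# Root variables taken from the scope table; anything else is undeclared.
ALLOWED_ROOTS = {"data", "data_str", "device", "timestamp", "tenant_id"}
# Literals and operators that look like identifiers but are not variables.
CEL_KEYWORDS = {"true", "false", "null", "in"}

# Double- or single-quoted string literals, with backslash escapes.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"' + r"|'(?:\\.|[^'\\])*'")
# Identifiers not preceded by a dot or word character, i.e. roots, not fields.
ROOT_IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*")


def undeclared_references(expression):
    """Return the sorted root identifiers that are not in scope (sketch)."""
    # Blank out string literals so their contents are not read as identifiers.
    stripped = STRING_LITERAL.sub('""', expression)
    roots = ROOT_IDENTIFIER.findall(stripped)
    return sorted({r for r in roots if r not in ALLOWED_ROOTS and r not in CEL_KEYWORDS})


print(undeclared_references("data.temperature > threshold"))
print(undeclared_references('data.temperature > 80 && device.tags["region"] == "eu"'))
```

The first call reports `threshold`; the second reports nothing, because every root it uses is in the scope table.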
Group scoping
Every rule targets a device group. A point is only evaluated against rules whose
target group contains the originating device. A rule for group furnace is never
evaluated for a device in group hvac — the engine skips it entirely rather than
evaluating-then-discarding. This keeps per-point work proportional to the rules that
actually apply.
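One way to get this "skip entirely" behavior is to index rules by target group, so that only applicable rules are ever looked up. The sketch below assumes that approach; `index_rules_by_group` and `applicable_rules` are hypothetical names, and the engine's real data structures are not documented here.

```python
from collections import defaultdict


def index_rules_by_group(rules):
    """Build a group -> [rule_id] index from (rule_id, group) pairs."""
    index = defaultdict(list)
    for rule_id, group in rules:
        index[group].append(rule_id)
    return index


def applicable_rules(index, device_groups):
    """Return rule IDs whose target group contains the device, without duplicates."""
    selected = []
    for group in device_groups:
        # .get avoids creating empty entries for groups that have no rules.
        for rule_id in index.get(group, []):
            if rule_id not in selected:
                selected.append(rule_id)
    return selected


index = index_rules_by_group([("R1", "furnace"), ("R2", "hvac"), ("R3", "furnace")])
# A device in group hvac never touches the furnace rules.
print(applicable_rules(index, ["hvac"]))
```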
Example rules
Alert when any furnace device exceeds 80 °C:
data.temperature > 80
Target group: furnace. Action: webhook to your alerting endpoint.
Only fire for EU-region devices over threshold, combining numeric data with a device tag:
data.temperature > 80 && device.tags["region"] == "eu"
React to a string-valued metric, e.g. firmware reporting it needs an update:
data_str.firmware_status == "update_required"
Combine multiple metrics and device state:
data.temperature > 75 && data.pressure > 4.0 && device.status == "online"
Action types
A rule fires one or more of these actions when its condition is true:
| Action | Behavior |
|---|---|
| Webhook | HTTP POST carrying the telemetry point plus rule metadata. Retried 3 times with exponential backoff; on final failure it is logged with trace context and rule_action_failures_total{type="webhook"} is incremented. |
| MQTT command | Publishes a command payload to the EMQX topic devices/{device_id}/commands — e.g. {"set_fan": "high"}. |
| Email | Sends a notification via SMTP. |
| Redpanda event | Publishes an event to a Redpanda topic for downstream consumers. |
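The webhook retry behavior can be sketched as follows. This assumes that "retried 3 times" means one initial attempt plus up to three retries, and the 0.5-second base delay is a made-up value; `send_with_retries`, `send`, and `on_failure` are hypothetical names, with `on_failure` standing in for the logging and the rule_action_failures_total increment.

```python
import time


def send_with_retries(send, payload, retries=3, base_delay=0.5,
                      sleep=time.sleep, on_failure=None):
    """Call send(payload); on error retry with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            send(payload)
            return True
        except Exception:
            if attempt == retries:
                break  # final failure: no more retries left
            # Delays double each time: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
    if on_failure is not None:
        on_failure(payload)  # e.g. log and increment the failure counter
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; with the defaults, a webhook that never succeeds is attempted four times with delays of 0.5, 1.0, and 2.0 seconds between attempts.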
The rule.triggered event and metrics
Every time a rule fires, in addition to its actions, the engine:
- Publishes an event to rule.triggered.{tenant_id}.{rule_id} carrying rule_id, device_id, and the trigger values. This is what powers live rule-trigger feeds and the per-rule trigger history in the UI.
- Increments rules_triggered_total{tenant="T1",rule="R1"}.
If a rule does not match, no action is dispatched and no event is published.
The compiled-rule cache
Compiled rules are cached in Aerospike namespace rules, set compiled, keyed by
{tenant_id}:{rule_id}, with a 600-second (10-minute) TTL. Evaluation reads the
ready-to-run compiled form from this cache rather than recompiling per point. A
practical consequence: after you edit or disable a rule, the change may not be
visible immediately; it is fully in effect within one cache TTL window, that is,
at most 600 seconds later. Disabling a rule sets its enabled flag to false so
subsequent evaluations skip it.
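The caching behavior can be illustrated with an in-memory stand-in. This sketch does not use the Aerospike client; `CompiledRuleCache` is a hypothetical class that only mirrors the key format and the 600-second TTL described above, with an injectable clock so expiry can be demonstrated.

```python
import time


class CompiledRuleCache:
    """In-memory stand-in for the compiled-rule cache (not the Aerospike client)."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (compiled_rule, expires_at)

    @staticmethod
    def key(tenant_id, rule_id):
        # Same key shape as in the text: {tenant_id}:{rule_id}
        return f"{tenant_id}:{rule_id}"

    def put(self, tenant_id, rule_id, compiled):
        expires_at = self._clock() + self._ttl
        self._entries[self.key(tenant_id, rule_id)] = (compiled, expires_at)

    def get(self, tenant_id, rule_id):
        cache_key = self.key(tenant_id, rule_id)
        entry = self._entries.get(cache_key)
        if entry is None:
            return None
        compiled, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller reloads the current rule.
            del self._entries[cache_key]
            return None
        return compiled
```

Until an entry expires, `get` keeps returning the previously cached form, which is why an edit can take up to one TTL window to be picked up everywhere.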
Conflict detection (advisory)
When you create or update a rule, the engine checks for potentially conflicting
rules — for example, two rules on the same device group with overlapping conditions.
Conflicts are advisory, not blocking: the rule is saved and the response
includes a warnings array noting the overlap.
When to reach for rule chains instead
Use the rules engine for simple, stateless condition → action logic. Reach for rule chains when you need to enrich a message with asset or related-entity data, branch on multiple conditions, gate on blockchain finality, fan out to external connectors with retries and a dead-letter queue, or create and manage alarms as first-class entities.