Guide
An AI agent on an embedded system is a firmware architecture that combines sensor input, ML inference, decision logic, and actuator control into an autonomous loop. Unlike single-model inference, agents coordinate multiple inputs and models to act on their environment without human intervention.
Published 2026-04-01
The term “AI agent” in 2024-2025 typically refers to LLM-based systems that use tools, maintain memory, and pursue goals — ChatGPT with function calling, autonomous coding assistants, multi-step reasoning systems. That model does not translate to microcontrollers. MCUs have kilobytes of RAM, not gigabytes. They do not run language models.
On a microcontroller, an AI agent is something different and more specific:
An autonomous sense-think-act loop that runs continuously on the device, making decisions about the physical world based on ML inference — without human input and without cloud connectivity.
The “intelligence” comes not from a foundation model, but from the combination of:

- ML inference on sensor data (anomaly detection, classification)
- Stateful decision logic that tracks trends across inference cycles
- Domain rules that map model outputs to physical actions
This is not a stripped-down version of cloud-native AI agents. It is a fundamentally different architecture — closer to robotics than to chatbots.
Most embedded ML projects stop at inference. Building an agent means continuing through additional layers:
**Inference only:**

```
Sensor → Model → Prediction (displayed or logged)
```
The model runs, produces output, and someone else decides what to do. This is the “serial monitor demo” — useful for validation, not for deployment.
**Reactive:**

```
Sensor → Model → Decision Logic → Action
```
The firmware acts on the prediction. If the anomaly score exceeds a threshold, toggle a GPIO, send an MQTT message, or activate a relay. Most production edge AI deployments today are reactive systems.
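A reactive system can be as small as a single comparison feeding an actuator. The sketch below is illustrative; `relay_set()` and `mqtt_publish_alert()` are hypothetical stand-ins for real GPIO and MQTT calls, stubbed as no-ops so the sketch compiles on a host.

```c
#include <stdbool.h>

#define ANOMALY_THRESHOLD 0.7f

/* Hypothetical hardware hooks -- names are illustrative. On a real
 * board these would toggle a GPIO and publish over MQTT. */
static void relay_set(bool on)              { (void)on; }
static void mqtt_publish_alert(float score) { (void)score; }

/* Reactive pattern: the model output feeds straight into an action,
 * with no memory of previous cycles. */
static bool react(float anomaly_score) {
    bool alarm = anomaly_score > ANOMALY_THRESHOLD;
    relay_set(alarm);           /* drive the relay every cycle */
    if (alarm) {
        mqtt_publish_alert(anomaly_score);
    }
    return alarm;
}
```

Note that `react()` carries no state: the same input always produces the same action, which is exactly what separates this level from the stateful agent below.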
**Stateful agent:**

```
Sensor → Model → State Machine → Conditional Actions → Feedback → Sensor
```
The system maintains state across inference cycles. It remembers that the anomaly score has been rising for 4 hours. It escalates from “advisory” to “warning” to “critical” based on trend, not just the current reading. Past actions influence future decisions.
**Multi-agent:**

```
Agent A ←→ Coordination Protocol ←→ Agent B
                    ↕
                 Agent C
```
Multiple agents on separate MCUs collaborate. A vibration monitoring agent on Motor 1 shares its state with an agent on Motor 2. If both flag anomalies simultaneously, the coordination logic infers a systemic cause (power supply issue, ambient temperature spike) rather than two independent failures.
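The coordination rule described here can be sketched as a check over status messages exchanged between agents. The `agent_status_t` layout and `is_systemic()` helper are illustrative assumptions, not a standard protocol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status message exchanged between agents, e.g. over
 * MQTT or ESP-NOW; the struct layout is an assumption for the sketch. */
typedef struct {
    int      agent_id;
    bool     anomaly_flag;
    uint32_t timestamp_ms;
} agent_status_t;

/* Coordination rule from the text: two agents flagging anomalies
 * within a short window suggests a systemic cause (shared power
 * supply, ambient temperature spike) rather than independent failures. */
static bool is_systemic(const agent_status_t *a, const agent_status_t *b,
                        uint32_t window_ms) {
    uint32_t dt = (a->timestamp_ms > b->timestamp_ms)
                ? a->timestamp_ms - b->timestamp_ms
                : b->timestamp_ms - a->timestamp_ms;
    return a->anomaly_flag && b->anomaly_flag && dt <= window_ms;
}
```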
A single agent on an MCU consists of four subsystems: sensing, inference and decision, action, and learning/adaptation.
The sensing layer abstracts hardware inputs into a uniform data stream:
```c
typedef struct {
    float    vibration_rms;
    float    temperature_c;
    float    current_a;
    uint32_t timestamp_ms;
} sensor_reading_t;
```
The sensor task runs on a fixed schedule — typically 10 Hz to 1 kHz depending on the modality. It reads raw ADC or I2C data, applies calibration, and writes to a shared ring buffer.
For agents that combine multiple sensors (vibration + temperature + current for predictive maintenance), the sensing layer handles synchronization — ensuring inference operates on temporally aligned data from all sources.
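One way to implement the shared ring buffer is a fixed array indexed by a monotonically increasing write counter. This is a host-testable sketch; on an RTOS the writer and reader would additionally need a lock or a single-producer/single-consumer discipline.

```c
#include <stdint.h>

/* sensor_reading_t as defined above, repeated so the sketch is
 * self-contained. */
typedef struct {
    float    vibration_rms;
    float    temperature_c;
    float    current_a;
    uint32_t timestamp_ms;
} sensor_reading_t;

#define RING_CAPACITY 64u  /* power of two keeps the index math cheap */

typedef struct {
    sensor_reading_t buf[RING_CAPACITY];
    uint32_t head;   /* total number of writes so far */
} sensor_ring_t;

/* Writer side: called from the sensor task at a fixed rate. */
static void ring_push(sensor_ring_t *r, const sensor_reading_t *s) {
    r->buf[r->head % RING_CAPACITY] = *s;
    r->head++;
}

/* Reader side: copy out the most recent n readings, oldest first.
 * Returns the number actually copied. */
static uint32_t ring_latest(const sensor_ring_t *r, sensor_reading_t *out,
                            uint32_t n) {
    uint32_t avail = (r->head < RING_CAPACITY) ? r->head : RING_CAPACITY;
    if (n > avail) n = avail;
    for (uint32_t i = 0; i < n; i++) {
        out[i] = r->buf[(r->head - n + i) % RING_CAPACITY];
    }
    return n;
}
```

Because each reading carries `timestamp_ms`, a multi-sensor fusion stage can use the same pattern per sensor and align windows by timestamp before inference.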
The inference and decision layer runs ML models and applies logic:
```c
agent_decision_t agent_think(sensor_reading_t* readings, int count) {
    // Preprocess: compute FFT spectrum from vibration data
    float spectrum[128];
    compute_fft(readings, spectrum, count);

    // Run anomaly detection model
    float anomaly_score = run_anomaly_model(spectrum);

    // Stateful logic: track score trend over time
    update_score_history(anomaly_score);
    float trend = compute_trend();

    // Decision: combine current score with trend
    if (anomaly_score > CRITICAL_THRESHOLD) {
        return DECISION_CRITICAL;
    } else if (anomaly_score > WARNING_THRESHOLD && trend > 0) {
        return DECISION_WARNING;
    }
    return DECISION_NORMAL;
}
```
The key difference from plain inference: the decision logic is not just a threshold on the model output. It incorporates state (score history), trends (is it getting worse?), and cross-references (vibration anomaly + temperature rise = different conclusion than vibration anomaly alone).
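The `update_score_history()` and `compute_trend()` helpers referenced above could be implemented as a small ring buffer plus a difference of window means. This is one illustrative estimator under assumed names, not the only option:

```c
#define HISTORY_LEN 16

/* Score history kept across inference cycles -- this persistence is
 * what makes the agent stateful. */
static float score_history[HISTORY_LEN];
static int   score_count = 0;

static void update_score_history(float score) {
    score_history[score_count % HISTORY_LEN] = score;
    score_count++;
}

/* Trend = mean of the newer half of the window minus mean of the
 * older half. Positive means the anomaly score is rising. */
static float compute_trend(void) {
    int n = (score_count < HISTORY_LEN) ? score_count : HISTORY_LEN;
    if (n < 4) return 0.0f;  /* not enough data for a trend */
    int half = n / 2;
    float old_sum = 0.0f, new_sum = 0.0f;
    for (int i = 0; i < half; i++) {
        old_sum += score_history[(score_count - n + i) % HISTORY_LEN];
        new_sum += score_history[(score_count - half + i) % HISTORY_LEN];
    }
    return new_sum / (float)half - old_sum / (float)half;
}
```

A longer window smooths noise but delays escalation; the window length is a tuning knob, not a constant of nature.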
The action layer translates decisions into physical outputs:
| Decision | Action | Hardware |
|---|---|---|
| NORMAL | Update dashboard periodically | MQTT publish via Wi-Fi |
| WARNING | Send alert, increase sampling rate | MQTT + timer reconfiguration |
| CRITICAL | Trigger local alarm, notify, log | GPIO relay + MQTT + flash log |
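A minimal dispatcher for the table above might look like the following. The action hooks are stubs that count invocations so the sketch is testable; on hardware they would drive the relay GPIO, the MQTT client, a hardware timer, and the flash logger.

```c
typedef enum { DECISION_NORMAL, DECISION_WARNING, DECISION_CRITICAL } agent_decision_t;

/* Stub action hooks -- illustrative names. They count invocations so
 * the dispatcher's behavior is observable without hardware. */
static int actions_taken = 0;
static void mqtt_publish_status(void)  { actions_taken++; }
static void mqtt_publish_alert(void)   { actions_taken++; }
static void set_sample_rate_hz(int hz) { (void)hz; actions_taken++; }
static void relay_alarm(void)          { actions_taken++; }
static void flash_log_raw(void)        { actions_taken++; }

/* One switch maps each decision from the table above to its bundle
 * of physical actions. */
static void agent_act(agent_decision_t d) {
    switch (d) {
    case DECISION_NORMAL:
        mqtt_publish_status();      /* periodic dashboard update */
        break;
    case DECISION_WARNING:
        mqtt_publish_alert();
        set_sample_rate_hz(500);    /* look more closely */
        break;
    case DECISION_CRITICAL:
        relay_alarm();
        mqtt_publish_alert();
        flash_log_raw();            /* preserve evidence for diagnosis */
        break;
    }
}
```

Keeping the decision-to-action mapping in one switch (rather than scattering GPIO writes through the logic) makes the agent's physical behavior auditable in a single place.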
Actions can also modify the agent’s own behavior:

- Raising the sensor sampling rate when a warning is active
- Shortening the inference interval to watch a developing fault more closely
- Switching from periodic status summaries to raw data logging
On-device learning on MCUs is constrained. Full model retraining requires backpropagation and optimizer state — operations that exceed typical MCU memory.
What is practical today:

- Adapting decision thresholds from running statistics of the deployed environment, so “normal” is recalibrated per machine
- Updating sensor calibration and feature normalization parameters in place
- Logging borderline or misclassified samples to flash so they can feed off-device retraining
Full model retraining happens off-device — on a PC or in the cloud. The retrained model is deployed via firmware update.
A concrete agent architecture for vibration-based machine monitoring:
Hardware: ESP32 + MEMS accelerometer + MQTT broker
Agent state machine:
```
MONITORING → (anomaly score > 0.7) → ALERT
ALERT      → (score < 0.5 for 1 hour) → MONITORING
ALERT      → (score > 0.9 OR rising trend > 2 hours) → CRITICAL
CRITICAL   → (maintenance acknowledged via MQTT) → MONITORING
```
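The transition rules above translate directly into a pure function, with the caller tracking how long each condition has held. Function and parameter names are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { STATE_MONITORING, STATE_ALERT, STATE_CRITICAL } agent_state_t;

/* Direct translation of the transition diagram. The caller measures
 * how long the score has stayed below 0.5 and how long the trend has
 * been rising, and passes both in as milliseconds. */
static agent_state_t next_state(agent_state_t s, float score,
                                uint32_t below_05_ms,  /* time score < 0.5 */
                                uint32_t rising_ms,    /* time trend > 0   */
                                bool maintenance_ack) {
    switch (s) {
    case STATE_MONITORING:
        if (score > 0.7f) return STATE_ALERT;
        break;
    case STATE_ALERT:
        if (score > 0.9f || rising_ms > 2u * 3600u * 1000u) return STATE_CRITICAL;
        if (below_05_ms >= 3600u * 1000u) return STATE_MONITORING;
        break;
    case STATE_CRITICAL:
        if (maintenance_ack) return STATE_MONITORING;
        break;
    }
    return s;  /* no transition condition met */
}
```

Keeping the function pure (no side effects) makes the escalation logic testable on a host long before it runs on the device.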
Behavior per state:
| State | Sample Rate | Inference Interval | Action |
|---|---|---|---|
| MONITORING | 100 Hz | Every 10 s | Periodic status via MQTT |
| ALERT | 500 Hz | Every 2 s | Alert via MQTT, LED warning |
| CRITICAL | 1 kHz | Every 500 ms | Alarm relay, continuous MQTT, raw data log |
This is more than a model running in a loop. The agent adapts its behavior based on what it observes, escalates through defined stages, and takes different physical actions at each stage.
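One way to implement the per-state behavior table is a `const` lookup that the sensor and inference tasks consult, so changing state changes their parameters without touching their code. A sketch, with assumed type names:

```c
#include <stdint.h>

typedef enum { STATE_MONITORING, STATE_ALERT, STATE_CRITICAL } agent_state_t;

/* The per-state behavior table from above as data: the state machine
 * selects a row, and the sensor/inference tasks read their rates from
 * it instead of hard-coding them. */
typedef struct {
    uint32_t sample_rate_hz;
    uint32_t inference_interval_ms;
} state_config_t;

static const state_config_t k_state_config[] = {
    [STATE_MONITORING] = {  100, 10000 },  /* 100 Hz, infer every 10 s */
    [STATE_ALERT]      = {  500,  2000 },  /* 500 Hz, every 2 s        */
    [STATE_CRITICAL]   = { 1000,   500 },  /* 1 kHz,  every 500 ms     */
};
```

Because the table is `const`, the linker can place it in flash, costing no SRAM.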
A more complex agent that combines multiple sensing modalities:
Hardware: STM32H7 + accelerometer + thermocouple + current transformer
Pipeline:

- Vibration: accelerometer samples → FFT features → anomaly detection model
- Temperature: thermocouple readings → trend estimation (rate of rise)
- Current: current transformer waveform → load signature model
- Fusion: rule-based logic combines the three outputs into a single diagnosis
The fusion rules encode engineering knowledge:

- A vibration anomaly alone suggests mechanical wear
- A vibration anomaly combined with a temperature rise points to a different failure mode than vibration alone
- Agreement across all three modalities raises confidence and escalation priority
The fusion logic is where agent intelligence lives — not in any individual model, but in how models are combined with domain knowledge.
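As an illustration of such fusion logic (the thresholds and diagnosis names are invented for the sketch, not taken from a real deployment):

```c
/* Illustrative fusion rules combining three per-sensor model outputs
 * into one diagnosis. All names and thresholds are assumptions. */
typedef enum {
    DIAG_HEALTHY,
    DIAG_BEARING_WEAR,    /* vibration anomaly alone           */
    DIAG_OVERLOAD,        /* vibration + current anomalies     */
    DIAG_THERMAL_FAULT,   /* vibration + temperature rise      */
} diagnosis_t;

static diagnosis_t fuse(float vib_score, float temp_rise_c, float current_score) {
    if (vib_score <= 0.7f) return DIAG_HEALTHY;
    /* Vibration is anomalous: the other modalities disambiguate. */
    if (temp_rise_c > 10.0f)  return DIAG_THERMAL_FAULT;
    if (current_score > 0.7f) return DIAG_OVERLOAD;
    return DIAG_BEARING_WEAR;
}
```

Note that no single model knows about overload or thermal faults; those conclusions exist only in the fusion rules.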
This field is early. There is no standard framework for building AI agents on MCUs. The patterns described here are architectural — implemented in custom C firmware, not with an off-the-shelf agent SDK. Standardized tooling for orchestrating agent workflows on microcontrollers is emerging but not yet mature.
Memory is the hard constraint. Each additional model, state buffer, and sensor pipeline consumes SRAM that does not grow. A three-model agent on ESP32 (520 KB SRAM) leaves minimal headroom for application logic. Memory planning must be done upfront.
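Upfront memory planning can even be enforced at compile time. The budget below is a hypothetical allocation for a three-model agent on a 520 KB part; real tensor arena sizes come from profiling each model, and every number here is an assumption for the sketch.

```c
#include <assert.h>

/* Back-of-envelope SRAM budget for a hypothetical three-model agent
 * on ESP32 (520 KB SRAM). All sizes are illustrative placeholders. */
#define KB(x) ((x) * 1024u)

enum {
    ARENA_VIBRATION = KB(110),  /* tensor arena, vibration model    */
    ARENA_TEMP      = KB(40),   /* tensor arena, temperature model  */
    ARENA_CURRENT   = KB(60),   /* tensor arena, current model      */
    SENSOR_BUFFERS  = KB(48),   /* ring buffers + FFT scratch       */
    WIFI_MQTT_STACK = KB(120),  /* Wi-Fi + TCP/IP + MQTT client     */
    RTOS_AND_APP    = KB(80),   /* task stacks, heap, state machine */
    TOTAL_SRAM      = KB(520),
};

enum {
    SRAM_USED = ARENA_VIBRATION + ARENA_TEMP + ARENA_CURRENT +
                SENSOR_BUFFERS + WIFI_MQTT_STACK + RTOS_AND_APP
};

/* Fails the build, not the field deployment, if the plan outgrows
 * the part. */
static_assert(SRAM_USED <= TOTAL_SRAM, "agent does not fit in SRAM");
```

With these placeholder numbers the plan consumes 458 KB, leaving roughly 62 KB of headroom, which illustrates the "minimal headroom" point: one more model does not fit.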
Testing is difficult. Agent behavior depends on state transitions that may take hours or days to trigger in real conditions. Simulation and accelerated testing frameworks for embedded AI agents are underdeveloped compared to cloud-native testing tools.
Debugging is harder than inference. When a single model produces wrong output, you check the input data and model weights. When an agent makes a wrong decision, you must trace through sensor fusion, state machine transitions, threshold logic, and action dispatch. Embedded debuggers help, but there is no equivalent of cloud-native observability for MCU agents.
The embedded AI agent space is converging from two directions:
From the embedded side: RTOS vendors and MCU manufacturers are adding ML-aware task scheduling, hardware inference accelerators, and inter-device communication protocols. FreeRTOS on ESP32 already provides the concurrency primitives. ST’s Cube.AI provides model optimization. The missing piece is the agent coordination layer that connects them.
From the AI side: The agent paradigm — sense, think, act, learn — is being applied to constrained devices. The question is how much of the orchestration layer can be abstracted without sacrificing the control that embedded developers need.
The intersection — AI agents that run autonomously on $5 microcontrollers, coordinate with each other, and adapt to their environment — is where embedded development is heading. The building blocks (ML inference, RTOS scheduling, wireless communication) exist today. The integration tooling that ties them into coherent agent architectures is what is being built now.