The ESP32-S3 is the strongest general-purpose choice for machine learning among popular MCU families. It combines 512 KB SRAM, up to 8 MB PSRAM, SIMD vector instructions, and a camera interface at $10-25 per dev board, which gives it one of the best price-to-ML-capability ratios among mainstream MCU platforms.
Published 2026-04-01
Not every microcontroller can run machine learning. The minimum requirements are:
Below that threshold, inference either does not run or is too slow for practical use.
| Spec | ESP32-S3 | STM32H7 | ESP32 | ESP32-C3 | STM32F4 | Arduino Nano 33 BLE |
|---|---|---|---|---|---|---|
| Processor | Xtensa LX7 (dual) | Cortex-M7 | Xtensa LX6 (dual) | RISC-V (single) | Cortex-M4F | Cortex-M4F |
| Clock | 240 MHz | 480 MHz | 240 MHz | 160 MHz | 168 MHz | 64 MHz |
| SRAM | 512 KB | 1024 KB | 520 KB | 400 KB | 192 KB | 256 KB |
| Flash | Up to 16 MB | 2 MB | Up to 16 MB | Up to 16 MB | 1 MB | 1 MB |
| ML Accelerator | SIMD vector | FPU + cache | None | None | FPU | FPU |
| Camera Interface | Yes | Yes (DCMI parallel interface) | No | No | No | No |
| Wi-Fi | Yes | No (Ethernet) | Yes | Yes | No | No |
| Dev Board Price | $10-25 | $30-80 | $5-15 | $4-10 | $10-30 | $20-35 |
For object detection and other camera-based ML on an MCU, the ESP32-S3 is the strongest option. The combination of a hardware camera interface, SIMD vector instructions that accelerate quantized operations, and up to 8 MB PSRAM for image buffers makes it the only board under $25 in this comparison that handles image classification practically.
The STM32H7 has a DCMI parallel camera interface with DMA support, but its 1 MB of on-chip SRAM leaves less room for large image buffers than the ESP32-S3’s external PSRAM. The ESP32-S3 remains the most cost-effective option combining camera, Wi-Fi, and ML acceleration.
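To make the buffer comparison concrete, here is a rough arithmetic sketch. The memory sizes come from the table above; the frame formats (96x96 grayscale, QVGA and VGA in RGB565) are illustrative assumptions, not figures from this guide, and treating all of SRAM as available for frames is a simplification.

```python
# Memory sizes from the comparison table, in bytes.
# Assumption: all of SRAM is free for frames. In practice firmware, stacks,
# and the model's working memory need room too.
KB = 1024
MB = 1024 * KB
ESP32_S3_SRAM = 512 * KB
ESP32_S3_PSRAM = 8 * MB
STM32H7_SRAM = 1024 * KB


def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


# Illustrative frame formats (assumed for this sketch).
FRAMES = {
    "96x96 grayscale": frame_bytes(96, 96, 1),
    "QVGA RGB565": frame_bytes(320, 240, 2),
    "VGA RGB565": frame_bytes(640, 480, 2),
}

for name, size in FRAMES.items():
    print(
        f"{name}: {size} bytes | "
        f"one fits S3 SRAM: {size <= ESP32_S3_SRAM} | "
        f"two fit H7 SRAM: {2 * size <= STM32H7_SRAM} | "
        f"two fit S3 PSRAM: {2 * size <= ESP32_S3_PSRAM}"
    )
```

Under these assumptions, a single VGA RGB565 frame already exceeds the ESP32-S3's internal SRAM, and two of them (for double buffering) exceed the STM32H7's 1 MB, while the 8 MB PSRAM holds both with room to spare.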
The STM32H7 runs at 480 MHz with 1 MB SRAM, L1 cache, and a double-precision FPU. For models that need raw throughput — large anomaly detection, complex signal processing, or audio with high sample rates — the H7 is one of the fastest Cortex-M7 options available.
The trade-offs: no Wi-Fi (Ethernet only), a higher price ($30-80 per board), and a steeper development setup through STM32CubeIDE.
At $1-3 per chip, the ESP32-C3 is the cheapest ML-capable MCU in this comparison. Its 400 KB SRAM and 160 MHz RISC-V core handle anomaly detection and simple classification, and its built-in Wi-Fi and BLE provide IoT connectivity.
The single core and missing SIMD limit it to simpler models. Do not use it for vision or complex audio processing.
The STM32L4 draws under 100 nA in shutdown mode. For battery-powered applications such as asset tracking, environmental monitoring, and wearables, the L4 runs small models in its 128 KB SRAM while sipping power.
The 80 MHz clock and 1 MB flash limit it to models under 200 KB. That is sufficient for anomaly detection and simple classification, but not for vision or complex audio.
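As a rough way to check a candidate model against the 200 KB guideline, one can estimate weight storage from the parameter count and the weight data type. The sketch below is a back-of-the-envelope estimate only: the 150,000-parameter model is hypothetical, and the calculation ignores graph metadata and any runtime working memory.

```python
KB = 1024
MODEL_LIMIT_BYTES = 200 * KB  # the "models under 200 KB" guideline from the text

# Storage per weight for two common weight formats.
BYTES_PER_WEIGHT = {"float32": 4, "int8": 1}


def weight_bytes(num_params: int, dtype: str) -> int:
    """Approximate bytes needed to store the weights alone."""
    return num_params * BYTES_PER_WEIGHT[dtype]


def under_limit(num_params: int, dtype: str) -> bool:
    """True if the weights alone stay under the 200 KB guideline."""
    return weight_bytes(num_params, dtype) < MODEL_LIMIT_BYTES


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 150_000
for dtype in BYTES_PER_WEIGHT:
    print(dtype, weight_bytes(NUM_PARAMS, dtype), under_limit(NUM_PARAMS, dtype))
```

With these assumed numbers, the same network passes the guideline with int8 weights and fails it with float32 weights, which is why quantized models are the usual choice on this class of device.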
The Nano 33 BLE is the easiest on-ramp. The Sense variant includes a 9-axis IMU, microphone, and gesture/color/proximity sensors. Combined with the Arduino IDE and Edge Impulse’s direct integration, you can go from zero to running inference in an afternoon.
The 256 KB RAM and 64 MHz clock limit it to smaller models. The built-in sensors make it ideal for gesture recognition, keyword spotting, and sensor-based anomaly detection without any external wiring.
If your application needs Wi-Fi or BLE for data transmission, your choices narrow significantly:
For IoT applications that send inference results to a gateway or cloud dashboard, the ESP32 family is the default choice. The STM32 family is better for standalone or wired industrial applications.
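The narrowing can be sketched as a simple filter over the comparison table. The data below transcribes only the Wi-Fi and dev board price rows from that table; BLE support is not listed there, so it is left out. The function name and the budget parameter are hypothetical helpers for illustration.

```python
# (has Wi-Fi, lowest dev board price in USD), transcribed from the table.
BOARDS = {
    "ESP32-S3": (True, 10),
    "STM32H7": (False, 30),
    "ESP32": (True, 5),
    "ESP32-C3": (True, 4),
    "STM32F4": (False, 10),
    "Arduino Nano 33 BLE": (False, 20),
}


def candidates(need_wifi: bool, budget_usd: float) -> list[str]:
    """Boards that meet the Wi-Fi requirement and whose lowest listed
    price is within budget, cheapest first."""
    matches = [
        (price, name)
        for name, (wifi, price) in BOARDS.items()
        if (wifi or not need_wifi) and price <= budget_usd
    ]
    return [name for price, name in sorted(matches)]


print(candidates(need_wifi=True, budget_usd=10))
print(candidates(need_wifi=False, budget_usd=10))
```

Requiring Wi-Fi leaves only the three ESP32-family boards, which matches the recommendation above.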
Both TensorFlow Lite Micro and Edge Impulse support all MCUs listed here. The practical differences:
TFLite Micro gives you full control. You manage the model conversion, operator registration, and memory allocation. Better for teams with embedded experience who need to optimize every byte.
Edge Impulse handles the full pipeline — data collection, training, quantization, and export as a ready-to-compile C++ library. Better for rapid prototyping and teams without deep ML expertise. The trade-off is a larger binary (the SDK includes its own inference pipeline) and cloud dependency for training.
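For readers new to the quantization step both toolchains depend on, the following is a minimal sketch of the common int8 affine scheme, where a real value is approximated as `scale * (q - zero_point)`. The scale and zero point used here are made-up illustration values; real tools derive them from calibration data, and this is not a reproduction of either toolchain's implementation.

```python
INT8_MIN, INT8_MAX = -128, 127


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to int8: round to the nearest step, shift, then clamp."""
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate real value from an int8 code."""
    return scale * (q - zero_point)


# Illustrative parameters (assumed, not taken from any real model).
SCALE, ZERO_POINT = 0.05, -10

for x in (0.52, 1.0, 100.0):
    q = quantize(x, SCALE, ZERO_POINT)
    print(x, "->", q, "->", dequantize(q, SCALE, ZERO_POINT))
```

Values inside the representable range come back with an error of at most half a step (`SCALE / 2`), while out-of-range values such as 100.0 saturate at the int8 limit. Each weight takes one byte instead of four, which is where the memory savings on an MCU come from.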
| Priority | Choose | Why |
|---|---|---|
| Best all-round ML | ESP32-S3 | SIMD, camera, PSRAM, Wi-Fi, $10-25 |
| Max performance | STM32H7 | 480 MHz, 1 MB SRAM, FPU + cache |
| Lowest cost | ESP32-C3 | $1-3/chip, Wi-Fi, good enough for simple models |
| Lowest power | STM32L4 | < 100 nA shutdown, battery-first design |
| Easiest start | Arduino Nano 33 BLE | Built-in sensors, Arduino IDE, Edge Impulse integration |
Run object detection on ESP32-S3 with TFLite Micro. Hardware specs, compatibility analysis, getting started guide, and alternatives.
Run object detection on STM32H7 with TFLite Micro. 1 MB SRAM, 480 MHz Cortex-M7, CMSIS-NN acceleration for real-time inference.
Deploy anomaly detection on ESP32-C3 with TFLite Micro. Cost-effective sensor monitoring with RISC-V and Wi-Fi connectivity.
Run gesture recognition on Arduino Nano 33 BLE with TFLite Micro. Built-in IMU, Arduino IDE, and the official TFLite gesture tutorial.
Run anomaly detection on STM32F4 with TFLite Micro. Autoencoder-based monitoring on the industry-standard Cortex-M4 platform.