Best Microcontroller for Machine Learning

The ESP32-S3 is the strongest general-purpose choice for machine learning among popular MCU families. It combines 512 KB SRAM, up to 8 MB PSRAM, SIMD vector instructions, and a camera interface at $10-25 per dev board, which gives it one of the best price-to-ML-capability ratios among mainstream MCU platforms.

Published 2026-04-01

What Makes a Good ML Microcontroller

Not every microcontroller can run machine learning. The minimum requirements are:

  • SRAM: At least 64 KB free for the tensor arena (model working memory)
  • Flash: Enough to store both the firmware and the model (typically 200 KB - 1 MB for the model alone)
  • Clock speed: roughly 80 MHz or higher for real-time inference; slower cores such as the 64 MHz Arduino Nano 33 BLE are limited to smaller models
  • Architecture: ARM Cortex-M4 or higher, Xtensa LX6/LX7, or RISC-V with sufficient performance
  • Framework support: TensorFlow Lite Micro or Edge Impulse must support the chip

Below these thresholds, inference either does not run or is too slow for most practical uses.
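
The checklist above can be turned into a quick pre-purchase sanity check. The following is a minimal sketch, not a definitive tool: the `Board` class, the `unmet_requirements` function, and both example boards are hypothetical names and values introduced here for illustration, and the thresholds are the rules of thumb from the list.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the list above; treat them as rules of thumb.
MIN_FREE_SRAM_KB = 64
MIN_CLOCK_MHZ = 80


@dataclass
class Board:
    name: str
    free_sram_kb: int  # SRAM left for the tensor arena
    flash_kb: int      # total flash available
    clock_mhz: int


def unmet_requirements(board: Board, firmware_kb: int, model_kb: int) -> List[str]:
    """Return the requirements the board fails; an empty list means it passes."""
    problems = []
    if board.free_sram_kb < MIN_FREE_SRAM_KB:
        problems.append("sram")
    # Flash has to hold the firmware and the model together.
    if board.flash_kb < firmware_kb + model_kb:
        problems.append("flash")
    if board.clock_mhz < MIN_CLOCK_MHZ:
        problems.append("clock")
    return problems


# Hypothetical boards, for illustration only.
roomy = Board("roomy-board", free_sram_kb=300, flash_kb=4096, clock_mhz=160)
tiny = Board("tiny-8bit-board", free_sram_kb=2, flash_kb=32, clock_mhz=16)

print(unmet_requirements(roomy, firmware_kb=600, model_kb=300))  # []
print(unmet_requirements(tiny, firmware_kb=20, model_kb=200))    # ['sram', 'flash', 'clock']
```

Framework support is left out of the check because it is a yes/no lookup in the TensorFlow Lite Micro or Edge Impulse documentation rather than a number.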

MCU Comparison for Machine Learning

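Spec             | ESP32-S3          | STM32H7                       | ESP32             | ESP32-C3        | STM32F4    | Arduino Nano 33 BLE
-----------------|-------------------|-------------------------------|-------------------|-----------------|------------|--------------------
Processor        | Xtensa LX7 (dual) | Cortex-M7                     | Xtensa LX6 (dual) | RISC-V (single) | Cortex-M4F | Cortex-M4F
Clock            | 240 MHz           | 480 MHz                       | 240 MHz           | 160 MHz         | 168 MHz    | 64 MHz
SRAM             | 512 KB            | 1024 KB                       | 520 KB            | 400 KB          | 192 KB     | 256 KB
Flash            | Up to 16 MB       | 2 MB                          | Up to 16 MB       | Up to 16 MB     | 1 MB       | 1 MB
ML Accelerator   | SIMD vector       | FPU + cache                   | None              | None            | FPU        | FPU
Camera Interface | Yes               | Yes (DCMI parallel interface) | No                | No              | No         | No
Wi-Fi            | Yes               | No (Ethernet)                 | Yes               | Yes             | No         | No
Dev Board Price  | $10-25            | $30-80                        | $5-15             | $4-10           | $10-30     | $20-35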
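
One way to use the comparison is to encode it as data and filter by hard requirements. This is a small sketch under stated assumptions: the `BOARDS` list copies a subset of the table's columns, `max_price` is the upper end of each dev board price range, and the `shortlist` function and its parameters are hypothetical names chosen for this example.

```python
# Values copied from the comparison table above.
# max_price is the upper end of the dev board price range, in USD.
BOARDS = [
    {"name": "ESP32-S3", "clock_mhz": 240, "sram_kb": 512, "wifi": True, "camera": True, "max_price": 25},
    {"name": "STM32H7", "clock_mhz": 480, "sram_kb": 1024, "wifi": False, "camera": True, "max_price": 80},
    {"name": "ESP32", "clock_mhz": 240, "sram_kb": 520, "wifi": True, "camera": False, "max_price": 15},
    {"name": "ESP32-C3", "clock_mhz": 160, "sram_kb": 400, "wifi": True, "camera": False, "max_price": 10},
    {"name": "STM32F4", "clock_mhz": 168, "sram_kb": 192, "wifi": False, "camera": False, "max_price": 30},
    {"name": "Arduino Nano 33 BLE", "clock_mhz": 64, "sram_kb": 256, "wifi": False, "camera": False, "max_price": 35},
]


def shortlist(min_sram_kb=0, needs_wifi=False, needs_camera=False, budget=None):
    """Return the names of boards that satisfy every stated requirement."""
    result = []
    for board in BOARDS:
        if board["sram_kb"] < min_sram_kb:
            continue
        if needs_wifi and not board["wifi"]:
            continue
        if needs_camera and not board["camera"]:
            continue
        if budget is not None and board["max_price"] > budget:
            continue
        result.append(board["name"])
    return result


print(shortlist(needs_camera=True))                   # ['ESP32-S3', 'STM32H7']
print(shortlist(needs_wifi=True, needs_camera=True))  # ['ESP32-S3']
print(shortlist(min_sram_kb=400, budget=15))          # ['ESP32', 'ESP32-C3']
```

A filter like this only narrows the field. It says nothing about accelerator support or tooling, which the sections on use cases and frameworks discuss.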

Best for Each Use Case

Vision and Image Classification: ESP32-S3

The ESP32-S3 is the strongest option for camera-based ML on an MCU. The combination of a hardware camera interface, SIMD vector instructions that accelerate quantized operations, and up to 8 MB PSRAM for image buffers makes it the only sub-$25 board in this comparison that handles image classification practically.

The STM32H7 has a DCMI parallel camera interface with DMA support, but its 1 MB SRAM leaves less room for large image buffers than the ESP32-S3’s external PSRAM. The ESP32-S3 remains the most cost-effective option combining camera, Wi-Fi, and ML acceleration.
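
The image buffer argument comes down to simple arithmetic. The sketch below works through a few frame sizes. The resolutions and pixel formats are illustrative assumptions rather than figures from this guide, and `frame_bytes` is a hypothetical helper name.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


KB = 1024
ESP32_S3_SRAM = 512 * KB       # internal SRAM, from the comparison table
ESP32_S3_PSRAM = 8 * KB * KB   # upper end of the PSRAM range
STM32H7_SRAM = 1024 * KB       # from the comparison table

gray_96 = frame_bytes(96, 96, 1)        # small grayscale model input
qvga_rgb565 = frame_bytes(320, 240, 2)  # QVGA, 16-bit colour
vga_rgb565 = frame_bytes(640, 480, 2)   # VGA, 16-bit colour

print(gray_96 // KB, "KB")      # 9 KB
print(qvga_rgb565 // KB, "KB")  # 150 KB
print(vga_rgb565 // KB, "KB")   # 600 KB

# One VGA frame already exceeds the ESP32-S3's internal SRAM ...
print(vga_rgb565 > ESP32_S3_SRAM)       # True
# ... and two of them (double buffering) exceed the STM32H7's 1 MB,
print(2 * vga_rgb565 > STM32H7_SRAM)    # True
# while both fit in 8 MB of PSRAM with room to spare.
print(2 * vga_rgb565 < ESP32_S3_PSRAM)  # True
```

These numbers cover the raw frames only. The tensor arena and firmware buffers compete for the same internal SRAM, which is why external PSRAM matters for camera work.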

Maximum Compute Power: STM32H7

The STM32H7 runs at 480 MHz with 1 MB SRAM, L1 cache, and a double-precision FPU. For models that need raw throughput — large anomaly detection, complex signal processing, or audio with high sample rates — the H7 is one of the fastest Cortex-M7 options available.
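
To get a feel for what clock speed buys, a back-of-the-envelope latency estimate can help. This is a deliberately crude sketch: the model size (5 million multiply-accumulates) and the cost of 2 cycles per MAC are assumptions chosen for illustration, not measurements, and `estimated_latency_ms` is a hypothetical helper. Real chips differ in cycles per MAC because of caches, SIMD, and memory speed, so measure on hardware before committing.

```python
def estimated_latency_ms(macs: int, clock_mhz: float, cycles_per_mac: float) -> float:
    """Rough inference time, assuming compute-bound execution at a fixed cost per MAC."""
    cycles = macs * cycles_per_mac
    cycles_per_ms = clock_mhz * 1000  # 1 MHz = 1000 cycles per millisecond
    return cycles / cycles_per_ms


MODEL_MACS = 5_000_000  # assumed model size, for illustration
CYCLES_PER_MAC = 2.0    # assumed and held constant across chips, which is unrealistic

# Clock speeds from the comparison table.
for name, clock_mhz in [("STM32H7", 480), ("STM32F4", 168), ("Arduino Nano 33 BLE", 64)]:
    ms = estimated_latency_ms(MODEL_MACS, clock_mhz, CYCLES_PER_MAC)
    print(f"{name}: {ms:.1f} ms")
```

Under these assumptions the gap between 480 MHz and 64 MHz is a factor of 7.5, which is the difference between comfortably real-time and noticeably laggy for many audio and sensor pipelines.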

The trade-off: no Wi-Fi (Ethernet only), more expensive ($30-80 per board), and a steeper development setup through STM32CubeIDE.

Budget and IoT: ESP32-C3

At $1-3 per chip, the ESP32-C3 is the cheapest ML-capable MCU in this guide. Its 400 KB SRAM and 160 MHz RISC-V core handle anomaly detection and simple classification, and its built-in Wi-Fi and BLE cover IoT connectivity.

The single core and missing SIMD limit it to simpler models. Do not use it for vision or complex audio processing.

Ultra-Low Power: STM32L4

The STM32L4 draws under 100 nA in shutdown mode. For battery-powered applications such as asset tracking, environmental monitoring, and wearables, the L4 runs small models on its 128 KB SRAM while sipping power.

The 80 MHz clock and 1 MB flash limit it to models under 200 KB. Sufficient for anomaly detection and simple classification, not for vision or complex audio.
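
The value of a low shutdown current shows up once the duty cycle is worked out. The sketch below is an estimate under loudly stated assumptions: only the 100 nA shutdown figure comes from this guide, while the 10 mA active current, the 50 ms wake window every 10 seconds, and the 220 mAh coin cell are hypothetical values picked for illustration. The function names are likewise hypothetical.

```python
def average_current_ma(active_ma: float, active_s: float, sleep_ma: float, period_s: float) -> float:
    """Time-weighted average current over one wake/sleep period."""
    sleep_s = period_s - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / period_s


def battery_life_days(capacity_mah: float, avg_ma: float) -> float:
    """Ideal battery life, ignoring self-discharge and regulator losses."""
    return capacity_mah / avg_ma / 24


SHUTDOWN_MA = 100e-6  # 100 nA, the shutdown figure quoted above
ACTIVE_MA = 10.0      # assumed current while sampling and running inference
ACTIVE_S = 0.05       # assumed 50 ms awake per cycle
PERIOD_S = 10.0       # assumed one inference every 10 seconds
CAPACITY_MAH = 220.0  # assumed coin-cell capacity

avg = average_current_ma(ACTIVE_MA, ACTIVE_S, SHUTDOWN_MA, PERIOD_S)
days = battery_life_days(CAPACITY_MAH, avg)
print(f"average current: {avg * 1000:.1f} uA")
print(f"estimated life: {days:.0f} days")
```

With these numbers the sleep current is a negligible share of the total, so the active window dominates. Shortening inference time or waking less often extends battery life far more than shaving further nanoamps off the shutdown figure.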

Fastest Start: Arduino Nano 33 BLE

The Nano 33 BLE is the easiest on-ramp. The Sense variant includes a 9-axis IMU, microphone, and gesture/color/proximity sensors. Combined with the Arduino IDE and Edge Impulse’s direct integration, you can go from zero to running inference in an afternoon.

The 256 KB RAM and 64 MHz clock limit it to smaller models. The built-in sensors make it ideal for gesture recognition, keyword spotting, and sensor-based anomaly detection without any external wiring.

The Wireless Question

If your application needs Wi-Fi or BLE for data transmission, your choices narrow significantly:

  • Wi-Fi + BLE: ESP32, ESP32-S3, ESP32-C3
  • BLE only: Arduino Nano 33 BLE
  • Ethernet: STM32H7
  • No connectivity: STM32F4, STM32L4 (need external modules)

For IoT applications that send inference results to a gateway or cloud dashboard, the ESP32 family is the default choice. The STM32 family is better for standalone or wired industrial applications.

Framework Support

Both TensorFlow Lite Micro and Edge Impulse support all MCUs listed here. The practical differences:

TFLite Micro gives you full control. You manage the model conversion, operator registration, and memory allocation. Better for teams with embedded experience who need to optimize every byte.

Edge Impulse handles the full pipeline — data collection, training, quantization, and export as a ready-to-compile C++ library. Better for rapid prototyping and teams without deep ML expertise. The trade-off is a larger binary (the SDK includes its own inference pipeline) and cloud dependency for training.
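
With TFLite Micro, one of the manual steps is embedding the converted model in firmware as a byte array, which is commonly done with a tool such as `xxd -i`. The following is a minimal sketch of the same idea in Python. The `to_c_array` function and the `g_model` symbol name are hypothetical, and the three sample bytes stand in for a real model file.

```python
def to_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render raw bytes as a C/C++ array definition, similar in spirit to `xxd -i`."""
    # alignas(16) is a conservative alignment choice for this sketch.
    lines = [f"alignas(16) const unsigned char {name}[] = {{"]
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk) + ",")
    lines.append("};")
    lines.append(f"const unsigned int {name}_len = {len(data)};")
    return "\n".join(lines)


# Placeholder bytes; a real model would be read from the converted model file.
sample = bytes([0x01, 0x02, 0xFF])
print(to_c_array(sample, "g_model"))
```

The generated text can be saved as a source file and compiled into the firmware, after which the interpreter is pointed at the array. Edge Impulse's exported library performs the equivalent step for you, which is part of why it is quicker to get started with.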

Recommendation Summary

Priority          | Choose              | Why
------------------|---------------------|--------------------------------------------------------
Best all-round ML | ESP32-S3            | SIMD, camera, PSRAM, Wi-Fi, $10-25
Max performance   | STM32H7             | 480 MHz, 1 MB SRAM, FPU + cache
Lowest cost       | ESP32-C3            | $1-3/chip, Wi-Fi, good enough for simple models
Lowest power      | STM32L4             | < 100 nA shutdown, battery-first design
Easiest start     | Arduino Nano 33 BLE | Built-in sensors, Arduino IDE, Edge Impulse integration

Frequently Asked Questions

Can you run machine learning on an Arduino?
Yes. The Arduino Nano 33 BLE (ARM Cortex-M4F, 256 KB RAM) runs TFLite Micro models for gesture recognition, keyword spotting, and anomaly detection. It cannot handle vision models; use the ESP32-S3 for camera-based ML.

How much RAM do you need for machine learning on a microcontroller?
A minimum of 64 KB for simple models (anomaly detection). 256 KB handles keyword spotting and gesture recognition. 512 KB or more, backed by PSRAM, is needed for image classification or object detection.

Is RISC-V good for machine learning?
The ESP32-C3 (RISC-V, 400 KB RAM) handles simple ML workloads like anomaly detection, but it lacks the SIMD instructions and multi-core performance of the ESP32-S3 (Xtensa LX7). For ML-heavy workloads, the Xtensa and ARM Cortex-M7 parts in this comparison currently outperform the RISC-V option.

What is the cheapest MCU that can run AI?
The ESP32-C3 starts at $1 per chip and can run small TFLite Micro models. For $4-10 you get a dev board with Wi-Fi and BLE. It handles anomaly detection and simple classification but not vision tasks.

Build Your Edge AI Workflow

ForestHub is designed to support ESP32, STM32, and Arduino. Pick your MCU, design your model pipeline visually, and generate deployment-ready firmware.
