On-device keyword spotting and voice command recognition, with no cloud connectivity required. Typical models use a DS-CNN (depthwise separable convolutional neural network) to classify short audio segments into predefined command categories. The approach preserves privacy because audio never leaves the device.
| Requirement | Value |
| --- | --- |
| Minimum RAM | 128 KB |
| Minimum Flash | 1024 KB |
| Sensor Inputs | microphone |
| Typical Model Size | 80 KB (quantized int8) |
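
To illustrate why depthwise separable layers help a model fit in roughly 80 KB, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The layer shape (3x3 kernel, 64 input and 64 output channels) is a hypothetical example, not taken from any specific model, and the function names are illustrative only. Biases and file-format overhead are ignored.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2-D convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def ds_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored).

    One kernel x kernel filter per input channel (depthwise step),
    followed by a 1x1 convolution that mixes channels (pointwise step).
    """
    return kernel * kernel * c_in + c_in * c_out


def max_int8_weights(size_kb: int) -> int:
    """Upper bound on int8 weights in a model of size_kb kilobytes.

    Each int8 weight takes one byte; real model files also carry
    metadata, so the true count is lower.
    """
    return size_kb * 1024


# Hypothetical layer shape, chosen only for illustration.
KERNEL, C_IN, C_OUT = 3, 64, 64

standard = conv_params(KERNEL, C_IN, C_OUT)
separable = ds_conv_params(KERNEL, C_IN, C_OUT)

print(f"standard conv weights:  {standard}")
print(f"separable conv weights: {separable}")
print(f"reduction factor:       {standard / separable:.1f}x")
print(f"int8 weights in 80 KB:  at most {max_int8_weights(80)}")
```

For this layer shape the separable form needs several times fewer weights, which is the main reason such architectures suit the flash and RAM budgets listed above.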

| MCU | Vendor | RAM | Clock | Price (dev board) |
| --- | --- | --- | --- | --- |
| ESP32 | Espressif | 520 KB | 240 MHz | $5–$15 |
| ESP32-C3 | Espressif | 400 KB | 160 MHz | $4–$10 |
| ESP32-C6 | Espressif | 512 KB | 160 MHz | $5–$15 |
| ESP32-S3 | Espressif | 512 KB | 240 MHz | $10–$25 |
| i.MX RT1062 | NXP | 1024 KB | 600 MHz | $25–$40 |
| nRF52840 | Nordic Semiconductor | 256 KB | 64 MHz | $20–$35 |
| RA6M5 | Renesas | 512 KB | 200 MHz | $25–$50 |
| STM32F4 | STMicroelectronics | 192 KB | 168 MHz | $10–$30 |
| STM32F7 | STMicroelectronics | 512 KB | 216 MHz | $25–$60 |
| STM32H7 | STMicroelectronics | 1024 KB | 480 MHz | $30–$80 |
| STM32L4 | STMicroelectronics | 128 KB | 80 MHz | $15–$50 |
| STM32U5 | STMicroelectronics | 786 KB | 160 MHz | $20–$50 |

Possible
The Arduino Nano 33 BLE Sense runs keyword spotting with Edge Impulse using its built-in MP34DT05 microphone. The 256 KB SRAM handles small …
Good
The ESP32-C3 handles voice recognition effectively with Edge Impulse. 400 KB SRAM at 160 MHz provides 3.1x headroom over the 128 KB requirem…
Excellent
For voice recognition, the ESP32-C6 with Edge Impulse scores Excellent. Its 512 KB internal SRAM (4.0x the required 128 KB) and 160 MHz cloc…
Excellent
For voice recognition, the ESP32-C6 with TFLite Micro scores Excellent. Its 512 KB internal SRAM (4.0x the required 128 KB) and 160 MHz cloc…
Excellent
For voice recognition, the ESP32-S3 with Edge Impulse scores Excellent. Its 512 KB internal SRAM (4.0x the required 128 KB) and 240 MHz cloc…
Good
The ESP32-S3 handles on-device keyword spotting with TFLite Micro using DS-CNN models that classify 1-second audio windows into predefined c…
Excellent
The ESP32 is an excellent match for voice recognition with Edge Impulse. 520 KB SRAM delivers 4.1x the 128 KB minimum while 240 MHz processe…
Excellent
For voice recognition, the ESP32 with TFLite Micro scores Excellent. Its 520 KB internal SRAM (4.1x the required 128 KB) and 240 MHz clock e…
Excellent
NXP's i.MX RT1062 excels at voice recognition via CMSIS-NN. The single-core Cortex-M7 at 600 MHz with 1024 KB SRAM handles 80 KB quantized models…
Excellent
The i.MX RT1062 is an excellent match for voice recognition with TFLite Micro. 1024 KB SRAM delivers 8.0x the 128 KB minimum while 600 MHz p…
Good
The nRF52840 handles voice recognition effectively with Edge Impulse. 256 KB SRAM at 64 MHz provides 2.0x headroom over the 128 KB requireme…
Good
Running voice recognition on the nRF52840 with TFLite Micro is practical. 256 KB SRAM meets the 128 KB minimum with 2.0x headroom. The 64 MH…
Excellent
For voice recognition, the RA6M5 with CMSIS-NN scores Excellent. Its 512 KB internal SRAM (4.0x the required 128 KB) and 200 MHz clock ensur…
Excellent
Renesas's RA6M5 excels at voice recognition via TFLite Micro. The single-core Cortex-M33 at 200 MHz with 512 KB SRAM handles 80 KB quantized mode…
Good
STMicroelectronics' STM32F4 is a solid choice for voice recognition using Edge Impulse. The Cortex-M4F core at 168 MHz with 192 KB SRAM acc…
Good
The STM32F4 handles voice recognition effectively with TFLite Micro. 192 KB SRAM at 168 MHz provides 1.5x headroom over the 128 KB requireme…
Excellent
STMicroelectronics' STM32F7 excels at voice recognition via CMSIS-NN. The single-core Cortex-M7 at 216 MHz with 512 KB SRAM handles 80 KB quanti…
Excellent
For voice recognition, the STM32F7 with TFLite Micro scores Excellent. Its 512 KB internal SRAM (4.0x the required 128 KB) and 216 MHz clock…
Excellent
STMicroelectronics' STM32H7 excels at voice recognition via CMSIS-NN. The single-core Cortex-M7 at 480 MHz with 1024 KB SRAM handles 80 KB quant…
Good
The STM32H7 runs keyword spotting and voice command recognition with TFLite Micro using CMSIS-NN accelerated inference. The 1 MB SRAM and 48…
Good
STMicroelectronics' STM32L4 is a solid choice for voice recognition using Edge Impulse. The Cortex-M4F core at 80 MHz with 128 KB SRAM acco…
Excellent
For voice recognition, the STM32U5 with CMSIS-NN scores Excellent. Its 786 KB internal SRAM (6.1x the required 128 KB) and 160 MHz clock ens…
Excellent
The STM32U5 is an excellent match for voice recognition with TFLite Micro. 786 KB SRAM delivers 6.1x the 128 KB minimum while 160 MHz proces…
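
The headroom multipliers quoted in the notes above (for example 3.1x or 4.0x) appear to be the MCU's SRAM divided by the 128 KB minimum. The sketch below reproduces that arithmetic as a rough check. The function name and dictionary are illustrative, and the rating labels themselves are not derived here, since boards with the same headroom can carry different ratings.

```python
MIN_RAM_KB = 128  # minimum RAM stated for this use case

# SRAM figures as listed for each MCU on this page.
BOARD_RAM_KB = {
    "ESP32": 520,
    "ESP32-C3": 400,
    "ESP32-C6": 512,
    "ESP32-S3": 512,
    "i.MX RT1062": 1024,
    "nRF52840": 256,
    "RA6M5": 512,
    "STM32F4": 192,
    "STM32F7": 512,
    "STM32H7": 1024,
    "STM32L4": 128,
    "STM32U5": 786,
}


def ram_headroom(ram_kb: int, min_kb: int = MIN_RAM_KB) -> float:
    """Return available SRAM as a multiple of the minimum requirement."""
    return ram_kb / min_kb


for name, ram_kb in BOARD_RAM_KB.items():
    print(f"{name}: {ram_headroom(ram_kb):.1f}x the {MIN_RAM_KB} KB minimum")
```

A ratio of exactly 1.0, as for the STM32L4, means the part only just meets the stated minimum, so the model, audio buffers, and application code would all need to share that budget.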
ForestHub compiles visual AI workflows to C code for your microcontroller. Choose your hardware, build your voice recognition pipeline, deploy in minutes.