Hailo Speech to Text

June 19, 2025 by James Henry

Hailo and Speech Recognition on the ALC-4096-AIH

As edge AI evolves, so does its ability to interpret and interact with the world in more human-centric ways. A notable example is Hailo’s recent release of a speech recognition application that runs OpenAI’s Whisper-tiny model on the Hailo-8 and Hailo-8L accelerators. The demo brings speech interfaces to the edge without relying on a cloud service, which suits embedded platforms such as our Digital View ALC-4096-AIH.

The ALC-4096-AIH board integrates an AI accelerator socket for the Hailo-8, making it a natural platform for deploying edge-native machine learning applications. Designed as an LCD controller board that also hosts AI workloads, it supports the Raspberry Pi Compute Module 5 (CM5), extensive I/O, and flexible integration. Hailo’s Whisper demo highlights a compelling use case: real-time, low-power, on-device speech transcription.

Whisper at the Edge

Hailo’s example uses OpenAI’s Whisper-tiny model to transcribe spoken English using a 10-second audio capture window. While the model is compact, it’s accurate enough for command-and-control, interactive kiosks, or accessibility applications, especially where network connectivity is intermittent or cloud-based processing is not viable.
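To make the fixed capture window concrete, here is a minimal Python sketch of fitting a recording to a 10-second buffer. Whisper models work on 16 kHz mono audio, so the window is 160,000 samples. The function name fit_to_window and the trim-and-zero-pad strategy are assumptions for illustration; the demo’s own preprocessing may differ.

```python
SAMPLE_RATE_HZ = 16_000  # Whisper models operate on 16 kHz mono audio
WINDOW_SECONDS = 10      # capture window described in the article


def fit_to_window(samples, sample_rate=SAMPLE_RATE_HZ, window_s=WINDOW_SECONDS):
    """Return exactly sample_rate * window_s samples.

    Longer recordings are trimmed; shorter ones are padded with silence.
    This is an illustrative strategy, not necessarily what the demo does.
    """
    target = sample_rate * window_s
    trimmed = list(samples[:target])
    return trimmed + [0.0] * (target - len(trimmed))
```

A fixed-length input is what lets a compiled model run with static tensor shapes, which is one plausible reason for the hard 10-second limit.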

What makes this demo practical is that it runs inference entirely on the Hailo-8 accelerator. Preprocessing still depends on PyTorch, but future releases aim to eliminate that dependency, moving closer to fully optimized edge deployment. Notably, Hailo also hints at support for model conversion scripts, C++ implementations, and enhanced post-processing pipelines in future updates.

Bringing It to the ALC-4096-AIH

For developers using the ALC-4096-AIH, this Whisper demo offers a compelling starting point. Whether you’re building voice-controlled industrial equipment, voice-enabled smart displays, or interactive digital signage, the combination of the ALC-4096-AIH and Hailo-8 brings on-device speech recognition to your display-based edge AI application. Note that the demo currently transcribes English only.

Running the application requires a supported environment (Ubuntu 22 or Raspberry Pi OS), a USB microphone, and the HailoRT SDK. The demo is open-source and actively maintained on GitHub by Hailo, making it easy to explore and adapt.

See: https://github.com/hailo-ai/Hailo-Application-Code-Examples/tree/main/runtime/hailo-8/python/speech_recognition

Edge AI Expands

This demo is another example of how the ALC-4096-AIH unlocks edge AI potential for developers. With audio processing now part of the toolkit alongside video analytics and computer vision, the possibilities for intelligent displays continue to grow.

We’re excited to see how developers will take advantage of this capability, whether to create more intuitive UIs, accessible environments, or automated inspection systems.

Explore the full capabilities of the ALC-4096-AIH at digitalview.com, and if you’re working on voice interfaces at the edge, we’d love to hear from you.

Setup Guide

Adapted from: https://github.com/hailo-ai/Hailo-Application-Code-Examples/tree/main/runtime/hailo-8/python/speech_recognition

Prerequisites

Ensure you have the following:

Platform: Raspberry Pi CM5 (hosted by ALC-4096-AIH)
OS: Raspberry Pi OS 64-bit
AI Module: Hailo-8 or Hailo-8L installed
Hailo SDK: Install the hailo-all package (includes HailoRT, drivers, and tools)
→ Get it from the Hailo Developer Zone
Audio tools:

sudo apt update
sudo apt install ffmpeg libportaudio2

Python: Version 3.10 or 3.11 installed
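Before cloning the demo, it can help to confirm the software prerequisites in one pass. The sketch below uses only the Python standard library; the function names are hypothetical, and it checks only the Python version, ffmpeg, and PortAudio, not the Hailo driver or HailoRT installation.

```python
import ctypes.util
import shutil
import sys

SUPPORTED_PYTHON = {(3, 10), (3, 11)}  # versions listed in the prerequisites


def python_version_ok(version=None):
    """True if the given (or running) interpreter is Python 3.10 or 3.11."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED_PYTHON


def preflight():
    """Map each prerequisite name to True (found) or False (missing)."""
    return {
        "python 3.10/3.11": python_version_ok(),
        "ffmpeg on PATH": shutil.which("ffmpeg") is not None,
        "PortAudio library": ctypes.util.find_library("portaudio") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Any line reported as MISSING points back to the matching entry in the list above.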

Install the Whisper Demo

git clone https://github.com/hailo-ai/Hailo-Application-Code-Examples.git
cd Hailo-Application-Code-Examples/runtime/hailo-8/python/speech_recognition

Run the setup script to install Python dependencies:

python3 setup.py

Activate the virtual environment:

source whisper_env/bin/activate

On Raspberry Pi, no additional HailoRT wheel installation is needed if hailo-all was installed system-wide.

Before You Run

Plug in a USB microphone or use a webcam with a mic
Make sure input volume is medium to high
The app captures up to 10 seconds of audio
Only English is supported for now
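One rough way to judge whether the input volume is adequate is to record a short test clip as a 16-bit PCM WAV file and measure its RMS level. The sketch below is standard-library Python; the function names are hypothetical, and since the demo does not publish a numeric threshold, the result is only a relative guide (values closer to 0 dBFS are louder).

```python
import array
import math
import sys
import wave


def rms_dbfs(samples):
    """RMS level of 16-bit PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square / (32768.0 ** 2))


def wav_level_dbfs(path):
    """Read a 16-bit PCM WAV file and return its overall RMS level."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Multi-channel files are interleaved; the level covers all channels.
    return rms_dbfs(samples)
```

A test clip can be captured with any recorder that writes 16-bit WAV, for example arecord -f S16_LE -r 16000 -d 3 test.wav on systems with ALSA utilities installed. A reading near negative infinity means the microphone is muted or not selected.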

Run the App

In the activated environment:

python3 -m app.app_hailo_whisper

If using a Hailo-8L:

python3 -m app.app_hailo_whisper --hw-arch hailo8l

To reuse previous audio:

python3 -m app.app_hailo_whisper --reuse-audio
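When the demo is launched from another program, such as a kiosk front end, the same commands can be assembled in Python. This is a sketch using only the flags shown above; build_command and run_demo are hypothetical helper names, and combining --hw-arch with --reuse-audio in one call is an assumption.

```python
import subprocess

MODULE = "app.app_hailo_whisper"  # module name from the commands above


def build_command(hw_arch=None, reuse_audio=False):
    """Build the argument list for launching the demo."""
    cmd = ["python3", "-m", MODULE]
    if hw_arch:
        cmd += ["--hw-arch", hw_arch]  # e.g. "hailo8l" for a Hailo-8L
    if reuse_audio:
        cmd.append("--reuse-audio")
    return cmd


def run_demo(**options):
    """Run the demo; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(**options), check=True)
```

For example, run_demo(hw_arch="hailo8l") matches the Hailo-8L command above. It must be called from the speech_recognition directory with the virtual environment active, just like the manual commands.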

Notes & Customization

This is a functional demo, not yet optimized for real-world production
Preprocessing currently depends on PyTorch; Hailo aims to remove that dependency in future releases
You can modify transcription logic, post-processing, or integrate it into a larger app
Hailo has indicated that future updates may add model conversion scripts and C++ support
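As an example of custom post-processing, the sketch below maps a transcript to an action for a command-and-control application. It is independent of the demo’s code: the phrases, action names, and function names are all hypothetical, and how the transcript string is obtained from the demo is left to the integration.

```python
import re

# Hypothetical phrases and action names for a display application.
COMMANDS = {
    "brightness up": "BRIGHTNESS_UP",
    "brightness down": "BRIGHTNESS_DOWN",
    "next input": "NEXT_INPUT",
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


def match_command(transcript, commands=COMMANDS):
    """Return the action for the first phrase found in the transcript."""
    cleaned = f" {normalize(transcript)} "
    for phrase, action in commands.items():
        # Pad with spaces so phrases match whole words only.
        if f" {phrase} " in cleaned:
            return action
    return None
```

Matching whole phrases after normalization tolerates the punctuation and capitalization that a speech model adds, at the cost of missing paraphrases; a larger application might prefer fuzzy matching.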
