FFmpeg 7.1
Since* 4.2

Automatic Speech Recognition

This filter uses PocketSphinx for speech recognition. To enable compilation of this filter, you need to configure FFmpeg with --enable-pocketsphinx.

It accepts the following options:

rate

Set the sampling rate of the input audio. Default is 16000. This needs to match the rate the speech models were trained on, otherwise recognition results will be poor.

hmm

Set directory containing acoustic model files.

dict

Set pronunciation dictionary.

lm

Set language model file.

lmctl

Set language model set.

lmname

Set which language model to use.

logfn

Set output for log messages.

The filter exports recognized speech as the frame metadata lavfi.asr.text.
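
As a rough illustration of how the options and the exported metadata fit together, the following Python sketch builds an ffmpeg command line that runs the filter and prints the lavfi.asr.text key through the ametadata filter. The helper names build_asr_command and extract_text, and all file paths, are hypothetical. The sketch assumes that the model paths contain no characters that need escaping in a filtergraph, and that ametadata is written to standard output with file=-.

```python
import subprocess


def build_asr_command(input_path, hmm_dir, dict_path, lm_path, rate=16000):
    """Return an ffmpeg argument list (hypothetical helper, not part of FFmpeg)."""
    # Options are joined with ':' as in any filter description.
    asr = f"asr=rate={rate}:hmm={hmm_dir}:dict={dict_path}:lm={lm_path}"
    # ametadata prints only the key exported by asr; file=- selects stdout.
    graph = f"{asr},ametadata=mode=print:key=lavfi.asr.text:file=-"
    # The null muxer discards the audio; only the printed metadata matters.
    return ["ffmpeg", "-i", input_path, "-af", graph, "-f", "null", "-"]


def extract_text(lines):
    """Collect the values of lavfi.asr.text from printed metadata lines."""
    prefix = "lavfi.asr.text="
    found = []
    for line in lines:
        pos = line.find(prefix)
        if pos != -1:
            found.append(line[pos + len(prefix):].strip())
    return found


def recognize(input_path, hmm_dir, dict_path, lm_path):
    """Run ffmpeg (must be built with --enable-pocketsphinx) and return the text."""
    cmd = build_asr_command(input_path, hmm_dir, dict_path, lm_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return extract_text(result.stdout.splitlines())
```

Calling recognize("speech.wav", "model/en-us", "model/cmudict-en-us.dict", "model/en-us.lm.bin") would then return the recognized utterances as a list of strings, provided those placeholder paths point at a matching set of PocketSphinx model files.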