I tried to recognize microphone input using Python

Published: 2022.07.19 Last updated: 2025.03.04

Ichikawa-chan

Hello, this is Ichikawa-chan.

I work in development at a company called Advanced Media.

I created a Python speech recognition program for myself, so I'd like to share it with you.

https://github.com/r-ichikawa-amivoice/ami_speechrecognizer_py

The implementation is a bit rough, so please treat it as a reference only.

I have run it in the following environment.

OS: Ubuntu 18.04
Architecture: AMD64
GCC: 7.5.0

Procedure

1. Register with AmiVoice API

2. Download and run the program

How to run

1. Register for the AmiVoice API.

acp.amivoice.com

Once you have registered, copy your APPKEY.

2. Download and run the program.

You can start speech recognition with the following commands. Replace XXX with the APPKEY you copied in step 1.

$ git clone https://github.com/r-ichikawa-amivoice/ami_speechrecognizer_py
$ cd ami_speechrecognizer_py
$ bash run.sh a=XXX

After that, just speak into the microphone and the recognition results will be output.

Commentary

main.py supports the following parameters:

a  Specifies the APPKEY obtained from the AmiVoice API site.
r  Specifies the input source.
   "mic": input is taken from the microphone.
   Any other string is treated as an audio file name, and that file is recognized.
o  Specifies the log output destination.
   "console": the log is output to the console.
   "date": the log is output to a file named yyyy-MM-dd.txt.
   Any other string is treated as a file name, and the log is written to that file.
l  Specifies the log level. Logs at or above the specified level are output.
   "0": DEBUG - debugging log
   "1": INFO - normal log
   "2": WARN - warning log
   "3": ERROR - error log
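
To make the key=value style of these parameters concrete, here is a minimal sketch of how they could be parsed and turned into a logger. This is not the actual main.py: the names parse_args, log_destination, make_logger and LOG_LEVELS, as well as the default values, are my own assumptions for illustration.

```python
import datetime
import logging
import sys

# Hypothetical mapping from the "l" values above to Python logging levels.
LOG_LEVELS = {
    "0": logging.DEBUG,
    "1": logging.INFO,
    "2": logging.WARNING,
    "3": logging.ERROR,
}


def parse_args(argv):
    """Turn ["a=XXX", "r=mic"] into {"a": "XXX", "r": "mic"}."""
    params = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        params[key] = value
    return params


def log_destination(o_value, today=None):
    """Return None for console output, otherwise the log file name."""
    if o_value == "console":
        return None
    if o_value == "date":
        today = today or datetime.date.today()
        return today.strftime("%Y-%m-%d") + ".txt"
    # Any other string is used as the file name as-is.
    return o_value


def make_logger(params):
    """Build a logger from "o" and "l" (the defaults here are assumptions)."""
    logger = logging.getLogger("speechrecognizer_sketch")
    logger.handlers.clear()
    logger.setLevel(LOG_LEVELS[params.get("l", "1")])
    dest = log_destination(params.get("o", "console"))
    if dest is None:
        handler = logging.StreamHandler(sys.stdout)
    else:
        handler = logging.FileHandler(dest)
    logger.addHandler(handler)
    return logger
```

With this sketch, parse_args(["a=XXX", "o=console", "l=2"]) followed by make_logger(...) would give a console logger that only emits warnings and errors, which matches the "at or above the specified level" behavior described above.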

If you want to change the microphone

Currently, audio is captured from the input device named pulse.

If you want to change it, adjust the values around rec.audio_source.

To select a device, use rec.get_device() and specify the index of the device you want to use.
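
As a rough illustration of picking a device by name instead of hard-coding an index, here is a small sketch. I am assuming the device information is available as a list of dictionaries with "index", "name" and "maxInputChannels" keys (the shape PyAudio uses for its device info); the article does not say what rec.get_device() actually returns, so find_input_device and sample_devices are hypothetical.

```python
def find_input_device(devices, name):
    """Return the index of the first input-capable device whose name contains `name`."""
    for info in devices:
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and name.lower() in info["name"].lower():
            return info["index"]
    return None


# Hand-written sample entries standing in for real device information.
sample_devices = [
    {"index": 0, "name": "HDA Intel PCH: Analog (hw:0,0)", "maxInputChannels": 2},
    {"index": 1, "name": "HDMI 0", "maxInputChannels": 0},
    {"index": 2, "name": "pulse", "maxInputChannels": 32},
]

print(find_input_device(sample_devices, "pulse"))  # prints 2
```

Devices without any input channels (such as the HDMI output in the sample) are skipped, so the returned index always points at something that can record.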

To change the number of channels

Change the value of rec.audio_format["CHANNELS"].

However, since the AmiVoice API currently only supports one channel, it is safer not to change it.
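
The same one-channel limit matters when you pass an audio file instead of the microphone. As a small sketch, assuming the file is a WAV file (the article does not say which file formats the program accepts), you could check the channel count with Python's standard wave module before sending it. The helper names wav_channel_count and is_mono are my own.

```python
import wave


def wav_channel_count(path):
    """Return the number of channels declared in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()


def is_mono(path):
    """True if the WAV file has exactly one channel."""
    return wav_channel_count(path) == 1
```

If is_mono returns False, the audio would need to be reduced to one channel first.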

If you want to split the channels and pass only one of them to the API, you could add code like the following.

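from functools import partial

def parse(audio, length, obj, amivoice):
    # Split the recorded audio into per-channel data.
    data = obj.channel_parse(audio)
    # Send only the first channel; its length is the total divided by the channel count.
    amivoice.write(data[0], int(length/obj.audio_format["CHANNELS"]))

# Replace the recorder's write callback with the function above.
rec.recorder_write_func = partial(parse, obj = rec, amivoice = stt)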
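
The snippet relies on obj.channel_parse, whose implementation is not shown in this article. Just to illustrate the idea, here is a minimal sketch of what separating channels could look like, assuming the audio is interleaved 16-bit PCM (sample order L, R, L, R, ...). The function split_channels is hypothetical and is not the repository's code.

```python
import array


def split_channels(audio, channels):
    """Split interleaved 16-bit PCM bytes into one bytes object per channel."""
    samples = array.array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(audio)
    # Every `channels`-th sample, starting at offset ch, belongs to channel ch.
    return [samples[ch::channels].tobytes() for ch in range(channels)]


# Three stereo frames: left channel 1, 2, 3 and right channel -1, -2, -3.
stereo = array.array("h", [1, -1, 2, -2, 3, -3]).tobytes()
left, right = split_channels(stereo, 2)
print(array.array("h", left).tolist())  # prints [1, 2, 3]
```

Each channel ends up with the original byte length divided by the number of channels, which is why the snippet above passes length divided by CHANNELS to amivoice.write.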

Things to consider

I'm not sure whether the source in the /amivoice folder is the latest version.

If you want to update it, you can get the latest version from here:

acp.amivoice.com

Summary

It is now possible to perform speech recognition using Python.
