Tech blog
  • HOME
  • Blog
  • [I made it!] A convenient tool with speech recognition and generation AI - No-code creation with the AmiVoice library and Power Automate

[I made it!] A convenient tool with speech recognition and generation AI - No-code creation with the AmiVoice library and Power Automate

Published: 2025.08.27 Last updated: 2025.08.27

Spice Dog

Hello.
On a particularly hot day in June, we exhibited AmiVoiceAPI at a development exhibition.
Thank you very much to everyone who visited our booth.

It's great that we use it ourselves!Based on this concept, we would like to introduce the "voice memo using AmiVoiceAPI" that was used at the exhibition.

The theme is#GenerativeAI, #NoCode, #Free  で す

Now, I would like to take notes of the requests and valuable comments from visitors so I don't forget them, but above all, I would like to talk to as many people as possible, so I would like to save time writing things down and using the computer.
So, record your voice on your smartphone and post it as an article on Teams!
The steps are as follows:

  • Record your voice with iPhone's voice memo and save it to the file server from the venue.
  • Convert to text using AmiVoiceAPI
  • Generative AI creates text and masks confidential information
  • Post to Microsoft Teams

I implemented this process using Power Automate Desktop (PAD), which is free for anyone with Windows.
Anyway, I'll show you what I posted.

  • Displaying Microsoft Teams channels on an Apple smartphone
  • The top row shows text generated using generative AI. The title is also automatically generated.
  • The bottom row is the text as it is recognized by AmiVoiceAPI

This system was created using no-code. Here are some tips on how to build it.

1. Using Power Automate Desktop

To automate a series of tasks, we created a new "flow" in Power Automate Desktop, which runs programs and batch files continuously.

One thing that the free version of Power Automate Desktop cannot do is trigger the execution of flows.
If there was a trigger, I would have executed the flow by "uploading an audio file," but this time I registered the Power Automate Desktop flow in the task scheduler on the computer I left at work, and set it to start periodically and check whether a new audio file had been generated.

2. Record audio on the spot with the iPhone's native Voice Memo

Since the smartphone I use for work is an iPhone, I used the native iOS "Voice Memos" app.


iPhone voice memos use the efficient compression format m4a as the audio file format, but AmiVoiceAPI does not currently use this format, so we included a flow in Power Automate Desktop that uses FFMpeg installed on the computer to convert from m4a to mp3 format.

3. Voice recognition using AmiVoiceAPI is a sample program provided by AmiVoice.

The voice recognition is a sample program using the client library provided by AmiVoide.WrpSimpleTesterThis can be achieved by using it as is.

We used the general-purpose engine a-general for speech recognition, but AmiVoiceAPI offers engines that are optimal for medical, financial, and insurance applications, so please choose the engine that best suits your needs.
Voice recognition servicesAmiVoice homepageYou can create an account and start using it anytime. It's free for up to 60 minutes per month.

4. GPT4.1 was chosen as the generating AI

We used the AzureOpenAI cloud API to automatically generate bullet points and titles.
Please select your LLM as per your preference.
Each model has its own characteristics, so we will investigate how well they work with AmiVoiceAPI voice recognition results and explain them in the next issue.
The instructions for this generation AI are here!

"You are a helpful assistant. Please list the following in bullet points. For the first line of your bullet points, create a concise title that fits the sentence and display it between square brackets. If the first line contains characters that resemble a password or PIN code, mask them with asterisks."

5. Posting to Teams is HTTP POST using adaptive cards

Using Teams' Adaptive Card template, you can easily create messages decorated with the appropriate design, font, etc. If you create a Webhook URL (destination endpoint) in the destination channel, you can post using HTTP POST.

After posting to a channel, you can easily notify team members using the mention function.

bonus

I tried using a voice recognition + generation AI to post something like this.

-How would a generative AI help a Kansai resident's troubled day?

ー原文-
今日はな、朝からお客さんとこに直行して、「トリリンガルスパーク」っちゅう新製品のプロモーションする予定
やってんけど、人身事故で電車がめっちゃ遅れてるって分かってな、「あかん、これ間に合わへんわ」
ってなって、急きょリモートで打ち合わせできへんか提案してん。

そしたらな、急なお願いやったのに、営業さんと技術の人、合わせて3人も参加してくれて、
ほんまにありがたいことに、いろいろ貴重な意見もろたわ。

ほんで午後は、電車の遅れも解消してたから、有明でやってたCRMの展示会に行ってきてん。
パートナーの人から最新の情報も聞けて、なかなかええ収穫やったで。

明日は、いつも通りオフィスに出社する予定やわ。

Posting results 
- At the same time as losing the Kansai vibe, the sense of having overcome trouble seems to have faded...

That's all for this article. Thank you for reading this far.
Other tech blogs are also interesting, so if you're interested, take a look at the archives.

Person who wrote this article

  • Spice Dog

    A brown-colored engineer with a deep love for Shiba Inu dogs and Indian curry, he decided to take a fresh start and dive into the world of voice recognition, driven by a desire to add some spice to his daily life.
    Recently, I noticed that my dog ​​can distinguish the engine sounds of Subaru cars, and I've been secretly daydreaming about whether a dog's hearing ability could be used in voice recognition technology.

     
Use API for Free