Tech blog
-
2026.02.19 AmiVoice API Update: End-to-End ASR–Ready "Keyword Boosting"
This article explains the mechanism and practical use of keyword boosting ("word emphasis"), which is now available in end-to-end (E2E) speech recognition. It covers how its behavior differs from hybrid speech recognition, the key points for setting parameters, and tips for improving accuracy. It's a must-read for anyone who wants to get the most out of E2E.
-
2026.01.30 Easily synthesize subtitles into videos! Subtitle workflow created with speech recognition API
A must-read for anyone who wants to easily add subtitles to videos! In just four steps, we'll show you how to go from extracting the audio to synthesizing the subtitles and watching the finished video. Using an actual prime ministerial press conference as the example, we'll carefully explain how to generate subtitles with a speech recognition API.
-
2026.01.16 AmiVoice's word registration API gives you more freedom in speech recognition!
One of the features of AmiVoice API is the "word registration function." Did you know there is an API for this function? The word registration API allows for flexible word registration, including integration into apps and management of multiple profiles (dictionaries). This article explains, in an easy-to-understand way, how to use the word registration API and the scenarios where it is useful.
-
2025.12.22 Easily develop speech recognition apps with Dify x AmiVoice API
Why not try using the low-code environment Dify to efficiently implement a speech recognition app without complex code? We'll show you how to intuitively build long-form speech recognition, Slack notifications, and LLM integration using the AmiVoice API. This content is also useful for non-developers.
-
2025.12.01 What Is Speech Segment Detection?
Are you familiar with "voice activity detection," a technique that detects only the segments where a person is speaking? Not only does it filter out noise and hold music, improving recognition accuracy, but it also means you pay only for the time someone is actually speaking. We'll explain, in an easy-to-understand way, this new perspective on choosing a speech recognition service.
-
2025.10.28 Building a serverless web app with AWS
An AWS infrastructure engineer, inspired by serverless technology, built a web app with a login function using AWS services, Vue3, and in-house GPT. Running costs were as low as 3 yen per month. This practical account provides a realistic look at the appeal and challenges of serverless, including comparisons with legacy configurations and actual costs.
-
2025.09.08 We exhibited at "AWS Summit Japan 2025"
We exhibited AmiVoice API at "AWS Summit Japan 2025," held in June. The booth introduced our latest solution combining speech recognition and generative AI and attracted the attention of many visitors. The exhibit in the "Generative AI Course" tour was particularly successful. Here's a summary of the event.
-
2025.08.27 [I made it!] A convenient tool with speech recognition and generative AI - No-code creation with the AmiVoice library and Power Automate
I wanted to turn conversations with customers at exhibitions into text notes and share them! With that in mind, I created a free, no-code system using iPhone voice memos. This is a practical report on how I used Power Automate Desktop to automate the process, from converting speech to text, to creating titles using generative AI, and posting them to Teams.
-
2025.07.29 [We Tested It!] How does speech recognition accuracy change with sampling rate and compression rate?
When using a speech recognition service, have you ever wondered, "Won't recognition accuracy drop unless the audio is high quality?" or "I want to reduce the file size, but how much compression is safe?" This time, we actually tested the effects of sampling rate and compression rate on speech recognition accuracy.
-
2025.07.02 [AmiVoice API Private SDK] How to write specific rule grammar from an application developer's perspective (Numerical Input Edition)
AmiVoice's "Rule Grammar" recognizes only expressions that follow predetermined grammar rules. We will use numerical input as an example to show how it can be used in actual application development.
-
2025.05.20 [AmiVoice API Private SDK] Creating a "Rule Grammar" for Advanced Users [Advanced Edition]
We will explain advanced techniques for "rule grammar" that can be used with AmiVoice API Private SDK, such as "repetition," "private rules," and "tags."
-
2025.03.11 Prerequisites for developing systems using speech recognition <Part 2> - Development Know-how Series 2 -
From the perspective of a speech recognition system developer, we will explain what is useful to know before introducing speech recognition. This time, we will cover the prerequisites for introducing speech recognition (Part 2).
Most viewed articles
- A quick explanation of how voice recognition works!
- Comparing the speech recognition rates of OpenAI's Whisper and AmiVoice for "conference" audio
- How to use the AmiVoice API free coupon
New columns
- AmiVoice API Update: End-to-End ASR–Ready "Keyword Boosting"
- Easily synthesize subtitles into videos! Subtitle workflow created with speech recognition API
- AmiVoice's word registration API gives you more freedom in speech recognition!
Category list
- Introduction to Speech Recognition (15)
- How to improve voice recognition accuracy (12)
- I tried developing it (27)
- How to use AmiVoice API (25)
- Comparison and Verification (5)
- Others (9)
