AmiVoice SDK

With more than 5,000 licenses installed, AmiVoice SDK is used for voice input and speech-to-text conversion in a wide range of scenarios thanks to its high recognition accuracy. The library also provides audio recording functions, which makes it easy to develop native apps for various operating systems. When linked with the speech recognition server (AmiVoice API Private), it can also operate regardless of device specifications.

AmiVoice SDK Features

  • Available offline
    More secure operation

    In a standalone setup, speech recognition works even in environments where connecting to the Internet is difficult for security reasons, such as medical settings.
  • Word and command recognition also possible
    For voice input and voice control

    It is also ideal for recognizing combinations of product names and numbers, numerical values, letters, and commands such as "confirm" and "next" (rule grammar recognition). Naturally, it also recognizes sentences with high accuracy.
  • Noise-resistant engine
    For developing outdoor/factory applications

    The engine is highly customizable: recognition parameters can be adjusted to suit the site, and technical terms can be recognized by registering them in advance. It can be embedded in devices used at manufacturing sites and in outdoor inspection and maintenance work, and it can also be used in app development.
  • Extensive support options so you can develop with confidence

    For technical inquiries about how to use the SDK, our technical staff will support you directly, in Japanese. We also provide training on how to use the SDK upon request.

Choose the speech recognition type that suits your needs

  • Type 1: Rule grammar (word/command) recognition

    It recognizes only the fixed phrases and words that you have set up in advance. Use it when you want to operate a device by voice or enter predefined words and phrases by voice.
  • Type 2: Dictation (text) recognition

    It converts free-form speech that falls outside the rule grammar, whether "spoken language" or "written language," into text. It outputs the text calculated to be closest to what was spoken.

    * Standalone recognition speed depends on device specifications.
    * If you prioritize recognition speed, we recommend using AmiVoice API Private.

  • Customizable engine

    In addition to the "general-purpose engine" that can be used in a variety of situations and businesses, we offer engines specialized in technical terms and industry terminology, such as those of the medical field. We can also customize existing engines for our customers to achieve the highest possible recognition rate.
  • Speaker diarization function

    For audio containing multiple speakers, this function identifies who spoke and when. It identifies and labels speakers without prior training.
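
To make the difference between the two recognition types more concrete, here is a minimal text-level sketch of the idea behind Type 1 (rule grammar) recognition: only utterances that fit a grammar defined in advance are accepted, and anything else is rejected. This is not the AmiVoice SDK API. The grammar contents, the function name match_rule_grammar, and the result format are all hypothetical, chosen only for illustration, and the sketch works on already-transcribed text rather than on audio.

```python
# Illustrative only: a toy rule grammar applied to text, not to audio.
# All names and vocabulary below are hypothetical, not part of AmiVoice SDK.

COMMANDS = {"confirm", "next"}               # single-word voice commands
PRODUCTS = {"bolt", "washer"}                # pre-registered product names
NUMBERS = {"one": 1, "two": 2, "three": 3}   # spoken numbers mapped to values


def match_rule_grammar(utterance):
    """Return a structured result if the utterance fits the grammar, else None."""
    words = utterance.lower().split()

    # Rule 1: a single command word such as "confirm" or "next".
    if len(words) == 1 and words[0] in COMMANDS:
        return {"type": "command", "value": words[0]}

    # Rule 2: a product name followed by a quantity, e.g. "bolt two".
    if len(words) == 2 and words[0] in PRODUCTS and words[1] in NUMBERS:
        return {"type": "entry", "product": words[0], "quantity": NUMBERS[words[1]]}

    # Anything outside the grammar is rejected; free-form speech like this
    # is what Type 2 (dictation) recognition is intended for.
    return None
```

In this sketch, match_rule_grammar("bolt two") yields a structured entry, while a free-form sentence such as "please send the report" returns None. Restricting the accepted input to a small, known set is what makes this style suited to voice control and fixed-vocabulary input.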
