
[AmiVoice API Private SDK] How to write specific rule grammar from an application developer's perspective (Numerical Input Edition)

Published: 2025.07.02 Last updated: 2025.08.29

Hello everyone.
There are three articles about rule grammar for AmiVoice: Basic, Practical, and Advanced. This article shows concretely how to apply the content of those three articles in application development, using numerical input as an example.

*For an explanation of the rule grammar file itself, including how to write the header, the grammar name definition, and public rules, please read the following article. Here, we focus on how to write the public rules used for speech recognition.

Recognizing single-digit numbers

For example, if you want to write a rule grammar that recognizes only a single digit from 0 to 3, you can write it in parallel as follows:

Written in series, as four separate public rules, it looks like this:

Both the parallel and the serial versions recognize the single digits "zero," "one," "two," and "three," but combining them into one parallel public rule is clearly better than writing a separate serial public rule for each number you want to recognize. To recognize single-digit numbers from 0 to 9, simply extend the parallel example.

Other examples:

If you want to recognize counting in which "1", "2", "3" may each be spoken with more than one reading, you can write the rule grammar as follows. The parts enclosed in [ ] are optional.

Recognizing multi-digit numbers

When writing a rule grammar for speech recognition of phone numbers that covers both 10-digit and 11-digit numbers, the grammar combines three public rules and one private rule, as shown below.

The three public rules allow speech recognition of the three utterance patterns "○○ no ○○○○ no ○○○○," "○○○○ no ○○○○ no ○○○○," and "○○○○ no ○○○○ no ○○○○," whether or not the particle "no" between the digit groups is spoken.
This phone-number rule grammar has two key points: the parts that are identical across the public rules are made into a private rule so that they become a common component, and only numeric utterances with the expected number of digits are accepted.
If you do not create common components using private rules, many similar statements end up in the public rules, and a mistake in one of them takes a lot of effort to correct everywhere it is repeated.

A rule grammar that recognizes 10-digit and 11-digit numbers can also be written with the repeat symbol "+", as follows, but this version recognizes any number with one or more digits. So even if you stop speaking at "ichi ni san", speech recognition succeeds and "1 2 3" is output as the speech recognition result. The application wants a phone number but may receive the three-digit result "1 2 3", so the application must implement error handling for results whose digit count is not that of a phone number.
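
The error handling described above might look like the following on the application side. This is a minimal Python sketch, assuming the recognition result arrives as a space-separated digit string such as "1 2 3" (the form quoted above). The function name `parse_phone_digits` is hypothetical and is not part of any AmiVoice API.

```python
from typing import Optional

# Phone numbers in this article have 10 or 11 digits.
VALID_LENGTHS = (10, 11)


def parse_phone_digits(result: str) -> Optional[str]:
    """Return the digits as one string, or None if they are not a phone number.

    `result` is assumed to be space-separated digits, e.g. "1 2 3".
    """
    digits = "".join(result.split())
    if not (digits.isascii() and digits.isdigit()):
        return None  # empty result, or something other than digits
    if len(digits) not in VALID_LENGTHS:
        return None  # e.g. the speaker stopped after "ichi ni san"
    return digits
```

With this check in place, the application can ask the user to repeat the number whenever `None` comes back, instead of passing a three-digit string on as if it were a phone number.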

Recognizing numbers with digit reading (place reading)

The examples above introduced rule grammars that recognize digit-by-digit reading, in which the number "123" is read as "ichi-ni-san." A rule grammar that recognizes place reading, in which "123" is read as "hyakunijusan," can be written as follows:

Place reading of the numbers 1 to 999 is structured as "hundreds place," "tens place," and "ones place," so each place is written as a private rule and referenced from the public rule. In place reading, 0 is spoken only on its own, as "zero" or "rei." For this reason, 0 is not included in the ones place. Instead, it is written in parallel with the numbers 1 to 999 so that it is recognized only when spoken alone.
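
To see why this structure covers exactly 1 to 999 and why 0 has to be handled separately, the composition can be modeled in a few lines of Python. This is only an illustration of the idea, not AmiVoice grammar syntax. The names `HUNDREDS`, `TENS`, `ONES`, and `place_reading_values` are hypothetical, and treating each place as optional is an assumption about how the private rules are combined.

```python
from itertools import product

# Each place is either unspoken (None) or one of nine values.
HUNDREDS = [None] + [h * 100 for h in range(1, 10)]  # 100 .. 900
TENS = [None] + [t * 10 for t in range(1, 10)]       # 10 .. 90
ONES = [None] + list(range(1, 10))                   # 1 .. 9 (no 0 here)


def place_reading_values():
    """Return every value reachable by combining the three places."""
    values = []
    for parts in product(HUNDREDS, TENS, ONES):
        spoken = [p for p in parts if p is not None]
        if spoken:  # at least one place must be spoken
            values.append(sum(spoken))
    return values


values = place_reading_values()
# Every number from 1 to 999 appears exactly once and 0 never appears,
# so 0 needs its own alternative next to the 1-999 rule.
assert sorted(values) == list(range(1, 1000))
```

If 0 were added to the ones place, combinations such as "hundred, zero" would also be accepted, which is not how numbers are spoken in place reading.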

The Basic, Practical, and Advanced rule grammar articles already published on the Tech Blog showed samples of the speech recognition results returned by AmiVoice API Private. The speech recognition result returned by AmiVoice SDK, by contrast, is a string like the following:

Example: Speech recognition result string when the utterance "hyakunijusan" (123) is recognized using the above public<0_999>

In this example, speech recognition was run on Windows, so "¥u0001" is the delimiter code. Expressed as a speech recognition result from AmiVoice API Private, it looks like this:

It can be separated into: 

If you want to treat the speech recognition result as a number, split the written string "100|20|3" into 100, 20, and 3, parse each part as an integer, and add them: 100 + 20 + 3 = 123. The application can then treat the spoken "hyakunijusan" as the integer 123.
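
The split-and-add step above can be sketched in Python as follows. The function name `written_to_int` is hypothetical, and the sketch assumes the written string uses "|" between place values exactly as in "100|20|3".

```python
def written_to_int(written: str) -> int:
    """Convert a written string such as "100|20|3" to an integer.

    Each "|"-separated part is parsed as an integer and the parts are added.
    """
    parts = [p for p in written.split("|") if p]
    if not parts:
        raise ValueError("empty recognition result")
    return sum(int(p) for p in parts)


print(written_to_int("100|20|3"))  # 123
```

A part that is not a number makes `int()` raise `ValueError`, so the caller can treat that case as a recognition error rather than silently producing a wrong value.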

Summary

Thank you for reading this far. There are two points to keep in mind when writing a rule grammar for real-world use: 

  • It is convenient to make descriptions that are repeated into a private rule and reference it from the public rules.
  • Write rule grammars that take into account actual operations.

When trying to implement voice input using a rule grammar, it is sometimes difficult to write a rule grammar that allows voice input exactly as you envisioned. Also, as you continue to use the system, you may find it necessary to change the content of the rule grammar. Combining private and public rules effectively to write a structured rule grammar will result in a good rule grammar that is easy to maintain and suitable for operation. So please give it a try.

Person who wrote this article

  • Katsuaki Miyauchi

    I'm a game developer who joined Advanced Media because I thought it would be interesting to combine games and voice recognition. However, I haven't had any game work for a while now, and I'm feeling a bit lonely.

     