[AmiVoice API Private SDK] Creating a simple "Rule Grammar" [Basic Edition]

Published: 2023.12.20 Last updated: 2025.05.19

t-ookura Takashi Okura

Hello everyone!

AmiVoice API Private and the SDK offer an engine that uses "rule grammar" (grammar recognition).

A rule grammar is a mechanism that "recognizes only expressions that follow predetermined grammar (rules)." You can set your own rules for which words and sentences are recognized. We explained this in a previous article, so please see that article for more details.

This article is the second in the "Rule Grammar" series. It explains the basics so that you can write a simple rule grammar yourself.

*Speech recognition with a rule grammar requires the "Rule Grammar Engine." If you are interested in developing with rule grammar, please apply for an individual consultation.

Prerequisite: About the rule grammar format

The rule grammar of AmiVoice API Private and the SDK supports two formats: JSGF (JSpeech Grammar Format, or Java Speech Grammar Format) and SRGS (Speech Recognition Grammar Specification). Both are formats for describing speech recognition grammars. Their notation differs slightly, but they can express essentially the same grammars.

The rest of this article uses JSGF. We explain the specification as needed; for details, please see the JSGF specification (in English). We also explain our own specifications and rules below.

The rule grammar is a text file, so you can create it using a general text editor. Save the file with the extension ".gram".

A simple example of a rule grammar

First, we will show a simple sample (Sample.gram) that only recognizes "AmiVoice". We will explain how to write a rule grammar based on this sample.

Line 1: "#JSGF V1.0 UTF-8;"

This is the header information for JSGF. The specifications are as follows:

"#JSGF V1.0" is a fixed string. The sample above declares the character encoding as "UTF-8". We recommend using "UTF-8" unless there are special circumstances. Also, if you specify UTF-8, don't forget to save the file itself in UTF-8 (without BOM).

The character encoding field can be omitted, but please do not omit it, as doing so may lead to problems such as speech not being recognized properly.
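The header rules above can be checked mechanically before a grammar is deployed. The following is a minimal sketch, not an official tool: `check_header` and its messages are hypothetical names, and the header pattern simply follows the "#JSGF V1.0 <encoding>;" form described in this section.

```python
import codecs
import re

# Hypothetical pattern for the header form described above; the encoding part is optional.
HEADER_RE = re.compile(r"^#JSGF V1\.0(?: (?P<encoding>\S+))?;\s*$")


def check_header(raw: bytes) -> list[str]:
    """Return a list of problems found in the header of a .gram file."""
    problems = []
    if raw.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 BOM")
        raw = raw[len(codecs.BOM_UTF8):]
    # The header itself is plain ASCII, so decode only the first line leniently.
    first_line = raw.split(b"\n", 1)[0].decode("ascii", errors="replace").rstrip("\r")
    match = HEADER_RE.match(first_line)
    if match is None:
        problems.append("first line is not a JSGF header")
        return problems
    encoding = match.group("encoding")
    if encoding is None:
        problems.append("character encoding is omitted")
        return problems
    try:
        # The whole file must decode with the encoding the header declares.
        raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        problems.append(f"file content does not decode as {encoding}")
    return problems
```

An empty list means the declared encoding and the actual bytes agree. A file saved as Shift_JIS but declared as UTF-8 would be reported, which is the mismatch this section warns about.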

Line 2: "grammar Sample;"

This is the definition of the grammar name. It has the following specifications:

The grammar name may contain only half-width alphanumeric characters and half-width underscores ("_"). Half-width hyphens cannot be used.

As a rule, we recommend using the file name without the extension as the grammar name.

The rule grammar engine can also load multiple gram files, but if the grammar names are the same, a conflict will occur and the engine will not recognize the files correctly. We recommend that you match the grammar name to the file name and save multiple grammar files in the same folder. However, grammar names cannot contain half-width hyphens, so if there is a half-width hyphen in the file name, please replace it with a half-width underscore.
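The naming advice above can be sketched as a small helper that derives a grammar name from a file name and reports names that would collide. The function names `grammar_name_for` and `conflicting_names` are hypothetical, introduced only for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Half-width alphanumerics and underscores only, as stated above.
GRAMMAR_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def grammar_name_for(path: str) -> str:
    """Derive a grammar name from a file name, replacing hyphens with underscores."""
    name = Path(path).stem.replace("-", "_")
    if not GRAMMAR_NAME_RE.fullmatch(name):
        raise ValueError(f"cannot derive a valid grammar name from {path!r}")
    return name


def conflicting_names(paths: list[str]) -> list[str]:
    """Return grammar names that more than one file would produce."""
    counts = Counter(grammar_name_for(p) for p in paths)
    return sorted(name for name, count in counts.items() if count > 1)
```

For example, `grammar_name_for("my-menu.gram")` gives `my_menu`. Note that "my-menu.gram" and "my_menu.gram" in different folders map to the same name, which is one reason to keep all grammar files in a single folder.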

Line 3: "public <sample> = AmiVoice\あみぼいす;"

This is a statement that defines the rule. It defines the phrase you want the speech recognition engine to recognize. It has the following specifications:

The leading "public" marks this as a "public rule." We will explain non-public rules in a future article, so for now, simply remember that you put "public" before the rule name.

The "rule name" must be enclosed in "<>". In the sample, the file name and grammar name are "Sample" with the first letter in uppercase, and the rule name is "sample" with the first letter in lowercase, but the rule name is unrelated to the file name and grammar name and can be named freely.

By varying the "rule definition" part, you can create a wide variety of rule grammars. We will explain the details in the next article, so for now, just remember to "write the word you want to recognize."

You can write alphabetic characters and kanji in the rule definition, but in that case you must specify the reading in hiragana or katakana, as is done for "AmiVoice" in the sample.

Put a half-width backslash ("\") between the "notation" and the "reading" (depending on the font, it may be displayed as "¥"). Do not put spaces before or after the backslash. This way of adding a reading is a specification unique to our rule grammar. For words that consist only of hiragana or katakana, you generally do not need to register the reading.

If you want to associate multiple readings with one notation, separate the readings with a half-width slash ("/"). For example, if you register both "Nippon" (にっぽん) and "Nihon" (にほん) as readings of "Japan" (日本), either pronunciation will be recognized as "Japan". Again, do not put spaces before or after the "/".
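The notation and reading syntax described above can be illustrated with a short sketch that assembles one entry. `format_entry` is a hypothetical helper, and the kana readings for "日本" are our own illustration of the "Nippon" / "Nihon" example.

```python
def format_entry(notation: str, readings: list[str]) -> str:
    # Builds "notation", or "notation", a backslash, and the readings joined by "/".
    if not readings:
        # Hiragana-only or katakana-only words generally need no reading.
        return notation
    for part in [notation, *readings]:
        # Spaces next to the backslash or slash are not allowed, so reject them early.
        if not part or part != part.strip():
            raise ValueError(f"empty value or surrounding space in {part!r}")
    return notation + "\\" + "/".join(readings)


print(format_entry("日本", ["にっぽん", "にほん"]))
```

The printed entry has no spaces around either separator, which is the form the rule definition expects.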

The only characters that can be used in a reading are hiragana, katakana, "ー" (the long vowel mark), and "." (half-width period). We also recommend writing the reading exactly as it is actually pronounced. See the example below:

"Tokyo" is written in hiragana as "toukyou," but the actual pronunciation is probably "to-kyo-." AmiVoice has an "automatic reading conversion function," which treats the aforementioned "to-kyo-" and "toukyou" as the same thing. However, there is a possibility of unintended automatic conversion, so it is safer to write it as it is actually pronounced.

Another thing to note is that the hiragana "は" (ha) and "へ" (he) can also be read as "wa" and "e", but AmiVoice always treats them as "ha" and "he". That is, if you want them read as "wa" or "e", you need to register the reading even if the word is written only in hiragana. For example, a rule grammar that recognizes only the greeting "hello" (こんにちは) is:

If you forget to write the reading, the word is treated as "ko-n-ni-chi-ha", and an utterance of "hello" pronounced "ko-n-ni-chi-wa" may not be recognized.
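Putting the pieces together, a grammar for the greeting could be generated as in the sketch below. This is an illustration under assumptions, not the original sample: the grammar name "Hello", the rule name "hello", the helper names, and the exact Unicode ranges used for "hiragana, katakana, ー and ." are all our own choices.

```python
import re

# Assumed code-point ranges: hiragana, katakana, the long vowel mark, and the half-width period.
READING_RE = re.compile(r"[\u3041-\u3096\u30A1-\u30FA\u30FC.]+")


def entry(notation: str, reading: str) -> str:
    """Join notation and reading with a backslash after validating the reading."""
    if not READING_RE.fullmatch(reading):
        raise ValueError(f"reading contains unsupported characters: {reading!r}")
    return f"{notation}\\{reading}"


def build_grammar(grammar_name: str, rule_name: str, rule_definition: str) -> str:
    """Assemble the three-line grammar: header, grammar name, one public rule."""
    lines = [
        "#JSGF V1.0 UTF-8;",
        f"grammar {grammar_name};",
        f"public <{rule_name}> = {rule_definition};",
    ]
    return "\n".join(lines) + "\n"


# Register the reading "こんにちわ" so that the final "は" is pronounced "wa".
hello_gram = build_grammar("Hello", "hello", entry("こんにちは", "こんにちわ"))
print(hello_gram, end="")
```

When saving the result, encoding the text explicitly (for example `hello_gram.encode("utf-8")`) keeps the file free of a BOM.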

For more information about word pronunciation in AmiVoice, please see the article below.

How to configure speech recognition using rule grammars

Once you have created a rule grammar, try it out with speech recognition. This section explains the settings for the synchronous HTTP interface.
*As mentioned at the beginning, speech recognition with a rule grammar requires the "Rule Grammar Engine."

When specifying a rule grammar for the synchronous HTTP speech recognition API, the rule grammar must be placed on a web server or similar. Then, in the place where you normally specify the connection engine name, specify a string consisting of the connection engine name and the rule grammar URL joined by "|". See also the API documentation.

Specifically, in the link above:

would be rewritten as follows:

For example, if the engine name is "-a-rule-input" and the rule grammar URL is "http://dummy.com/grammars/Sample.gram", specify it as follows:
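Joining the two values is simple enough to sketch in code. `engine_with_grammar` is a hypothetical helper; how the resulting string is passed in the HTTP request is described in the API documentation and is not shown here.

```python
def engine_with_grammar(engine_name: str, grammar_url: str) -> str:
    """Join the connection engine name and the rule grammar URL with '|'."""
    if "|" in engine_name or "|" in grammar_url:
        # A stray '|' would make the combined value ambiguous.
        raise ValueError("engine name and URL must not contain '|'")
    return f"{engine_name}|{grammar_url}"


print(engine_with_grammar("-a-rule-input", "http://dummy.com/grammars/Sample.gram"))
```

For the engine name and URL in this example, the sketch prints `-a-rule-input|http://dummy.com/grammars/Sample.gram`.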

If Sample.gram is written correctly to recognize only "AmiVoice" and an audio file of "AmiVoice" is input, the following output will be obtained (the output has been formatted):

You can see that it has been correctly recognized as "AmiVoice". When a reading is specified, the notation (AmiVoice) is output for "written" and the reading (the kana registered for "AmiVoice") for "spoken". For hiragana and katakana words with no reading specified, the notation as written in the rule definition is output for both "written" and "spoken". For more information on how to interpret the recognition results, please see the documentation.

Next, let's use the same Sample.gram to recognize an utterance other than "AmiVoice." An example of the output is as follows.

The above recognition result was obtained by using the same Sample.gram to recognize an audio file in which I said "Hello." This rule grammar can only recognize "AmiVoice," so nothing will be output to "tokens" for any other utterances.
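On the client side, the two cases can be told apart by checking whether any tokens came back. The sketch below assumes a JSON response with a top-level "results" list whose items each hold a "tokens" list of objects with "written" and "spoken" fields. Only those three field names appear in this article. The rest of the shape, the helper name, and the two trimmed response strings are assumptions for illustration, not real engine output.

```python
import json


def extract_tokens(response_text: str) -> list[tuple[str, str]]:
    """Return (written, spoken) pairs from a recognition response."""
    data = json.loads(response_text)
    pairs = []
    for result in data.get("results", []):
        for token in result.get("tokens", []):
            pairs.append((token.get("written", ""), token.get("spoken", "")))
    return pairs


# Illustrative, heavily trimmed responses (assumed shape).
matched = '{"results": [{"tokens": [{"written": "AmiVoice", "spoken": "あみぼいす"}]}]}'
unmatched = '{"results": [{"tokens": []}]}'

for label, body in (("matched", matched), ("unmatched", unmatched)):
    tokens = extract_tokens(body)
    print(label, tokens if tokens else "no tokens: utterance is outside the grammar")
```

An empty list is the signal that the utterance did not match any rule in the grammar.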

Troubleshooting

If you input an audio file uttering "AmiVoice" into the Sample.gram above and no recognition results are output, there may be an error in the rule grammar. Some of these were mentioned earlier in the article, but the following are some common mistakes you can make.

  • Does the character encoding specified in the header match the character encoding of the file itself?
  • Have you forgotten the ";" (semicolon) at the end of a line?
  • Have you put spaces before or after the "\" or "/" used to specify readings?
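The semicolon and spacing checks in this list lend themselves to a quick automated pass. The following is a rough sketch with a hypothetical `lint_grammar` function. It assumes the simple one-statement-per-line style used in this article and would give false warnings on grammars that contain comments or other uses of "/".

```python
import re

# Whitespace directly before or after a backslash or slash.
SPACE_AROUND_SEPARATOR = re.compile(r"\s[\\/]|[\\/]\s")


def lint_grammar(text: str) -> list[str]:
    """Return warnings for missing semicolons and spaces around separators."""
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored
        if not stripped.endswith(";"):
            warnings.append(f"line {number}: missing ';' at end of line")
        if SPACE_AROUND_SEPARATOR.search(stripped):
            warnings.append(f"line {number}: space next to '\\' or '/'")
    return warnings


broken = "#JSGF V1.0 UTF-8;\ngrammar Sample\npublic <sample> = 日本\\にっぽん / にほん;\n"
for warning in lint_grammar(broken):
    print(warning)
```

Running it on the deliberately broken text reports the missing semicolon on line 2 and the stray spaces on line 3.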

Final thoughts

In this article, we have explained the basics of how to write rule grammar. We plan to explain more practical ways to write rule grammar in subsequent articles, so please stay tuned.

Rule grammar is a feature provided by AmiVoice API Private and the SDK, but if you are a developer who wants to try out speech recognition more easily, please give AmiVoice API a try at https://acp.amivoice.com/. You can use all engines for free for up to 60 minutes of audio per month.

Thank you for reading this far!

Person who wrote this article

  • Takashi Okura

    He joined Advanced Media as a new graduate.
    With experience in research and development to improve the accuracy of voice recognition, he is currently working on projects that will promote Advanced Media's technological capabilities, including this blog.
    His hobbies include traveling (mainly by train), reading (mainly novels), and board games.

     