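Trying Out the Asynchronous HTTP Speech Recognition API - AmiVoice Cloud Platform

This walkthrough uses the curl command to call the asynchronous HTTP speech recognition API. The examples use the audio file included with the sample programs.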

Download

Download the sample programs. Only the bundled audio file is used below.

Arguments

The examples use the following arguments.

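Speech recognition API endpoint:
 https://acp-api-async.amivoice.com/v1/recognitions
(The endpoint is the same with or without log retention.)
Audio file:
 test.wav (bundled with the sample programs)
Audio format:
 16K
Connection engine name:
 -a-general (会話_汎用, general-purpose conversation)
<AppKey>:
 Specify the [APPKEY] shown on your My Page.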

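Submitting a speech recognition request

The parameters for a speech recognition request are exactly the same as for the synchronous HTTP speech recognition API.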

$ curl -X POST -F a=@../../audio/test.wav "https://acp-api-async.amivoice.com/v1/recognitions?d=-a-general&u=<AppKey>"
{"sessionid":"017c25ec12c00a304474a999","text":"..."}
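The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `submit` are hypothetical, the multipart body is assembled by hand to mirror what `curl -F a=@...` sends, and the part's `Content-Type` of `application/octet-stream` is an assumption.

```python
import json
import urllib.parse
import urllib.request
import uuid

ENDPOINT = "https://acp-api-async.amivoice.com/v1/recognitions"


def build_request(audio: bytes, engine: str, app_key: str) -> urllib.request.Request:
    """Build a POST like the curl example: d and u in the query, audio in form field 'a'."""
    query = urllib.parse.urlencode({"d": engine, "u": app_key})
    boundary = uuid.uuid4().hex  # random multipart boundary
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="a"; filename="test.wav"\r\n',
        b"Content-Type: application/octet-stream\r\n\r\n",  # assumed part type
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(
        f"{ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


def submit(audio: bytes, engine: str, app_key: str) -> str:
    """Send the request and return the sessionid from the JSON response."""
    with urllib.request.urlopen(build_request(audio, engine, app_key)) as resp:
        return json.load(resp)["sessionid"]
```

For example, `submit(open("test.wav", "rb").read(), "-a-general", "<AppKey>")` would return the sessionid used in the following steps.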

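Getting the job status

Use the sessionid returned by the request to retrieve the job's status and result. You will need to run this command repeatedly until the recognition result is available. Specify your [APPKEY] in the Authorization header.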

$ curl -H "Authorization: Bearer <AppKey>" https://acp-api-async.amivoice.com/v1/recognitions/017c25ec12c00a304474a999
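Running the command repeatedly can be automated with a small polling loop. The following is a minimal sketch under stated assumptions: `fetch_job` and `wait_for_completion` are hypothetical helper names, the 5-second interval and 120-poll limit are arbitrary defaults, and treating any status other than the four described on this page ("queued", "started", "processing", "completed") as an error is an assumption rather than documented behavior.

```python
import json
import time
import urllib.request

ENDPOINT = "https://acp-api-async.amivoice.com/v1/recognitions"
IN_PROGRESS = ("queued", "started", "processing")


def fetch_job(session_id: str, app_key: str) -> dict:
    """GET the job resource, passing the APPKEY as a Bearer token."""
    req = urllib.request.Request(
        f"{ENDPOINT}/{session_id}",
        headers={"Authorization": f"Bearer {app_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_completion(fetch, interval=5.0, max_polls=120, sleep=time.sleep) -> dict:
    """Call fetch() until the job reports status "completed".

    fetch is a zero-argument callable returning the job as a dict;
    interval and max_polls are illustrative defaults, not recommended values.
    """
    for _ in range(max_polls):
        job = fetch()
        status = job.get("status")
        if status == "completed":
            return job
        if status not in IN_PROGRESS:
            # Assumption: anything outside the four described states is a failure.
            raise RuntimeError(f"unexpected job status: {job!r}")
        sleep(interval)
    raise TimeoutError("job did not complete within the polling budget")
```

A call such as `wait_for_completion(lambda: fetch_job("017c25ec12c00a304474a999", "<AppKey>"))` returns the final job dictionary. Passing `fetch` and `sleep` as parameters keeps the loop easy to exercise without network access.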

An accepted job is placed in a queue and processed in order. The response from the result retrieval endpoint reports the job's state.
Immediately after the request is sent, status is "queued".

{"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"queued"}

When the job is taken off the queue, status becomes "started".

{"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"started"}

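When speech recognition actually begins, status becomes "processing". At this point you can use the size of the received audio and its MD5 checksum to check that the audio you sent is being handled correctly. How long the job stays in the "processing" state depends on the length of the audio.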

{"audio_md5":"40f59fe5fc7745c33b33af44be43f6ad","audio_size":306980,"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"processing"}
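The size and checksum comparison can be sketched as follows. This assumes that `audio_size` is the byte length of the uploaded file and that `audio_md5` is the hexadecimal MD5 digest of those same bytes; the page does not spell this out, and the helper name `audio_matches` is hypothetical.

```python
import hashlib


def audio_matches(audio: bytes, job: dict) -> bool:
    """Compare local audio bytes with audio_size and audio_md5 from the job response.

    Assumes audio_md5 is the hex MD5 of the raw uploaded bytes.
    """
    return (
        len(audio) == job.get("audio_size")
        and hashlib.md5(audio).hexdigest() == job.get("audio_md5")
    )
```

Called with the bytes of the uploaded file and the parsed job response, it returns False when either field differs or is not yet present in the response.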

When recognition finishes, status becomes "completed". The recognition result is then available in results.

{"audio_md5":"40f59fe5fc7745c33b33af44be43f6ad","audio_size":306980,"results":{"code":"","message":"","results":[{"confidence":1.0,"endtime":9591,"rulename":"","starttime":0,"tags":[],"text":"アドバンスト・メディアは、人と機械等の自然なコミュニケーションを実現し、豊かな未来を創造していくことをめざします。","tokens":[{"confidence":1.0,"endtime":1578,"spoken":"あどばんすとめでぃあ","starttime":570,"written":"アドバンスト・メディア"},{"confidence":1.0,"endtime":1850,"spoken":"は","starttime":1578,"written":"は"},{"confidence":0.77,"endtime":2010,"spoken":"_","starttime":1850,"written":"、"},{"confidence":1.0,"endtime":2314,"spoken":"ひと","starttime":2010,"written":"人"},{"confidence":1.0,"endtime":2426,"spoken":"と","starttime":2314,"written":"と"},{"confidence":1.0,"endtime":2826,"spoken":"きかい","starttime":2426,"written":"機械"},{"confidence":0.76,"endtime":2922,"spoken":"とう","starttime":2826,"written":"等"},{"confidence":1.0,"endtime":3082,"spoken":"の","starttime":2922,"written":"の"},{"confidence":1.0,"endtime":3434,"spoken":"しぜん","starttime":3082,"written":"自然"},{"confidence":1.0,"endtime":3530,"spoken":"な","starttime":3434,"written":"な"},{"confidence":1.0,"endtime":4362,"spoken":"こみゅにけーしょん","starttime":3530,"written":"コミュニケーション"},{"confidence":1.0,"endtime":4442,"spoken":"を","starttime":4362,"written":"を"},{"confidence":1.0,"endtime":4906,"spoken":"じつげん","starttime":4442,"written":"実現"},{"confidence":1.0,"endtime":5242,"spoken":"し","starttime":4906,"written":"し"},{"confidence":0.83,"endtime":5642,"spoken":"_","starttime":5242,"written":"、"},{"confidence":1.0,"endtime":5978,"spoken":"ゆたか","starttime":5642,"written":"豊か"},{"confidence":1.0,"endtime":6090,"spoken":"な","starttime":5978,"written":"な"},{"confidence":1.0,"endtime":6490,"spoken":"みらい","starttime":6090,"written":"未来"},{"confidence":1.0,"endtime":6554,"spoken":"を","starttime":6490,"written":"を"},{"confidence":0.92,"endtime":7034,"spoken":"そうぞう","starttime":6554,"written":"創造"},{"confidence":1.0,"endtime":7210,"spoken":"して","starttime":7034,"written":"して"},{"confidence":1.0,"endtime":7402,"spoken":"いく","starttime":7210,"written":"いく"},{"confidence":0.8,"endtime":7674,"spoken":"こと","starttime":7402,"written":"こと"},{"confidence":1.0,"endtime":7706,"spoken":"を","starttime":7674,"written":"を"},{"confidence":0.78,"endtime":7962,"spoken":"めざ","starttime":7706,"written":"めざ"},{"confidence":0.78,"endtime":8490,"spoken":"します","starttime":7962,"written":"します"},{"confidence":0.83,"endtime":8778,"spoken":"_","starttime":8490,"written":"。"}]}],"segments":[{"code":"","message":"","results":[{"confidence":1.0,"endtime":8778,"rulename":"","starttime":250,"tags":[],"text":"アドバンスト・メディアは、人と機械等の自然なコミュニケーションを実現し、豊かな未来を創造していくことをめざします。","tokens":[{"confidence":1.0,"endtime":1578,"spoken":"あどばんすとめでぃあ","starttime":570,"written":"アドバンスト・メディア"},{"confidence":1.0,"endtime":1850,"spoken":"は","starttime":1578,"written":"は"},{"confidence":0.77,"endtime":2010,"spoken":"_","starttime":1850,"written":"、"},{"confidence":1.0,"endtime":2314,"spoken":"ひと","starttime":2010,"written":"人"},{"confidence":1.0,"endtime":2426,"spoken":"と","starttime":2314,"written":"と"},{"confidence":1.0,"endtime":2826,"spoken":"きかい","starttime":2426,"written":"機械"},{"confidence":0.76,"endtime":2922,"spoken":"とう","starttime":2826,"written":"等"},{"confidence":1.0,"endtime":3082,"spoken":"の","starttime":2922,"written":"の"},{"confidence":1.0,"endtime":3434,"spoken":"しぜん","starttime":3082,"written":"自然"},{"confidence":1.0,"endtime":3530,"spoken":"な","starttime":3434,"written":"な"},{"confidence":1.0,"endtime":4362,"spoken":"こみゅにけーしょん","starttime":3530,"written":"コミュニケーション"},{"confidence":1.0,"endtime":4442,"spoken":"を","starttime":4362,"written":"を"},{"confidence":1.0,"endtime":4906,"spoken":"じつげん","starttime":4442,"written":"実現"},{"confidence":1.0,"endtime":5242,"spoken":"し","starttime":4906,"written":"し"},{"confidence":0.83,"endtime":5642,"spoken":"_","starttime":5242,"written":"、"},{"confidence":1.0,"endtime":5978,"spoken":"ゆたか","starttime":5642,"written":"豊か"},{"confidence":1.0,"endtime":6090,"spoken":"な","starttime":5978,"written":"な"},{"confidence":1.0,"endtime":6490,"spoken":"みらい","starttime":6090,"written":"未来"},{"confidence":1.0,"endtime":6554,"spoken":"を","starttime":6490,"written":"を"},{"confidence":0.92,"endtime":7034,"spoken":"そうぞう","starttime":6554,"written":"創造"},{"confidence":1.0,"endtime":7210,"spoken":"して","starttime":7034,"written":"して"},{"confidence":1.0,"endtime":7402,"spoken":"いく","starttime":7210,"written":"いく"},{"confidence":0.8,"endtime":7674,"spoken":"こと","starttime":7402,"written":"こと"},{"confidence":1.0,"endtime":7706,"spoken":"を","starttime":7674,"written":"を"},{"confidence":0.78,"endtime":7962,"spoken":"めざ","starttime":7706,"written":"めざ"},{"confidence":0.78,"endtime":8490,"spoken":"します","starttime":7962,"written":"します"},{"confidence":0.83,"endtime":8778,"spoken":"_","starttime":8490,"written":"。"}]}],"text":"アドバンスト・メディアは、人と機械等の自然なコミュニケーションを実現し、豊かな未来を創造していくことをめざします。"}],"text":"アドバンスト・メディアは、人と機械等の自然なコミュニケーションを実現し、豊かな未来を創造していくことをめざします。","utteranceid":"20210927/06/017c25ed38cc0a30425239d0_20210927_062436[nolog]"},"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"completed"}
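Reading the completed response can be sketched as below, based only on the field layout visible in the example above (`results.text`, and `results.segments[].results[].tokens[]` with `written` and `confidence`). The helper name `summarize` and the 0.8 confidence threshold are hypothetical choices for illustration.

```python
def summarize(job: dict, threshold: float = 0.8):
    """Return the full transcript and the tokens whose confidence is below threshold.

    Walks results -> segments -> results -> tokens, as laid out in the
    example "completed" response.
    """
    result = job["results"]
    low_confidence = [
        (token["written"], token["confidence"])
        for segment in result["segments"]
        for utterance in segment["results"]
        for token in utterance["tokens"]
        if token["confidence"] < threshold
    ]
    return result["text"], low_confidence
```

The first return value is the whole transcript; the second lists `(written, confidence)` pairs that may be worth reviewing by hand.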

For the format of the recognition result, see "Response" in "I/F Specification: Asynchronous HTTP Speech Recognition API Details".