How to Choose the Right Microphone - Speech recognition rate can change depending on the microphone!

Published: 2024.04.25 Last updated: 2025.03.04
Shogo Ando

Hello everyone.

The choice of microphone is an important factor in speech recognition. There are many different types of microphones, so you need to choose the one that best fits the purpose and environment of use.

If you prioritize speech recognition accuracy, it's best to choose a microphone that sits near your mouth. Having a microphone close to your mouth is like speaking close to the listener's ear: it reduces the influence of ambient noise and picks up your voice more clearly. This article focuses on these points and explains how to choose a microphone that will improve speech recognition accuracy.

*This article is the second installment in a series that explains five tips for improving speech recognition accuracy. For the first tip, please see "Using Appropriate Speech Patterns".

This article is recommended for:

  • Those who want to improve the recognition rate of speech recognition
  • Those who want to create a service using speech recognition or are concerned about the recognition rate

"Microphones used near the mouth" make speech easier to recognize

Below are four illustrations showing examples of microphone use. I've rated each one with ◎, 〇, or △ in terms of ease of speech recognition, but this is purely my subjective opinion, so please use it for reference only.

The two on the left, a headset and a handheld microphone, are both used quite close to the mouth, so they are highly recommended for speech recognition. The key point with a headset is that the microphone is always close to the mouth when worn, so it stays in a stable position even if you move a little. A handheld microphone is also recommended, but as you speak, the microphone may gradually drift away from your mouth and pick up your voice less well, so I've rated it one step lower than the headset.

The two on the right are examples where speech recognition is a little more difficult. In the second case from the right, the illustration shows the device on the table about 30 to 50 cm away, which makes it very susceptible to picking up surrounding noise. In the lecture on the far right, the microphone is even further from the speaker, so the speaker's voice is attenuated even more by the time it reaches the microphone. Also, if loudspeakers are used in the venue, the sound will be louder, but the speaker's direct voice, the sound from the loudspeakers, and the reflected sound may mix together and make the speech harder to recognize. When the sound source and microphone are far apart like this, speech recognition becomes more difficult.

Key Points

  • Ideally, only the sounds you want to recognize should be picked up by the microphone.
  • Having a microphone close to your mouth has the same effect as speaking close to the listener's ear.
  • The further the microphone is from your mouth, the more ambient noise it will pick up relative to your voice.
  • If you're in a low-noise environment, you don't need to worry too much about the distance to the microphone.

[Experiment] How much does the volume of a sound change when the distance between the microphone and your mouth changes?

I conducted an experiment to see how the volume of sound changes depending on the distance between the microphone and the mouth. I played the same audio from a smartphone placed 5 cm and 30 cm from a headset microphone and recorded it. The image below shows what it looked like.

The results showed that the average waveform amplitude (sound pressure) at 5 cm was about 4.8 times that at 30 cm, as shown below.

In this experiment, only a human voice was recorded in a quiet environment, so I think speech recognition would still work without much impact at a distance of about 30 cm. However, if there is noise in the surrounding area, a relatively small waveform such as the one recorded at 30 cm is easily drowned out by the noise, and speech recognition accuracy tends to decline. For this reason, it is important to bring your mouth closer to the microphone, especially in noisy environments.
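To put the reasoning above into numbers, here is a minimal sketch that converts the measured 4.8x amplitude ratio into decibels and shows what it does to the signal-to-noise ratio (SNR). The 30 dB starting SNR is a made-up value for illustration, and the sketch assumes the noise level at the microphone stays the same when the mouth moves away.

```python
import math


def to_db(amplitude_ratio: float) -> float:
    """Convert an amplitude (sound pressure) ratio to decibels."""
    return 20.0 * math.log10(amplitude_ratio)


# Measured in the experiment: RMS at 5 cm is about 4.8x the RMS at 30 cm.
measured_ratio = 4.8

# Hypothetical assumption: at 5 cm the voice is 30 dB above the room noise,
# and the noise reaching the microphone does not change with mouth distance.
snr_at_5cm_db = 30.0

level_drop_db = to_db(measured_ratio)
snr_at_30cm_db = snr_at_5cm_db - level_drop_db

print(f"Level drop from 5 cm to 30 cm: {level_drop_db:.1f} dB")  # 13.6 dB
print(f"SNR at 30 cm: {snr_at_30cm_db:.1f} dB")                  # 16.4 dB
```

Under these assumptions, moving from 5 cm to 30 cm costs roughly 13.6 dB of SNR. In a quiet room that margin may not matter, but in a noisy one it can be the difference between a clear voice and one buried in noise.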

Key Points

  • The closer you are to the microphone, the louder your recorded voice is and the smaller the surrounding noise becomes in comparison.

*The reading audio was played on an iPhone in the room and recorded with a microphone.
*A unidirectional headset was used as the microphone, and the sound source was placed directly in front of it, in the direction of highest sensitivity.
*The experiment was conducted without a person like the one in the illustration above, using only a sound source (iPhone) and a microphone (headset).
*The waveform magnitude values for comparison were calculated using RMS (root mean square).
*The results may vary depending on various conditions.
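For readers who want to repeat this kind of comparison, the RMS calculation mentioned in the notes can be sketched as follows. This is not the script used for the experiment. The two synthetic tones are stand-ins for the 5 cm and 30 cm recordings, and their amplitudes (0.48 and 0.10) are chosen only so that the ratio matches the measured value.

```python
import math


def rms(samples):
    """Root mean square of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sine(amplitude, freq_hz=440.0, sample_rate=16000, seconds=1.0):
    """Synthetic test tone standing in for a real recording."""
    n = int(sample_rate * seconds)
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


# Hypothetical stand-ins for the two recordings (not real measurement data).
near = sine(amplitude=0.48)  # "5 cm" recording
far = sine(amplitude=0.10)   # "30 cm" recording

ratio = rms(near) / rms(far)
print(f"RMS ratio (near / far): {ratio:.1f}")  # 4.8
```

With real recordings, you would replace the synthetic tones with the sample values read from each audio file and compare the same stretch of audio in both.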

"Directional microphones" make speech recognition easier

Many microphones are more sensitive to sound from some directions than from others. This property is called directionality. Typically, the microphone picks up sound arriving from the front most strongly, and sound arriving from the side or back less strongly. By using this directionality appropriately, you can pick up the sound from the direction of your mouth strongly and the sound from other directions weakly, achieving the same effect as speaking close to the listener's ear as mentioned above.

(Reference) Types of directionality

There are several types of directionality, and I've listed three main ones. In each diagram, the top of the circle represents sound coming from directly in front, the right represents sound coming from 90 degrees to the right, and the bottom represents sound coming from directly behind. For example, the unidirectional microphone in the middle picks up sound most strongly from directly in front (top of the diagram) and least strongly from directly behind (bottom of the diagram). Directionality is listed in microphone catalogs and spec sheets, so it is worth checking when you compare products.
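As a rough illustration of what such diagrams encode, the sketch below evaluates textbook idealized polar patterns at 0, 90, and 180 degrees. Treating "unidirectional" as an ideal cardioid, and picking omnidirectional and bidirectional as the other two patterns, are my assumptions. Real microphones in real rooms deviate from these formulas.

```python
import math


def pattern_gain(pattern: str, angle_deg: float) -> float:
    """Relative sensitivity (1.0 = on-axis) of an idealized polar pattern."""
    theta = math.radians(angle_deg)
    if pattern == "omnidirectional":
        return 1.0                            # same sensitivity everywhere
    if pattern == "cardioid":                 # idealized unidirectional
        return 0.5 * (1.0 + math.cos(theta))
    if pattern == "bidirectional":            # figure-8: front and back
        return abs(math.cos(theta))
    raise ValueError(f"unknown pattern: {pattern}")


for pattern in ("omnidirectional", "cardioid", "bidirectional"):
    gains = ", ".join(
        f"{angle} deg: {pattern_gain(pattern, angle):.2f}"
        for angle in (0, 90, 180)
    )
    print(f"{pattern:16s} {gains}")
```

In this idealized model a cardioid picks up half the amplitude from the side and nothing from directly behind. Measured values are usually less extreme because of room reflections and the microphone's physical construction.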

Key Points

  • Microphones can have a directional characteristic.
  • Choosing a directional microphone can help reduce the amount of ambient noise picked up.
  • If you are in a low-noise environment, you don't need to worry too much about directionality.

[Experiment] How much does the volume of a sound change when the angle between the microphone and your mouth changes?

I conducted an experiment to see how the volume of sound changes depending on the angle between the microphone and the mouth when the microphone is directional. Using the same environment as the previous experiment, I fixed the distance between the microphone and the smartphone (sound source) at 5 cm and changed the angle to 0° (front), 90° (directly to the side), and 180° (directly behind). The results showed that the average waveform amplitude (sound pressure) at 0° was approximately 1.4 times that at 90° and approximately 3.2 times that at 180°.

It is unlikely that anyone would intentionally point the microphone the wrong way as in this experiment, but by taking advantage of directionality, you can make the microphone less susceptible to surrounding noise and prevent a decrease in speech recognition accuracy.

Key Points

  • Proper use of directionality can reduce ambient noise relative to your voice.

*The reading audio was played on an iPhone in the room and recorded with a microphone.
*A unidirectional headset was used as the microphone.
*The experiment was conducted without a person like the one in the illustration above, using only a sound source (iPhone) and a microphone (headset).
*The waveform magnitude values for comparison were calculated using RMS (root mean square).
*The results may vary depending on various conditions. For reference only.
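The measured ratios can also be read as noise rejection. The sketch below is a simplified illustration that assumes the voice arrives from the front and a single noise source of comparable strength arrives from the side or from behind. Under that assumption it converts the 1.4x and 3.2x ratios into the approximate SNR advantage in decibels.

```python
import math

# Measured in the experiment: front (0 deg) RMS relative to other angles.
front_to_side_ratio = 1.4   # 0 deg vs. 90 deg
front_to_back_ratio = 3.2   # 0 deg vs. 180 deg


def rejection_db(amplitude_ratio: float) -> float:
    """How many dB weaker an off-axis source is picked up than an on-axis one."""
    return 20.0 * math.log10(amplitude_ratio)


side_db = rejection_db(front_to_side_ratio)
back_db = rejection_db(front_to_back_ratio)

# Assumption: voice on-axis, one equally loud noise source off-axis.
print(f"Noise from the side is picked up about {side_db:.1f} dB weaker")  # 2.9 dB
print(f"Noise from behind is picked up about {back_db:.1f} dB weaker")    # 10.1 dB
```

Real noise usually arrives from many directions at once and includes reflections, so the actual benefit will differ. The point is only that aiming the microphone at your mouth gives the voice a measurable head start over off-axis noise.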

Other points to consider when choosing a microphone

In addition to the above, here are four other points to consider when choosing:

It may be best to avoid microphones that work on unusual operating principles or apply heavy signal processing.

  • The training data contains a lot of audio recorded with widely used microphones, so audio from similar microphones is likely to be recognized more easily.
  • For example, bone conduction microphones are resistant to noise, but speech recognition accuracy may be lower.
  • In addition, microphones with strong signal processing (noise reduction or processing that sharpens directionality) may produce audio that is easier for humans to hear, but they can sometimes reduce speech recognition accuracy, so caution is required.

It doesn't have to be a luxury item costing hundreds of thousands of yen.

  • High-end microphones and audio devices are overkill for speech recognition.
  • Using a more affordable product rather than a high-end one may produce audio closer to the training data, making it easier to recognize.
  • Be careful with microphones bundled with products where recording quality is not a priority, and with used or junk products.

Laptop microphones aren't necessarily bad, but they can be problematic

  • Speech recognition accuracy may be reduced because the microphone is too far from your mouth and because it easily picks up operating noise from fans and similar parts.

PC sound cards vary in quality (from my experience)

  • With microphones that connect to a PC via a 3.5mm stereo mini plug, the result depends on the quality of the PC's sound card, which varies. This can cause noise or make it impossible to adjust the volume properly.
  • In this case, using a USB-connected microphone device will often result in more stable quality.
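If you suspect the kind of noise or volume problem described above, one simple check is to record a short test clip and look at its levels. The sketch below is a minimal example that assumes a 16-bit PCM WAV file. The function names and the thresholds (0.99 for clipping, 0.01 for "very quiet") are arbitrary illustrative choices, not recommended values.

```python
import array
import math
import sys
import wave


def wav_levels(path):
    """Return (peak, rms) of a 16-bit PCM WAV file, each scaled to 0.0-1.0."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        raise ValueError("the file contains no audio samples")
    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    return peak, rms


def describe(peak, rms, clip_threshold=0.99, quiet_threshold=0.01):
    """Very rough verdict on the recording level (thresholds are assumptions)."""
    if peak >= clip_threshold:
        return "possibly clipped: lower the input gain"
    if rms < quiet_threshold:
        return "very quiet: raise the gain or move closer to the microphone"
    return "level looks reasonable"
```

For example, `describe(*wav_levels("test.wav"))` (with a hypothetical `test.wav` recorded on your own setup) gives a quick verdict. Recording a few seconds of silence and checking its RMS is a similar way to compare the noise floor of an analog input against a USB microphone.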

Conclusion: My personal recommendation for a speech recognition microphone is a USB headset.

To summarize what I've written so far, I recommend a microphone that sits securely near your mouth, is directional, and points at your mouth. Also, since a USB connection is less affected by variations in the quality of your PC's sound card, I recommend a "USB-connected headset."

However, this is only from the perspective of improving speech recognition accuracy, so depending on the application, a microphone other than a headset may be more suitable. To maximize speech recognition accuracy, choose the microphone that best suits your application from the wide variety available.

Next time, I will cover how to use a microphone.

About the author


  • Shogo Ando

    While researching speech recognition, I found a speech recognition company nearby and joined the company, where I continue to work to this day.

    My hobbies are traveling abroad, eating delicious food, and saunas.

    : @anpyan
