Snowman
0.1.0
Hotword detector class.
#include <snowboy-detect.h>
Public Member Functions
Member | Description |
---|---|
SnowboyDetect (const std::string &resource_filename, const std::string &model_str) | Constructor. |
bool Reset () | Resets the detection. |
int RunDetection (const std::string &data, bool is_end=false) | Runs hotword detection. |
int RunDetection (const float *const data, const int array_length, bool is_end=false) | Runs hotword detection on float samples. |
int RunDetection (const int16_t *const data, const int array_length, bool is_end=false) | Runs hotword detection on int16_t samples. |
int RunDetection (const int32_t *const data, const int array_length, bool is_end=false) | Runs hotword detection on int32_t samples. |
void SetSensitivity (const std::string &sensitivity_str) | Sets the sensitivity string for the loaded hotwords. |
void SetHighSensitivity (const std::string &high_sensitivity_str) | Sets the high sensitivity string for the loaded hotwords. |
std::string GetSensitivity () const | Returns the sensitivity string for the current hotwords. |
void SetAudioGain (const float audio_gain) | Applies a fixed gain to the input audio. |
void UpdateModel () const | Writes the models to the model filenames specified in <model_str> in the constructor. |
int NumHotwords () const | Returns the number of loaded hotwords. |
void ApplyFrontend (const bool apply_frontend) | Enables or disables the audio frontend (NS & AGC). |
int SampleRate () const | Returns the expected sample rate for audio provided to RunDetection(). |
int NumChannels () const | Returns the expected number of channels for audio provided to RunDetection(). |
int BitsPerSample () const | Returns the expected number of bits per sample for audio provided to RunDetection(). |
~SnowboyDetect () | Destructor. |
Friends
class testing::Inspector
Hotword detector class.
Provides a high-level, easy-to-use way to detect hotwords and perform voice activity detection.
snowboy::SnowboyDetect::SnowboyDetect (const std::string &resource_filename, const std::string &model_str)
Constructor.
Takes a resource file and a comma-separated list of hotword models. If the provided models contain more than one hotword, RunDetection() returns the index of the hotword that was triggered.
A personal model can contain only one hotword, but a universal model may contain several. It is your responsibility to work out the index of each hotword. For example, if your model string is "foo.pmdl,bar.umdl", where foo.pmdl contains hotword x and bar.umdl contains two hotwords y and z, the hotword indices are as follows:
x 1
y 2
z 3
[in] | resource_filename | Filename of resource file. |
[in] | model_str | A comma-separated list of hotword model filenames. |
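The index rule above can be sketched as a small bookkeeping helper. This is only an illustration: `ModelInfo`, `BuildModelString` and `HotwordNamesByIndex` are hypothetical names that are not part of the library, and the hotword names per model are assumed to be known to the caller.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type (not part of the library): one entry per model
// file, listing the hotwords the caller knows that model contains.
struct ModelInfo {
  std::string filename;
  std::vector<std::string> hotwords;
};

// Joins the model filenames with commas, in order, to form <model_str>.
std::string BuildModelString(const std::vector<ModelInfo>& models) {
  std::string out;
  for (std::size_t i = 0; i < models.size(); ++i) {
    if (i > 0) out += ',';
    out += models[i].filename;
  }
  return out;
}

// Flattens the hotwords in model order. Element k of the result corresponds
// to the hotword index k + 1, following the numbering described above.
std::vector<std::string> HotwordNamesByIndex(
    const std::vector<ModelInfo>& models) {
  std::vector<std::string> names;
  for (const ModelInfo& model : models) {
    for (const std::string& hotword : model.hotwords) {
      names.push_back(hotword);
    }
  }
  return names;
}
```

With the models from the example, `BuildModelString` yields "foo.pmdl,bar.umdl" and a detection result of 3 maps to element 2 of the name list, which is z.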
void snowboy::SnowboyDetect::ApplyFrontend (const bool apply_frontend)
Enable or disable audio frontend (NS & AGC).
If <apply_frontend> is true, frontend audio processing is applied; otherwise it is turned off. Frontend audio processing includes algorithms such as automatic gain control (AGC) and noise suppression (NS). Frontend processing generally improves detection performance, but it may reduce performance if the model was not trained with it. The general rule of thumb is:
[in] | apply_frontend | New frontend state |
int snowboy::SnowboyDetect::BitsPerSample () const
Returns the expected number of bits for audio provided to RunDetection().
std::string snowboy::SnowboyDetect::GetSensitivity () const
Returns the sensitivity string for the current hotwords.
int snowboy::SnowboyDetect::NumChannels () const
Returns the expected number of channels for audio provided to RunDetection().
int snowboy::SnowboyDetect::NumHotwords () const
Returns the number of the loaded hotwords.
This helps you work out the index of each hotword.
bool snowboy::SnowboyDetect::Reset ()
Resets the detection.
This class handles voice activity detection (VAD) internally. If you use an external VAD, call Reset() whenever your VAD signals the end of a segment.
int snowboy::SnowboyDetect::RunDetection (const float *const data, const int array_length, bool is_end=false)
Runs hotword detection on float samples.
If NumChannels() > 1, the samples are interleaved by channel. For example, with NumChannels() == 2 the array is laid out as follows:
d1c1, d1c2, d2c1, d2c2, d3c1, d3c2, ..., dNc1, dNc2
where d1c1 means data point 1 of channel 1.
[in] | data | Small chunk of data to be detected. |
[in] | array_length | Length of the data array in elements. |
[in] | is_end | Set it to true if it is the end of an utterance or file. |
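If your audio arrives as one buffer per channel, the interleaved layout described above can be produced with a helper along these lines. This is a minimal sketch; `Interleave` is a hypothetical name, not a library function, and truncating to the shortest channel is an assumption made here for safety.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): interleaves per-channel
// buffers into the order d1c1, d1c2, d2c1, d2c2, ...
// Channels of unequal length are truncated to the shortest one.
std::vector<float> Interleave(const std::vector<std::vector<float>>& channels) {
  if (channels.empty()) return {};
  std::size_t frames = channels[0].size();
  for (const std::vector<float>& channel : channels) {
    frames = std::min(frames, channel.size());
  }
  std::vector<float> out;
  out.reserve(frames * channels.size());
  for (std::size_t d = 0; d < frames; ++d) {        // data point index
    for (const std::vector<float>& channel : channels) {  // channel index
      out.push_back(channel[d]);
    }
  }
  return out;
}
```

The size of the returned vector is the length in elements, which is what array_length expects.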
int snowboy::SnowboyDetect::RunDetection (const int16_t *const data, const int array_length, bool is_end=false)
Runs hotword detection on int16_t samples.
If NumChannels() > 1, the samples are interleaved by channel. For example, with NumChannels() == 2 the array is laid out as follows:
d1c1, d1c2, d2c1, d2c2, d3c1, d3c2, ..., dNc1, dNc2
where d1c1 means data point 1 of channel 1.
[in] | data | Small chunk of data to be detected. |
[in] | array_length | Length of the data array in elements. |
[in] | is_end | Set it to true if it is the end of an utterance or file. |
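A chunked detection loop over a complete in-memory recording might look like the sketch below. It is written as a template so it does not depend on the library header: `Detector` is assumed to be any type with a member matching the int16_t overload documented here, and `DetectInChunks` is a hypothetical name. Treating the buffer as a whole utterance, and therefore setting is_end on the final chunk, is also an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Feeds `samples` to `detector` in chunks of at most `chunk_len` elements and
// returns every positive return value (a triggered hotword index), in order.
// `Detector` is assumed to expose
//   int RunDetection(const int16_t* data, int array_length, bool is_end);
template <typename Detector>
std::vector<int> DetectInChunks(Detector& detector,
                                const std::vector<std::int16_t>& samples,
                                std::size_t chunk_len) {
  std::vector<int> triggered;
  if (chunk_len == 0) return triggered;
  for (std::size_t pos = 0; pos < samples.size(); pos += chunk_len) {
    const std::size_t n = std::min(chunk_len, samples.size() - pos);
    // Assumption: the buffer holds one complete utterance, so the last
    // chunk is flagged as the end.
    const bool is_end = (pos + n == samples.size());
    const int code =
        detector.RunDetection(samples.data() + pos, static_cast<int>(n), is_end);
    if (code > 0) triggered.push_back(code);
  }
  return triggered;
}
```

Passing the length explicitly keeps the final, shorter chunk from reading past the end of the buffer.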
int snowboy::SnowboyDetect::RunDetection (const int32_t *const data, const int array_length, bool is_end=false)
Runs hotword detection on int32_t samples.
If NumChannels() > 1, the samples are interleaved by channel. For example, with NumChannels() == 2 the array is laid out as follows:
d1c1, d1c2, d2c1, d2c2, d3c1, d3c2, ..., dNc1, dNc2
where d1c1 means data point 1 of channel 1.
[in] | data | Small chunk of data to be detected. |
[in] | array_length | Length of the data array in elements. |
[in] | is_end | Set it to true if it is the end of an utterance or file. |
int snowboy::SnowboyDetect::RunDetection (const std::string &data, bool is_end=false)
Runs hotword detection.
The supported audio format is WAVE with linear PCM samples: 8-bit unsigned, 16-bit signed, or 32-bit signed integers. See SampleRate(), NumChannels() and BitsPerSample() for the required sampling rate, number of channels, and bits per sample. Provide a small chunk of data (e.g., 0.1 second) on each call to RunDetection(). Larger chunks usually increase latency but reduce CPU usage.
Return values are listed in the table below.
[in] | data | Small chunk of data to be detected. See above for the supported data format. |
[in] | is_end | Set it to true if it is the end of an utterance or file. |
Code | Info |
---|---|
-2 | Silence. |
-1 | Error. |
0 | No event. |
1 | Hotword 1 triggered. |
2 | Hotword 2 triggered. |
n | Hotword n triggered. |
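The return codes in the table can be turned into readable messages with a small function such as the following sketch. `DescribeResult` is a hypothetical name; values below -2 are not documented here, so they are reported as unknown.

```cpp
#include <string>

// Hypothetical helper (not part of the library): describes a RunDetection()
// return value using the codes from the table above.
std::string DescribeResult(int code) {
  if (code == -2) return "silence";
  if (code == -1) return "error";
  if (code == 0) return "no event";
  if (code > 0) return "hotword " + std::to_string(code) + " triggered";
  // Anything else is outside the documented range.
  return "unknown code " + std::to_string(code);
}
```

A caller would typically act only on positive values and log or ignore the rest.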
int snowboy::SnowboyDetect::SampleRate () const
Returns the expected sample rate for audio provided to RunDetection().
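The values returned by SampleRate(), NumChannels() and BitsPerSample() are enough to size the small chunks that RunDetection() expects. The sketch below shows the arithmetic only; `ChunkElements` and `ChunkBytes` are hypothetical helpers, and the byte count assumes the std::string overload carries raw PCM bytes with a whole number of bytes per sample.

```cpp
#include <cstddef>

// Number of array elements (samples across all channels) in `chunk_ms`
// milliseconds of audio. Returns 0 for non-positive inputs.
std::size_t ChunkElements(int sample_rate, int num_channels, int chunk_ms) {
  if (sample_rate <= 0 || num_channels <= 0 || chunk_ms <= 0) return 0;
  const std::size_t frames = static_cast<std::size_t>(sample_rate) *
                             static_cast<std::size_t>(chunk_ms) / 1000;
  return frames * static_cast<std::size_t>(num_channels);
}

// Number of bytes in the same chunk. Returns 0 if bits_per_sample is not a
// positive multiple of 8.
std::size_t ChunkBytes(int sample_rate, int num_channels, int bits_per_sample,
                       int chunk_ms) {
  if (bits_per_sample <= 0 || bits_per_sample % 8 != 0) return 0;
  return ChunkElements(sample_rate, num_channels, chunk_ms) *
         static_cast<std::size_t>(bits_per_sample / 8);
}
```

In real code the three format arguments would come from the detector instance rather than being hard-coded.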
void snowboy::SnowboyDetect::SetAudioGain (const float audio_gain)
Apply a fixed gain to the input audio.
If your microphone is very weak, use this function to boost the input audio level.
[in] | audio_gain | Gain to apply. A gain of 1 means no volume change. |
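To illustrate what a fixed gain means for integer samples, the sketch below scales 16-bit samples and saturates at the int16_t limits. It is not the library's implementation, which is not shown in this reference; `ApplyGain` is a hypothetical function and the rounding and clipping behavior are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical illustration (not the library's code): multiplies each sample
// by `gain`, rounds to the nearest integer, and clips to the int16_t range.
std::vector<std::int16_t> ApplyGain(const std::vector<std::int16_t>& in,
                                    float gain) {
  std::vector<std::int16_t> out;
  out.reserve(in.size());
  for (std::int16_t sample : in) {
    const float scaled = std::round(static_cast<float>(sample) * gain);
    const float clipped = std::min(32767.0f, std::max(-32768.0f, scaled));
    out.push_back(static_cast<std::int16_t>(clipped));
  }
  return out;
}
```

A gain of 1 leaves every sample unchanged, while large gains push loud samples into clipping, which is one reason to raise the gain cautiously.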
void snowboy::SnowboyDetect::SetHighSensitivity (const std::string &high_sensitivity_str)
Sets the high sensitivity string for the loaded hotwords.
Similar to the sensitivity set with SetSensitivity(). When the high sensitivity is set higher than the normal sensitivity, the algorithm automatically chooses between the two to maximize performance. By default it is not set, which means the algorithm uses only the normal sensitivity.
[in] | high_sensitivity_str | List of sensitivity values. |
void snowboy::SnowboyDetect::SetSensitivity (const std::string &sensitivity_str)
Sets the sensitivity string for the loaded hotwords.
A <sensitivity_str> is a comma-separated list of floating-point numbers between 0 and 1. For example, if there are 3 loaded hotwords, your string should look something like this:
0.4,0.5,0.8
Make sure each sensitivity value is aligned with its corresponding hotword.
[in] | sensitivity_str | List of sensitivity values. |
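Building the string from numeric values avoids formatting mistakes. The following is a minimal sketch; `BuildSensitivityString` is a hypothetical helper, and rejecting values outside 0 to 1 with an exception is a design choice made here, not documented library behavior.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): joins per-hotword
// sensitivities into a comma-separated string, one value per hotword, in
// hotword index order. Throws if a value is outside [0, 1] or is NaN.
std::string BuildSensitivityString(const std::vector<double>& values) {
  std::ostringstream out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (!(values[i] >= 0.0 && values[i] <= 1.0)) {
      throw std::invalid_argument("sensitivity must be between 0 and 1");
    }
    if (i > 0) out << ',';
    out << values[i];
  }
  return out.str();
}
```

The number of values should equal NumHotwords() so that each sensitivity lines up with its hotword.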
void snowboy::SnowboyDetect::UpdateModel () const
Writes the models to the model filenames specified in <model_str> in the constructor.
This overwrites the original model files with the latest parameter settings. Call this function if you have updated the hotword sensitivities through SetSensitivity() and want to store those values in the model as the defaults.