Voice Command Lifecycle
Python library to manage the life cycle of voice commands. Useful when working with the Alexa Voice Service.
Installation
pip install command_lifecycle
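Wakeword detector
A wakeword is a specific word that triggers the code to spring into action. It allows your code to stay idle until that word is uttered.
The audio lifecycle uses snowboy to determine whether the wakeword was uttered. That library must be installed first.
Once you have compiled snowboy, copy the compiled snowboy folder to the top level of your project. By default, the folder structure should be: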
.
├── ...
├── snowboy
|   ├── snowboy-detect-swig.cc
|   ├── snowboydetect.py
|   └── resources
|       ├── alexa.umdl
|       └── common.res
└── ...
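If the default structure does not suit your needs, you can customize the wakeword detector.
Usage
You should send a steady stream of audio to the lifecycle by repeatedly calling lifecycle.extend_audio(some_audio_bytes). If a wakeword such as "Alexa" (the default) or "ok, Google" is uttered, handle_command_started is called. handle_command_finised is then called once the command audio that followed the wakeword has finished.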
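The callback order described above can be pictured with a small stand-in. This is a toy sketch, not the library's implementation: ToyLifecycle, its byte-marker wakeword check, and the silent_chunks_allowed threshold are all hypothetical and exist only to show when each handler fires as audio chunks arrive.

```python
class ToyLifecycle:
    """Hypothetical stand-in that mimics the callback order only."""

    WAKEWORD_MARKER = b'ALEXA'   # assumption: real detection is done by snowboy
    silent_chunks_allowed = 2    # assumption: the real timeout is time based

    def __init__(self):
        self.events = []
        self.is_command_pending = False
        self.silent_chunks = 0

    def extend_audio(self, chunk):
        if not self.is_command_pending:
            # Idle: only look for the wakeword.
            if self.WAKEWORD_MARKER in chunk:
                self.is_command_pending = True
                self.silent_chunks = 0
                self.handle_command_started('ALEXA')
            return
        # A chunk of all-zero bytes stands in for silence.
        if any(chunk):
            self.silent_chunks = 0
        else:
            self.silent_chunks += 1
        if self.silent_chunks >= self.silent_chunks_allowed:
            self.is_command_pending = False
            self.handle_command_finised()

    def handle_command_started(self, wakeword_name):
        self.events.append(('started', wakeword_name))

    def handle_command_finised(self):
        self.events.append('finished')


def run_toy_demo():
    toy = ToyLifecycle()
    chunks = [b'\x00' * 4, b'..ALEXA..', b'what time', b'\x00' * 4, b'\x00' * 4]
    for chunk in chunks:
        toy.extend_audio(chunk)
    return toy.events
```

In this toy run the started handler fires on the second chunk and the finished handler fires only after two consecutive silent chunks follow the command.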
Microphone audio
import pyaudio

import command_lifecycle


class AudioLifecycle(command_lifecycle.BaseAudioLifecycle):

    def handle_command_started(self, wakeword_name):
        super().handle_command_started(wakeword_name)
        print(f'The audio contained {wakeword_name}!')

    def handle_command_finised(self):
        super().handle_command_finised()
        print('The command has finished')


lifecycle = AudioLifecycle()

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True)

try:
    print('listening. Start by saying "Alexa". Press CTRL + C to exit.')
    while True:
        lifecycle.extend_audio(stream.read(1024))
finally:
    stream.stop_stream()
    stream.close()
    p.terminate()
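For orientation, the stream settings above determine how much audio each extend_audio call receives. The calculation below assumes only the values shown in the example (16-bit mono samples at 16000 Hz, 1024 frames per read); the helper name chunk_stats is made up for illustration.

```python
def chunk_stats(frames_per_read=1024, rate=16000, channels=1, sample_width=2):
    """Return (bytes per read, seconds of audio per read)."""
    size_in_bytes = frames_per_read * channels * sample_width
    seconds = frames_per_read / rate
    return size_in_bytes, seconds


size_in_bytes, seconds = chunk_stats()
print(f'{size_in_bytes} bytes, {seconds * 1000:.0f} ms per read')
```

With these settings each read delivers 2048 bytes, or 64 ms of audio.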
File audio
import wave

import command_lifecycle


class AudioLifecycle(command_lifecycle.BaseAudioLifecycle):

    def handle_command_started(self, wakeword_name):
        super().handle_command_started(wakeword_name)
        print(f'The audio contained {wakeword_name}!')


lifecycle = AudioLifecycle()
with wave.open('./tests/resources/alexa_what_time_is_it.wav', 'rb') as f:
    while f.tell() < f.getnframes():
        lifecycle.extend_audio(f.readframes(1024))

print('The command has finished')
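The chunked read loop used above can be tried without the library or a recording. The sketch below uses only the standard library; make_silent_wav and read_in_chunks are illustrative helper names, and the in-memory file of silence is a stand-in for a real recording.

```python
import io
import wave


def make_silent_wav(n_frames, rate=16000):
    """Build an in-memory mono 16-bit wav containing silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(b'\x00\x00' * n_frames)
    buffer.seek(0)
    return buffer


def read_in_chunks(wav_file, frames_per_chunk=1024):
    """Yield raw frame bytes until the file is exhausted."""
    with wave.open(wav_file, 'rb') as f:
        while f.tell() < f.getnframes():
            yield f.readframes(frames_per_chunk)


chunks = list(read_in_chunks(make_silent_wav(2500)))
print([len(chunk) for chunk in chunks])
```

The final chunk is shorter than the others whenever the frame count is not a multiple of the chunk size, so code receiving the chunks should not assume a fixed length.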
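Usage with Alexa
command_lifecycle is useful for interacting with voice services. The lifecycle waits until a wakeword is uttered, then starts streaming the audio command to the voice service (using Alexa Voice Service Client), then does something useful with the response: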
from avs_client.avs_client.client import AlexaVoiceServiceClient
import pyaudio

import command_lifecycle


class AudioLifecycle(command_lifecycle.BaseAudioLifecycle):
    alexa_client = AlexaVoiceServiceClient(
        client_id='my-client-id',
        secret='my-secret',
        refresh_token='my-refresh-token',
    )

    def __init__(self):
        self.alexa_client.connect()
        super().__init__()

    def handle_command_started(self, wakeword_name):
        super().handle_command_started(wakeword_name)
        audio_file = command_lifecycle.to_audio_file()
        for directive in self.alexa_client.send_audio_file(audio_file):
            # do something with the AVS audio response, e.g., play it.
            pass


lifecycle = AudioLifecycle()

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True)

try:
    print('listening. Start by saying "Alexa". Press CTRL + C to exit.')
    while True:
        lifecycle.extend_audio(stream.read(1024))
finally:
    stream.stop_stream()
    stream.close()
    p.terminate()
Customization
Wakeword
The default wakeword is "Alexa". This can be changed by subclassing command_lifecycle.wakeword.SnowboyWakewordDetector:
from command_lifecycle import lifecycle, wakeword


class MySnowboyWakewordDetector(wakeword.SnowboyWakewordDetector):
    decoder_models = [
        {
            'name': 'CUSTOM',
            'model': b'path/to/custom-wakeword-model.umdl',
            'sensitivity': b'0.5',
        }
    ]


class AudioLifecycle(lifecycle.BaseAudioLifecycle):
    audio_detector_class = MySnowboyWakewordDetector

    def handle_command_started(self, wakeword_name):
        super().handle_command_started(wakeword_name)
        print(f'The audio contained the {wakeword_name}!')

    def handle_command_finised(self):
        super().handle_command_finised()
        print('The command has finished')


lifecycle = AudioLifecycle()
# now load the audio into lifecycle
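See the Snowboy docs for steps on creating custom wakeword models.
Multiple wakewords
Triggering different behaviour for different wakewords may be desirable. To do this, use multiple items in decoder_models: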
from command_lifecycle import lifecycle, wakeword


class MyMultipleWakewordDetector(wakeword.SnowboyWakewordDetector):
    GOOGLE = 'GOOGLE'

    # keep the default models and add one more
    decoder_models = wakeword.SnowboyWakewordDetector.decoder_models + [
        {
            'name': GOOGLE,
            'model': b'path/to/okgoogle.umdl',
            'sensitivity': b'0.5',
        }
    ]


class AudioLifecycle(lifecycle.BaseAudioLifecycle):
    audio_detector_class = MyMultipleWakewordDetector

    def handle_command_started(self, wakeword_name):
        if wakeword_name == self.audio_detector.ALEXA:
            print('Alexa standing by')
        elif wakeword_name == self.audio_detector.GOOGLE:
            print('Google at your service')
        super().handle_command_started(wakeword_name)
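You can download wakewords from here.
Wakeword detector
Snowboy is the default wakeword detector. Other wakeword detectors can be used by subclassing command_lifecycle.wakeword.BaseWakewordDetector and setting audio_detector_class on the lifecycle to your custom class: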
from command_lifecycle import lifecycle, wakeword


class MyCustomWakewordDetector(wakeword.BaseWakewordDetector):
    import_error_message = 'Cannot import wakeword library!'
    wakeword_library_import_path = 'path.to.wakeword.Library'

    def was_wakeword_uttered(self, buffer):
        # use the library to check if the audio in the buffer has the wakeword.
        # note `buffer.get()` returns the audio inside the buffer.
        ...

    def is_talking(self, buffer):
        # use the library to check if the audio in the buffer has audible words
        # note `buffer.get()` returns the audio inside the buffer.
        ...


class AudioLifecycle(lifecycle.BaseAudioLifecycle):
    audio_detector_class = MyCustomWakewordDetector
    ...


lifecycle = AudioLifecycle()
# now load the audio into lifecycle
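As a rough idea of what an is_talking check could do when no dedicated library is at hand, the sketch below applies a plain energy threshold. It is an assumption-heavy illustration, not how snowboy or this library decides: it assumes the buffered audio is raw little-endian 16-bit mono PCM, and rms_of_pcm16, looks_like_talking, and the threshold of 500 are hypothetical. A real detector would normally use proper voice activity detection.

```python
import math
import struct


def rms_of_pcm16(audio_bytes):
    """Root-mean-square amplitude of little-endian 16-bit mono PCM."""
    sample_count = len(audio_bytes) // 2
    if sample_count == 0:
        return 0.0
    samples = struct.unpack(f'<{sample_count}h', audio_bytes[:sample_count * 2])
    return math.sqrt(sum(s * s for s in samples) / sample_count)


def looks_like_talking(audio_bytes, threshold=500):
    """Hypothetical check: treat audio louder than `threshold` as speech."""
    return rms_of_pcm16(audio_bytes) > threshold


# 10 ms of silence and 10 ms of a 440 Hz tone at 16 kHz, for comparison.
silence = struct.pack('<160h', *([0] * 160))
tone = struct.pack(
    '<160h',
    *(int(8000 * math.sin(2 * math.pi * 440 * i / 16000)) for i in range(160))
)
print(looks_like_talking(silence), looks_like_talking(tone))
```

Inside a custom detector, a function like this could be called with the bytes returned by buffer.get(), provided the buffer really holds audio in the assumed format.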
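Handling input data
Three input data formats are supported:
| Converter | Notes |
|---|---|
| NoOperationConverter | Default. Input data is already wav bytes. |
| WavIntSamplestoWavConverter | Input data is a list of integers. |
| WebAudioToWavConverter | Input data is a list of floats generated by a web browser. |
Select a converter by setting the lifecycle's audio_converter_class: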
from command_lifecycle import lifecycle
from command_lifecycle.helpers import WebAudioToWavConverter


class AudioLifecycle(lifecycle.BaseAudioLifecycle):
    audio_converter_class = WebAudioToWavConverter
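To make the conversion step more concrete, the sketch below shows one way a list of browser-style floats could be turned into wav bytes using only the standard library. It is not the code behind WebAudioToWavConverter: floats_to_wav_bytes is a hypothetical helper, and the mono, 16-bit, 16000 Hz output format is an assumption chosen to match the microphone example.

```python
import io
import struct
import wave


def floats_to_wav_bytes(samples, rate=16000):
    """Convert floats in [-1.0, 1.0] to a mono 16-bit wav byte string."""
    # Clip out-of-range values so they cannot overflow a 16-bit sample.
    clipped = (max(-1.0, min(1.0, sample)) for sample in samples)
    ints = [int(sample * 32767) for sample in clipped]
    buffer = io.BytesIO()
    with wave.open(buffer, 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(struct.pack(f'<{len(ints)}h', *ints))
    return buffer.getvalue()


wav_bytes = floats_to_wav_bytes([0.0, 0.5, -1.0, 2.0])
```

Clipping before scaling is the design choice worth noting: a stray value such as 2.0 would otherwise fail to pack as a 16-bit integer.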
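Expecting slower or faster commands
The person giving the audio command might take a moment to collect their thoughts before finishing the command. This silence could be interpreted as the command ending, causing handle_command_finised to be called prematurely.
To avoid this, the lifecycle tolerates some silence in the command before the command lifecycle times out. The silence can occur at the beginning or in the middle of the command. Note that a side effect is a pause between the person finishing talking and handle_command_finised being called.
To change this default behaviour, set timeout_manager_class on the lifecycle. The available timeout managers are:
| Timeout manager | Notes |
|---|---|
| ShortTimeoutManager | Allows one second of silence. |
| MediumTimeoutManager | Default. Allows two seconds of silence. |
| LongTimeoutManager | Allows three seconds of silence. |
To make a custom timeout manager, create a subclass of command_lifecycle.timeout.BaseTimeoutManager: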
from command_lifecycle import lifecycle, timeout


class MyCustomTimeoutManager(timeout.BaseTimeoutManager):
    allowed_silent_seconds = 4


class AudioLifecycle(lifecycle.BaseAudioLifecycle):
    timeout_manager_class = MyCustomTimeoutManager
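The effect of allowed_silent_seconds can be pictured with a toy model. The sketch below is not the library's BaseTimeoutManager: ToySilenceTimeout, its reset and has_timed_out methods, and the FakeClock helper are hypothetical names used only to show how a silence budget behaves when speech resumes before it runs out.

```python
import time


class ToySilenceTimeout:
    """Hypothetical model of a silence timeout; not the library's class."""

    def __init__(self, allowed_silent_seconds, clock=time.monotonic):
        self.allowed_silent_seconds = allowed_silent_seconds
        self.clock = clock
        self.last_heard_at = clock()

    def reset(self):
        """Call whenever speech is heard."""
        self.last_heard_at = self.clock()

    def has_timed_out(self):
        return self.clock() - self.last_heard_at >= self.allowed_silent_seconds


class FakeClock:
    """Manually advanced clock so the example needs no real waiting."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
timeout = ToySilenceTimeout(allowed_silent_seconds=2, clock=clock)
clock.now = 1.5
after_short_pause = timeout.has_timed_out()
timeout.reset()  # speech heard again at 1.5 s
clock.now = 3.0
after_resumed_speech = timeout.has_timed_out()
clock.now = 3.5
after_long_silence = timeout.has_timed_out()
```

A larger budget makes the lifecycle more forgiving of pauses, at the cost of a longer wait before the command is treated as finished.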
Unit tests
To run the unit tests, call the following commands:
make test_requirements
make test
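Versioning
We use SemVer for versioning. For the versions available, see PyPI.
Other projects
This library is used by alexa-browser-client, which allows you to talk to Alexa from your browser.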