
2022-10-08 10:22:33 Hot Topics

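Speech recognition is an important feature in home automation, artificial intelligence, and other applications. This article shows how to use the Python speech recognition library.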

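Required components

The following components must be installed:

1) The Python speech recognition module (the SpeechRecognition package, imported in the programs below as speech_recognition).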

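2) PyAudio: Linux users can install it with the following command.

sudo apt-get install python-pyaudio python3-pyaudio

If the version in the repositories is too old, install PyAudio with the following commands instead.

sudo apt-get install portaudio19-dev python-all-dev python3-all-dev
sudo pip install pyaudio

For Python 3, use pip3 instead of pip.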

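Windows users can install PyAudio by running the following command in a terminal.

pip install pyaudio

Speech input using a microphone and translation of speech to text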
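Before working through the steps below, it may be worth confirming that both required packages can actually be imported. The following is only a minimal sketch, assuming Python 3; the helper name check_packages is made up for this illustration and is not part of either library.

```python
# Sketch: report which of the required packages can be imported.
# Assumes Python 3 (importlib.util). check_packages is a hypothetical helper.
import importlib.util


def check_packages(names=("speech_recognition", "pyaudio")):
    """Return a dict mapping each module name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(name, "OK" if found else "missing")
```

A module reported as missing would need to be installed before the speech recognition programs can run.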

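1) Configure the microphone (for external microphones): It is advisable to specify the microphone when the program runs, to avoid glitches.

Type lsusb in the terminal. A list of connected devices will be displayed. The microphone name will look like this:

USB Device 0x46d:0x825: Audio (hw:1, 0)

Make a note of this name, as it will be used in the program.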

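2) Set the chunk size: This specifies how much audio data (how many samples) is read at a time. The value is usually a power of 2, such as 1024 or 2048.

3) Set the sampling rate: The sampling rate defines how often values are recorded for processing.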

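4) Set the device ID to the selected microphone: In this step we specify the device ID of the microphone, to avoid ambiguity when there are several microphones. This also helps with debugging: when the program runs, we will know whether the specified microphone has been recognized. In the program we pass the parameter device_id. If the microphone is not recognized, the program reports that device_id cannot be found.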

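5) Allow adjustment for ambient noise: Because the surrounding noise varies, the program must wait a second or so to adjust the energy threshold of the recording so that it matches the external noise level.

6) Speech to text translation: This is done with the help of Google Speech Recognition, which requires an active internet connection. There are offline recognition systems as well, such as PocketSphinx, but their installation is demanding and requires several dependencies. Google Speech Recognition is the easiest to use.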
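Steps 2 to 4 can be illustrated without any audio hardware. The sketch below is only an assumption-laden illustration: it assumes chunk_size counts samples, and the helper names chunk_duration_ms and find_device_index, as well as the first entry of the example list, are hypothetical. It computes how much audio one chunk holds and looks up a microphone by name, raising a clear error instead of leaving the ID undefined.

```python
# Sketch of steps 2-4: chunk timing and device lookup.
# Both helpers are hypothetical and not part of speech_recognition.

def chunk_duration_ms(chunk_size, sample_rate):
    """Milliseconds of audio held by one chunk of `chunk_size` samples."""
    return 1000.0 * chunk_size / sample_rate


def find_device_index(mic_names, wanted_name):
    """Return the index of the first exact match for wanted_name.

    Raises ValueError when no microphone with that name is present.
    """
    for index, name in enumerate(mic_names):
        if name == wanted_name:
            return index
    raise ValueError("microphone not found: %r" % wanted_name)


if __name__ == "__main__":
    # Made-up list; in the real program it would come from
    # sr.Microphone.list_microphone_names().
    names = [
        "Built-in Audio Analog (hw:0, 0)",
        "USB Device 0x46d:0x825: Audio (hw:1, 0)",
    ]
    print(find_device_index(names, "USB Device 0x46d:0x825: Audio (hw:1, 0)"))
    print(chunk_duration_ms(2048, 48000))
```

With the values used in this article (2048 samples at 48000 Hz), one chunk covers roughly 43 milliseconds of audio.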

The steps above are implemented as follows:

# Python program for speech recognition

import speech_recognition as sr

# Enter the name of the USB microphone that you found using lsusb.
# The following name is only used as an example.
mic_name = "USB Device 0x46d:0x825: Audio (hw:1, 0)"
# The sample rate is how often values are recorded.
sample_rate = 48000
# The chunk is like a buffer. It stores 2048 samples here.
# It is advisable to use powers of 2 such as 1024 or 2048.
chunk_size = 2048
# Initialize the recognizer.
r = sr.Recognizer()

# Generate a list of all audio cards/microphones.
mic_list = sr.Microphone.list_microphone_names()

# The following loop sets the device ID of the microphone that we
# specifically want to use, to avoid ambiguity.
for i, microphone_name in enumerate(mic_list):
    if microphone_name == mic_name:
        device_id = i

# Use the microphone as the input source. We also specify which device ID
# to look for. If the microphone was not found, a NameError will report
# that device_id is not defined.
with sr.Microphone(device_index=device_id, sample_rate=sample_rate,
                   chunk_size=chunk_size) as source:
    # Wait for a second to let the recognizer adjust the
    # energy threshold based on the surrounding noise level.
    r.adjust_for_ambient_noise(source)
    print("Say Something")
    # Listen for the user's input.
    audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("you said: " + text)

    # This error occurs when Google could not understand what was said.
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")

    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

Transcribe an audio file to text

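If we have an audio file that we want to convert to text, we simply replace the source with the audio file instead of a microphone.

For convenience, place the audio file and the program in the same folder. This works for WAV, AIFF, and FLAC files.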
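Since only these three formats are mentioned, a quick check of the file name before transcribing can catch an unsupported file early. This is just a rough sketch: it looks at the extension only, not at the file's actual encoding, and the names SUPPORTED_EXTENSIONS and looks_supported are invented for this example.

```python
# Sketch: extension-based check for the formats named above.
# This does not inspect the file contents; it is only a heuristic.
import os

SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".flac"}


def looks_supported(path):
    """Return True if the file extension is one of WAV, AIFF, or FLAC."""
    extension = os.path.splitext(path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in ("example.wav", "clip.FLAC", "song.mp3"):
        print(candidate, looks_supported(candidate))
```

A file with another extension, such as MP3, would need to be converted to one of the listed formats first.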

The implementation is shown below:

# Python program to transcribe an audio file

import speech_recognition as sr

AUDIO_FILE = "example.wav"

# Use the audio file as the audio source.
r = sr.Recognizer()

with sr.AudioFile(AUDIO_FILE) as source:
    # Read the audio file. Here we use record instead of listen.
    audio = r.record(source)

try:
    print("The audio file contains: " + r.recognize_google(audio))

except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")

except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

Troubleshooting

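The following problems are commonly encountered:

1) Muted microphone: This means that no input is received. To check for this, you can use alsamixer.

It can be installed using

sudo apt-get install libasound2 alsa-utils alsa-oss

Type amixer. The output will look something like this: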

Simple mixer control 'Master', 0
  Capabilities: pvolume pswitch pswitch-joined
  Playback channels: Front Left - Front Right
  Limits: Playback 0 - 65536
  Mono:
  Front Left: Playback 41855 <64%>
  Front Right: Playback 65536 <100%>
Simple mixer control 'Capture', 0
  Capabilities: cvolume cswitch cswitch-joined
  Capture channels: Front Left - Front Right
  Limits: Capture 0 - 65536
  Front Left: Capture 0 <0%> #switched off
  Front Right: Capture 0 <0%>

As you can see, the capture device is currently switched off. To switch it on, type alsamixer.
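The same check can be scripted by reading the amixer text. The sketch below is an illustration only: the function names capture_levels and capture_is_off are hypothetical, the embedded SAMPLE mirrors the output shown in this article, and a level of 0% on every channel is treated as "off", as the sample does. Real amixer output may differ in detail, so the pattern accepts both angle and square brackets around the percentage.

```python
# Sketch: read the Capture levels out of amixer-style text.
# capture_levels and capture_is_off are hypothetical helpers.
import re

SAMPLE = """Simple mixer control 'Master', 0
  Capabilities: pvolume pswitch pswitch-joined
  Playback channels: Front Left - Front Right
  Limits: Playback 0 - 65536
  Mono:
  Front Left: Playback 41855 <64%>
  Front Right: Playback 65536 <100%>
Simple mixer control 'Capture', 0
  Capabilities: cvolume cswitch cswitch-joined
  Capture channels: Front Left - Front Right
  Limits: Capture 0 - 65536
  Front Left: Capture 0 <0%> #switched off
  Front Right: Capture 0 <0%>
"""

# Matches lines such as "Front Left: Capture 0 <0%>".
LEVEL_PATTERN = re.compile(r"(.+?): Capture \d+ [\[<](\d+)%[\]>]")


def capture_levels(amixer_output):
    """Return {channel: percent} for the 'Capture' control."""
    levels = {}
    in_capture = False
    for line in amixer_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Simple mixer control"):
            # Track whether we are inside the Capture control's section.
            in_capture = "'Capture'" in stripped
            continue
        if in_capture:
            match = LEVEL_PATTERN.match(stripped)
            if match:
                levels[match.group(1)] = int(match.group(2))
    return levels


def capture_is_off(levels):
    """True when at least one channel was found and all are at 0%."""
    return bool(levels) and all(value == 0 for value in levels.values())


if __name__ == "__main__":
    found = capture_levels(SAMPLE)
    print(found)
    print("capture off:", capture_is_off(found))
```

To use it on a live system, the text returned by running amixer could be passed to capture_levels in place of SAMPLE.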

As you can see in the first picture, it is displaying our playback devices. Press F4 to switch to the capture devices.