Calling the Baidu Speech Recognition API from Python

I. Prerequisites

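Before using the Baidu speech recognition API, complete the following steps:

1. Register an account on the Baidu AI Open Platform, create an application, and obtain its API Key and Secret Key.

2. Set up a Python development environment and install the requests and pyaudio libraries. requests sends the HTTP requests; pyaudio is used for recording, so it is only needed if you record the audio yourself (the examples below read an existing audio.wav file). Run the following commands on the command line: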

pip install requests
pip install pyaudio

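3. Create a config.py file in the project root directory to store the API Key and Secret Key, and add this file to .gitignore so the keys are not pushed to the Git repository.

The contents of config.py are as follows: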

APP_ID = 'your_app_id'
API_KEY = 'your_api_key'
SECRET_KEY = 'your_secret_key'
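
As an alternative to keeping the keys in a file, they can be read from environment variables. The following is only a sketch: the function name load_credentials and the variable names BAIDU_APP_ID, BAIDU_API_KEY and BAIDU_SECRET_KEY are assumptions made for illustration, not names defined by Baidu.

```python
import os


def load_credentials(env=os.environ):
    """Return (app_id, api_key, secret_key) read from environment variables.

    The variable names below are hypothetical; use whatever names fit
    your own deployment.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    # Collect every missing or empty variable so the error lists them all.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

With this approach config.py could simply contain APP_ID, API_KEY, SECRET_KEY = load_credentials(), so the rest of the code stays unchanged.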

II. Code implementation

1. Online recognition with the Baidu speech recognition API

import requests
import json
import base64
from config import APP_ID, API_KEY, SECRET_KEY

# API endpoint
url = 'http://vop.baidu.com/server_api'

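# Read the audio file, keeping the raw bytes so their length can be reported,
# then Base64-encode the content
with open('audio.wav', 'rb') as f:
    audio_data = f.read()
speech_data = base64.b64encode(audio_data).decode('utf-8')

# Build the request parameters
data = {
    'format': 'wav',
    'rate': 16000,
    'channel': 1,
    'cuid': '123456',
    'token': 'your_token',  # placeholder; overwritten with the real access token below
    'lan': 'zh',
    'speech': speech_data,
    'len': len(audio_data)  # byte count of the raw audio, not of the Base64 string
}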
headers = {
    'Content-Type': 'application/json'
}

# Obtain an access token, then send the recognition request
access_token_url = 'https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id={0}&' \
    'client_secret={1}'.format(API_KEY, SECRET_KEY)
response = requests.post(access_token_url)
access_token = json.loads(response.text)['access_token']
data['token'] = access_token
response = requests.post(url, data=json.dumps(data), headers=headers)

# Parse the recognition result from the decoded JSON body
body = response.json()
if 'result' in body:
    result = body['result'][0]
    print('Recognition result:', result)
else:
    print('Recognition failed')
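
The request parameters declare 'format': 'wav', 'rate': 16000 and 'channel': 1, so the file on disk should actually match those values. A minimal sketch of such a check, using the standard-library wave module, might look like this; the function name check_wav is an assumption introduced here for illustration.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of mismatches between a WAV header and the expected format.

    An empty list means the sample rate and channel count both match.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append("sample rate is %d, expected %d" % (rate, expected_rate))
    if channels != expected_channels:
        problems.append("channel count is %d, expected %d" % (channels, expected_channels))
    return problems
```

Calling check_wav('audio.wav') before sending the request and printing any returned messages can save a round trip when the recording was made with different settings.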

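The online recognition script uses the requests library to send a POST request with a JSON body. speech_data holds the Base64-encoded content of the audio file. The token request parameter must first be obtained with the API Key and Secret Key, as follows: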

access_token_url = 'https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id={0}&' \
    'client_secret={1}'.format(API_KEY, SECRET_KEY)
response = requests.post(access_token_url)
access_token = json.loads(response.text)['access_token']
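
The token lookup above raises a bare KeyError if the keys are wrong, because the response then contains no access_token. The sketch below separates building the query parameters from reading the response so that a failure produces a readable message. The helper names are assumptions introduced here, and the 'error' and 'error_description' fields are assumed to follow the usual OAuth 2.0 error format rather than taken from the article.

```python
def build_token_params(api_key, secret_key):
    """Return the query parameters for the client_credentials token request."""
    return {
        'grant_type': 'client_credentials',
        'client_id': api_key,
        'client_secret': secret_key,
    }


def extract_access_token(body):
    """Return the access token from a decoded token response, or raise."""
    token = body.get('access_token')
    if token:
        return token
    # Assumed OAuth-style error fields; fall back to a generic message.
    detail = (body.get('error_description')
              or body.get('error')
              or 'no access_token in response')
    raise RuntimeError('token request failed: %s' % detail)
```

A possible use, with the same endpoint as above: send requests.post('https://openapi.baidu.com/oauth/2.0/token', params=build_token_params(API_KEY, SECRET_KEY)) and pass response.json() to extract_access_token.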

2. Offline speech recognition using a local file

import requests
import json
import base64  # needed for base64.b64encode when building the request
from config import APP_ID, API_KEY, SECRET_KEY

# API endpoint
url = 'http://vop.baidu.com/server_api'

# Exchange the API Key and Secret Key for an access token
auth_params = {
    'grant_type': 'client_credentials',
    'client_id': API_KEY,
    'client_secret': SECRET_KEY
}
response = requests.get('https://openapi.baidu.com/oauth/2.0/token', params=auth_params)
access_token = json.loads(response.text)['access_token']

# Read the audio file
with open('audio.wav', 'rb') as f:
    audio_file = f.read()

# Build the request parameters
data = {
    'format': 'wav',
    'rate': 16000,
    'channel': 1,
    'cuid': '123456',
    'token': access_token,
    'lan': 'zh',
    'len': len(audio_file),
    'speech': base64.b64encode(audio_file).decode('utf-8')
}
headers = {'Content-Type': 'application/json'}

# Send the request
response = requests.post(url, data=json.dumps(data), headers=headers)
result = json.loads(response.text)
if 'result' in result:
    print(result['result'][0])
else:
    print('Recognition failed')
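
When recognition fails, the script above prints only a generic message. The following sketch reports what the server returned instead. The function name is an assumption introduced here, and so are the 'err_no' and 'err_msg' field names: the article does not show a failed response, so check them against an actual reply before relying on them.

```python
def parse_recognition_response(body):
    """Return the first transcript from a decoded response, or raise.

    'result' is the list used in the script above; 'err_no' and 'err_msg'
    are assumed error fields and may be absent.
    """
    results = body.get('result')
    if results:
        return results[0]
    raise RuntimeError(
        'recognition failed: err_no=%s err_msg=%s'
        % (body.get('err_no'), body.get('err_msg'))
    )
```

It could replace the final if/else of the script with print(parse_recognition_response(result)).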

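In the offline recognition example, the script first uses the API Key and Secret Key to obtain an access_token, just as the online example does, and then posts the local file to the same endpoint with the same request parameters. The audio file used for offline recognition must not exceed 10 MB.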
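
The request parameters and the size limit can be gathered in one helper. This is a sketch under stated assumptions: the name build_payload is introduced here for illustration, and the 10M limit is read as 10 * 1024 * 1024 bytes, which may not be the exact threshold the service applies. As in the second script, 'len' is the byte count of the raw audio and 'speech' is its Base64 encoding.

```python
import base64

# The 10M limit mentioned above, interpreted here as 10 MiB (an assumption).
MAX_AUDIO_BYTES = 10 * 1024 * 1024


def build_payload(audio_bytes, token, cuid='123456', audio_format='wav', rate=16000):
    """Return the JSON-serializable request body for one audio clip."""
    if len(audio_bytes) > MAX_AUDIO_BYTES:
        raise ValueError(
            'audio is %d bytes, over the %d byte limit'
            % (len(audio_bytes), MAX_AUDIO_BYTES)
        )
    return {
        'format': audio_format,
        'rate': rate,
        'channel': 1,
        'cuid': cuid,
        'token': token,
        'lan': 'zh',
        'len': len(audio_bytes),  # raw byte count, before Base64 encoding
        'speech': base64.b64encode(audio_bytes).decode('utf-8'),
    }
```

The returned dictionary can be passed to json.dumps and posted exactly as in the scripts above.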
