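Developing a Speech Recognition Program in C#
Overview
Speech recognition is a popular field with applications such as voice assistants and audio transcription. This article shows how to build a speech recognition program in C#.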
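Preparation
Before writing any code, install a speech recognition library. This article uses the Microsoft Speech Platform SDK 11 (the Microsoft.Speech assembly). It is not part of Windows, so it has to be downloaded from the Microsoft website and installed, together with the Microsoft Speech Platform Runtime 11 and at least one runtime language pack (for example en-US). The similarly named System.Speech assembly, which ships with the .NET Framework and uses the desktop recognizer built into Windows, is a separate library with a largely parallel API.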
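A quick way to confirm that the installation worked is to list the recognizers the runtime can see. The following is a minimal sketch, assuming the project already references Microsoft.Speech.dll (adding the reference is covered in the steps below); the class name ListRecognizers is made up for illustration.

```csharp
using System;
using Microsoft.Speech.Recognition;

public static class ListRecognizers
{
    public static void Main()
    {
        // Each installed runtime language shows up as one RecognizerInfo entry.
        var recognizers = SpeechRecognitionEngine.InstalledRecognizers();
        if (recognizers.Count == 0)
        {
            Console.WriteLine("No recognizers found; a runtime language pack is probably missing.");
            return;
        }

        foreach (RecognizerInfo info in recognizers)
        {
            // Print the culture (for example en-US) and the recognizer's display name.
            Console.WriteLine(info.Culture.Name + ": " + info.Name);
        }
    }
}
```

If the culture you plan to recognize does not appear in the list, the matching language pack likely still needs to be installed.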
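Steps
1. Add the reference
Open Visual Studio, create a new console application, and add a reference to Microsoft.Speech.dll: right-click the project name, choose "Add" -> "Reference" -> "Browse", locate Microsoft.Speech.dll (typically in the Assembly folder of the SDK installation directory), and add it.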
2. Configure the recognition engine
Add the following code to the program to configure the recognition engine:
using System;
using System.Collections.Generic;
using System.Globalization;
using Microsoft.Speech.Recognition;
using Microsoft.Speech.Recognition.SrgsGrammar;

public class SpeechRecognition
{
    private readonly SpeechRecognitionEngine _recognizer;

    public SpeechRecognition()
    {
        // Use the en-US runtime language; construction fails if it is not installed.
        _recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US"));
        AddCommands();
    }

    public void AddCommands()
    {
        // The grammar accepts exactly these two phrases.
        var command = new GrammarBuilder(new Choices(new string[] { "hello", "how are you" }));
        // The grammar's culture has to match the recognizer's culture.
        command.Culture = _recognizer.RecognizerInfo.Culture;
        var grammar = new Grammar(command);
        _recognizer.LoadGrammar(grammar);
        _recognizer.SpeechRecognized += Recognizer_SpeechRecognized;
    }

    public void Start()
    {
        _recognizer.SetInputToDefaultAudioDevice();
        // Recognize continuously in the background until the user presses Enter.
        _recognizer.RecognizeAsync(RecognizeMode.Multiple);
        Console.ReadLine();
    }

    private void Recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Console.WriteLine("Recognized: " + e.Result.Text);
    }
}
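The code above defines a class named SpeechRecognition. Its constructor creates the recognition engine and stores it in the _recognizer field, the AddCommands method loads a grammar containing a few commands and subscribes to the SpeechRecognized event, and the Start method selects the default microphone and starts recognition. Whenever the user speaks one of the commands, the program prints "Recognized: " followed by the recognized text to the console.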
3. Run the program
Add the following code to the program's Main method:
var speak = new SpeechRecognition();
speak.Start();
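Run the program and say "hello" or "how are you" into the microphone. The program should recognize the phrase and print the result, for example "Recognized: hello". Press Enter to exit.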
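In a noisy room the engine can match a command with low confidence, or reject the audio altogether. The sketch below is one possible way to handle both cases in a standalone console program; the 0.6 threshold and the class name ConfidenceDemo are illustrative assumptions, not values taken from the SDK.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

public static class ConfidenceDemo
{
    // Hypothetical cut-off; tune it for your microphone and environment.
    private const float MinConfidence = 0.6f;

    public static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            var builder = new GrammarBuilder(new Choices("hello", "how are you"));
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));

            recognizer.SpeechRecognized += (s, e) =>
            {
                // Confidence is a value between 0 and 1 assigned by the engine.
                if (e.Result.Confidence >= MinConfidence)
                {
                    Console.WriteLine("Recognized: " + e.Result.Text);
                }
                else
                {
                    Console.WriteLine("Ignored low-confidence match: " + e.Result.Text);
                }
            };

            // Raised when the audio does not match any loaded grammar well enough.
            recognizer.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("Speech was not recognized.");

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine();
        }
    }
}
```

Filtering in the handler keeps the grammar unchanged, so the threshold can be adjusted without touching the command list.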
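Examples
Example 1: A simple calculator
Voice commands can drive a simple calculator. For example, if the user says "What is two plus two", the program recognizes the command and prints the result, 4.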
// Replaces AddCommands in the SpeechRecognition class (needs "using System;").
public void AddCommands()
{
    // Number words; each phrase carries its numeric value as a semantic value.
    var words = new[]
    {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };
    var numbers = new Choices();
    for (int i = 0; i < words.Length; i++)
    {
        numbers.Add(new SemanticResultValue(words[i], i));
    }

    // Operator phrases mapped to operator symbols.
    var operators = new Choices();
    operators.Add(new SemanticResultValue("plus", "+"));
    operators.Add(new SemanticResultValue("minus", "-"));
    operators.Add(new SemanticResultValue("times", "*"));
    operators.Add(new SemanticResultValue("divided by", "/"));

    // Sentence pattern: "what is <a> <op> <b>".
    var command = new GrammarBuilder("what is") { Culture = _recognizer.RecognizerInfo.Culture };
    command.Append(new SemanticResultKey("a", new GrammarBuilder(numbers)));
    command.Append(new SemanticResultKey("op", new GrammarBuilder(operators)));
    command.Append(new SemanticResultKey("b", new GrammarBuilder(numbers)));
    _recognizer.LoadGrammar(new Grammar(command));

    _recognizer.SpeechRecognized += (s, e) =>
    {
        // Read the operands and the operator back from the semantic result.
        int a = Convert.ToInt32(e.Result.Semantics["a"].Value);
        int b = Convert.ToInt32(e.Result.Semantics["b"].Value);
        string op = Convert.ToString(e.Result.Semantics["op"].Value);
        if (op == "/" && b == 0)
        {
            Console.WriteLine("Cannot divide by zero.");
            return;
        }
        double result = op == "+" ? a + b : op == "-" ? a - b : op == "*" ? a * b : (double)a / b;
        Console.WriteLine("result = " + result);
    };
}
The code above uses a GrammarBuilder object to define the grammar for the voice command: the fixed phrase "what is", followed by a number, one of four operators, and a second number. The original listing relied on SRGS classes that do not exist in the SDK, so this version expresses the same idea with Choices, SemanticResultValue, and SemanticResultKey. Each number word and operator phrase carries a semantic value, which the SpeechRecognized handler reads back to compute and print the result.
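The arithmetic step can also be exercised without a microphone. The sketch below is an addition for illustration, not part of the example above: it evaluates already-recognized text of the form "what is <number> <operator> <number>". The class name SpokenCalculator and its limited vocabulary (zero to ten) are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class SpokenCalculator
{
    // The position of each word in the array is its numeric value.
    private static readonly string[] NumberWords =
    {
        "zero", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten"
    };

    private static readonly Dictionary<string, Func<double, double, double>> Operators =
        new Dictionary<string, Func<double, double, double>>
        {
            { "plus", (a, b) => a + b },
            { "minus", (a, b) => a - b },
            { "times", (a, b) => a * b },
            { "divided by", (a, b) => a / b }
        };

    // Returns null when the phrase does not match the expected pattern
    // or asks for a division by zero.
    public static double? Evaluate(string phrase)
    {
        string text = phrase.Trim().ToLowerInvariant();
        const string prefix = "what is ";
        if (!text.StartsWith(prefix, StringComparison.Ordinal))
        {
            return null;
        }
        text = text.Substring(prefix.Length);

        foreach (var op in Operators)
        {
            // Look for the operator phrase surrounded by spaces.
            int index = text.IndexOf(" " + op.Key + " ", StringComparison.Ordinal);
            if (index < 0)
            {
                continue;
            }

            int a = Array.IndexOf(NumberWords, text.Substring(0, index));
            int b = Array.IndexOf(NumberWords, text.Substring(index + op.Key.Length + 2));
            if (a < 0 || b < 0 || (op.Key == "divided by" && b == 0))
            {
                return null;
            }
            return op.Value(a, b);
        }
        return null;
    }
}
```

For example, SpokenCalculator.Evaluate("What is two plus two") returns 4, and a phrase outside the pattern returns null. A helper like this could be called from the SpeechRecognized handler with e.Result.Text when semantic values are not available.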
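Example 2: Real-time speech translation
Speech recognition can also feed a real-time translator. For example, when the user says "Hello,我是中国人", the program sends the recognized Chinese text to a translation service and prints the English result: "Hello, I am Chinese".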
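// Replaces AddCommands in the SpeechRecognition class. Needs: using System; using System.IO;
// using System.Net; using System.Text; using System.Text.RegularExpressions; using System.Web;
public void AddCommands()
{
    var requestUri = "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=en";

    // Free-form dictation grammar: accept any speech instead of a fixed command list.
    var command = new GrammarBuilder();
    command.AppendDictation();
    _recognizer.LoadGrammar(new Grammar(command));

    _recognizer.SpeechRecognized += (s, e) =>
    {
        if (string.IsNullOrEmpty(e.Result.Text))
        {
            return;
        }

        // A web request cannot be reused, so create a new one for every utterance.
        var request = (HttpWebRequest)WebRequest.Create(requestUri);
        request.Method = "POST";
        request.ContentType = "application/json; charset=utf-8";
        request.Headers.Add("Ocp-Apim-Subscription-Key", "<YOUR_KEY_HERE>");

        // The text is embedded in a JSON string, so escape it for JSON rather than URL-encoding it.
        var postData = "[{\"Text\": \"" + HttpUtility.JavaScriptStringEncode(e.Result.Text) + "\"}]";
        var data = Encoding.UTF8.GetBytes(postData);
        using (var stream = request.GetRequestStream())
        {
            stream.Write(data, 0, data.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            var responseString = reader.ReadToEnd();
            // Rough extraction of the first "text" field; a JSON parser is more robust.
            var translatedText = Regex.Match(responseString, "\"text\"\\s*:\\s*\"([^\"]+)\"", RegexOptions.IgnoreCase).Groups[1].Value;
            Console.WriteLine(translatedText);
        }
    };
}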
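The code above first stores the address of the translation API in a variable, then builds a dictation grammar. Each time speech is recognized, the handler sends the recognized text to the translation service and prints the returned translation to the console. Two caveats apply. The recognizer has to match the spoken language, so recognizing Chinese requires a Chinese recognizer rather than en-US. Free-form dictation is a feature of the desktop recognizers used by System.Speech; the Microsoft Speech Platform is aimed at command grammars, so this example may need the System.Speech.Recognition namespace instead.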
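Building the request body and reading the response are less fragile with a JSON library than with string concatenation and regular expressions. The sketch below assumes System.Text.Json is available (built into .NET Core 3.0 and later, a NuGet package on .NET Framework) and assumes the response is a JSON array whose first element holds a "translations" array. The class name TranslatorJson is made up for illustration.

```csharp
using System;
using System.Text.Json;

public static class TranslatorJson
{
    // Builds the request body: a JSON array with one object holding the text.
    // The serializer takes care of escaping quotes and other special characters.
    public static string BuildBody(string text)
    {
        return JsonSerializer.Serialize(new[] { new { Text = text } });
    }

    // Reads translations[0].text from the first element of the response array.
    public static string ExtractTranslation(string responseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(responseJson))
        {
            return doc.RootElement[0]
                .GetProperty("translations")[0]
                .GetProperty("text")
                .GetString();
        }
    }
}
```

For instance, TranslatorJson.BuildBody("Hello") yields [{"Text":"Hello"}], and ExtractTranslation applied to a response such as [{"translations":[{"text":"Hello, I am Chinese","to":"en"}]}] returns the translated sentence.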
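Conclusion
The steps above show how to build a speech recognition program in C#. In a real application, the grammar rules and the handling of recognized commands can be adapted to the task at hand. The two examples, a calculator and a real-time speech translator, show how the same recognition loop can drive different features.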
Unless otherwise noted, articles on this site are original. If you republish this article, please credit the source: c# 开发语音识别程序 - Python技术站