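Developing a Speech Recognition Program in C#

Overview

Speech recognition is an active field with applications such as voice assistants and audio transcription. This article shows how to build a speech recognition program in C#.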

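Preparation

Before writing any code, install the speech recognition library. This article uses the Microsoft Speech Platform SDK 11 and its Microsoft.Speech.dll assembly. The Speech Platform is not part of a default Windows installation: download and install the SDK, the matching Speech Platform Runtime 11, and at least one recognition language pack (for example en-US) from the Microsoft website. The System.Speech assembly that ships with the .NET Framework offers a similar API and uses the desktop recognizer built into Windows.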

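Steps

1. Add a reference

Open Visual Studio, create a new console application, and add a reference to Microsoft.Speech.dll: right-click the project name -> "Add" -> "Reference" -> "Browse", locate the Microsoft.Speech.dll file, and add it.

2. Configure the recognition engine

Add the following code to the program to configure the recognition engine: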

using System;
using System.Collections.Generic;
using System.Globalization;
using Microsoft.Speech.Recognition;
using Microsoft.Speech.Recognition.SrgsGrammar;

public class SpeechRecognition
{
    private readonly SpeechRecognitionEngine _recognizer;

    public SpeechRecognition()
    {
        // Use the installed US English recognizer. The matching language
        // pack must be installed, otherwise the constructor throws.
        _recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US"));
        AddCommands();
    }

    public void AddCommands()
    {
        // The grammar accepts exactly these two phrases.
        var command = new GrammarBuilder(new Choices("hello", "how are you"));
        // The grammar culture must match the recognizer culture.
        command.Culture = _recognizer.RecognizerInfo.Culture;
        var grammar = new Grammar(command);
        _recognizer.LoadGrammar(grammar);
        _recognizer.SpeechRecognized += Recognizer_SpeechRecognized;
    }

    public void Start()
    {
        _recognizer.SetInputToDefaultAudioDevice();
        // Multiple mode keeps recognizing until recognition is stopped or cancelled.
        _recognizer.RecognizeAsync(RecognizeMode.Multiple);
        // Keep the process alive until Enter is pressed.
        Console.ReadLine();
    }

    private void Recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Console.WriteLine("Recognized: " + e.Result.Text);
    }
}

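In the code above, the SpeechRecognition class creates the recognition engine in its constructor and stores it in the _recognizer field. The AddCommands method registers the commands to listen for, and the Start method begins recognition. When the user speaks one of the registered commands, the program prints "Recognized:" followed by the recognized command text to the console.

3. Run the program

Add the following code to the program's Main method: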

var speak = new SpeechRecognition();
speak.Start();

Run the program and say "Hello" or "How are you" into the microphone. The program should recognize the speech and print the result.
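With a live microphone, the recognizer can match a phrase it is not very sure about. Each recognition result carries a Confidence value between 0 and 1, and a common pattern is to ignore results below a cutoff. The sketch below is a minimal illustration of that idea: the RecognitionFilter class and the 0.7 default are assumptions made for this example, not part of the Speech Platform API, and the right cutoff depends on the microphone and environment.

```csharp
using System;

// Hypothetical helper (not part of Microsoft.Speech): decides whether a
// recognition result is trustworthy enough to act on.
public static class RecognitionFilter
{
    // Assumed default cutoff; tune it for the actual audio setup.
    public const float DefaultThreshold = 0.7f;

    // Returns true when the confidence reaches the threshold.
    public static bool IsAccepted(float confidence, float threshold = DefaultThreshold)
    {
        if (threshold < 0f || threshold > 1f)
        {
            throw new ArgumentOutOfRangeException(nameof(threshold));
        }
        return confidence >= threshold;
    }
}
```

Inside Recognizer_SpeechRecognized, the check would wrap the output, for example: if (RecognitionFilter.IsAccepted(e.Result.Confidence)) { Console.WriteLine("Recognized: " + e.Result.Text); }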

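Examples

Example 1: A simple calculator

Speech commands can drive a simple calculator. For example, if the user says "What is 2 plus 2", the program recognizes the command and outputs the result: "4".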

// Replaces AddCommands in the SpeechRecognition class.
// Grammar: "what is <a> <operation> <b>", where a and b are single digits.
public void AddCommands()
{
    // Attach the integer value to each spoken digit.
    var words = new List<string> { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };
    var numbers = new Choices();
    for (int i = 0; i < words.Count; i++)
    {
        numbers.Add(new GrammarBuilder(new SemanticResultValue(words[i], i)));
    }

    // Attach the operator symbol to each spoken operation.
    var operations = new Choices();
    operations.Add(new GrammarBuilder(new SemanticResultValue("plus", "+")));
    operations.Add(new GrammarBuilder(new SemanticResultValue("minus", "-")));
    operations.Add(new GrammarBuilder(new SemanticResultValue("times", "*")));
    operations.Add(new GrammarBuilder(new SemanticResultValue("divided by", "/")));

    // Store each part of the sentence under a named semantic key.
    var builder = new GrammarBuilder("what is");
    builder.Culture = _recognizer.RecognizerInfo.Culture;
    builder.Append(new SemanticResultKey("a", new GrammarBuilder(numbers)));
    builder.Append(new SemanticResultKey("op", new GrammarBuilder(operations)));
    builder.Append(new SemanticResultKey("b", new GrammarBuilder(numbers)));
    _recognizer.LoadGrammar(new Grammar(builder));

    _recognizer.SpeechRecognized += (s, e) =>
    {
        // Read the semantic values back and evaluate the expression.
        int a = Convert.ToInt32(e.Result.Semantics["a"].Value);
        int b = Convert.ToInt32(e.Result.Semantics["b"].Value);
        string op = (string)e.Result.Semantics["op"].Value;
        if (op == "/" && b == 0)
        {
            Console.WriteLine("Cannot divide by zero");
            return;
        }
        double result = op == "+" ? a + b
                      : op == "-" ? a - b
                      : op == "*" ? a * b
                      : (double)a / b;
        Console.WriteLine(result);
    };
}

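In the code above, a GrammarBuilder object defines the grammar for the spoken command. The grammar covers four operators, each of which combines with two numbers to form an expression. Semantic values attached to the recognized phrases let the program parse the command and compute the result.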
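To accept numbers beyond single digits, the number phrases fed to the grammar can be generated instead of typed by hand. The following sketch builds a phrase-to-value table for 0 through 99. The NumberWords class is a hypothetical helper introduced for illustration, and it assumes unhyphenated spoken forms such as "forty two".

```csharp
using System.Collections.Generic;

// Hypothetical helper: maps spoken English number phrases to their values.
public static class NumberWords
{
    private static readonly string[] Units =
    {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };

    private static readonly string[] Tens =
    {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    // Builds the table for every integer from 0 to 99.
    public static Dictionary<string, int> Build()
    {
        var map = new Dictionary<string, int>();
        for (int n = 0; n < 100; n++)
        {
            string phrase = n < 20 ? Units[n]
                          : n % 10 == 0 ? Tens[n / 10]
                          : Tens[n / 10] + " " + Units[n % 10];
            map[phrase] = n;
        }
        return map;
    }
}
```

Each entry can then be added to a Choices object as new SemanticResultValue(pair.Key, pair.Value), so the recognized phrase arrives in the event handler already converted to an integer.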

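Example 2: Real-time speech translation

Speech recognition can also feed real-time translation. For example, when the user says "Hello，我是中国人", the program translates the Chinese into English and outputs the result: "Hello, I am Chinese".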

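// Requires: using System.IO; using System.Net; using System.Text;
// using System.Text.RegularExpressions; using System.Web;
public void AddCommands()
{
    var requestURI = "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=en";

    // Free-form dictation grammar: accepts arbitrary speech instead of fixed commands.
    var command = new GrammarBuilder();
    command.AppendDictation();
    var grammar = new Grammar(command);

    _recognizer.LoadGrammar(grammar);
    _recognizer.SpeechRecognized += (s, e) =>
    {
        if (string.IsNullOrEmpty(e.Result.Text))
        {
            return;
        }

        // A WebRequest instance can be sent only once, so create one per utterance.
        var request = (HttpWebRequest)WebRequest.Create(requestURI);
        request.Method = "POST";
        request.ContentType = "application/json; charset=utf-8";
        request.Headers.Add("Ocp-Apim-Subscription-Key", "<YOUR_KEY_HERE>");
        // Regional or multi-service resources also need the Ocp-Apim-Subscription-Region header.

        // The body is JSON, so escape the text as a JSON string instead of URL-encoding it.
        var postData = "[{\"Text\": \"" + HttpUtility.JavaScriptStringEncode(e.Result.Text) + "\"}]";
        var data = Encoding.UTF8.GetBytes(postData);
        using (var stream = request.GetRequestStream())
        {
            stream.Write(data, 0, data.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            var responseString = reader.ReadToEnd();
            // Crude extraction of the first "text" field; it does not handle escaped quotes.
            var translatedText = Regex.Match(responseString, "\"text\"\\s*:\\s*\"([^\"]+)\"", RegexOptions.IgnoreCase).Groups[1].Value;
            Console.WriteLine(translatedText);
        }
    };
}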

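In the code above, the requestURI variable holds the address of the translation API, and the grammar is a free-form dictation grammar rather than a fixed command list. Once speech is recognized, the program sends a translation request and prints the returned translation to the console. Dictation support depends on the installed recognizer: the Speech Platform recognizers are aimed at command grammars, so free-form input may require the desktop recognizer exposed through System.Speech, together with a recognizer for the language being spoken.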
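Building and reading the JSON by hand is fragile once the text contains quotes or escape sequences. As a hedged alternative, the sketch below uses System.Text.Json, which is built into .NET Core 3.0 and later and is assumed here to be added as a NuGet package on .NET Framework. The TranslatorJson class and its method names are hypothetical, and the response shape assumed is the Translator v3 array containing a "translations" list.

```csharp
using System.Text.Json;

// Hypothetical helper for the Translator request and response bodies.
public static class TranslatorJson
{
    // Serializes the text into the request body shape [{"Text":"..."}].
    public static string BuildRequestBody(string text)
    {
        return JsonSerializer.Serialize(new[] { new { Text = text } });
    }

    // Returns the first translated text, or null when the response has none.
    public static string ExtractTranslation(string responseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(responseJson))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Array || root.GetArrayLength() == 0)
            {
                return null;
            }

            JsonElement translations;
            if (!root[0].TryGetProperty("translations", out translations)
                || translations.ValueKind != JsonValueKind.Array
                || translations.GetArrayLength() == 0)
            {
                return null;
            }

            return translations[0].GetProperty("text").GetString();
        }
    }
}
```

In the event handler, BuildRequestBody(e.Result.Text) would replace the string concatenation and ExtractTranslation(responseString) would replace the regular expression.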

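Conclusion

The steps above show how to build a speech recognition program in C#. In practice, the grammar rules and the command handling can be adapted to each application's needs. The two examples, a calculator and real-time speech translation, show how recognized speech can drive other functionality.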
