For continuous recognition / dictation mode you have a few options. You can use Android's built-in Google speech recognition, although continuous recognition is discouraged (from https://developer.android.com/reference/android/speech/SpeechRecognizer.html):

"The implementation of this API is likely to stream audio to remote servers to perform speech recognition. As such this API is not intended to be used for continuous recognition, which would consume a significant amount of battery and bandwidth."

If you really need it anyway, you can work around this by creating your own class that implements IRecognitionListener. (I wrote this in Xamarin.Android; the syntax is very similar to native Android.)
public class CustomRecognizer : Java.Lang.Object, IRecognitionListener, TextToSpeech.IOnInitListener
{
    private readonly Context _context;
    private SpeechRecognizer _speech;
    private Intent _speechIntent;
    private TextToSpeech txtspeech; // assign your TextToSpeech instance here if you use one
    public string Words;

    public CustomRecognizer(Context context)
    {
        _context = context;
        Words = "";
        _speech = SpeechRecognizer.CreateSpeechRecognizer(_context);
        _speech.SetRecognitionListener(this);
        _speechIntent = new Intent(RecognizerIntent.ActionRecognizeSpeech);
        _speechIntent.PutExtra(RecognizerIntent.ExtraLanguageModel, RecognizerIntent.LanguageModelFreeForm);
        _speechIntent.PutExtra(RecognizerIntent.ExtraPreferOffline, true);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputCompleteSilenceLengthMillis, 1000);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputPossiblyCompleteSilenceLengthMillis, 1000);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputMinimumLengthMillis, 1500);
    }

    // Tear the recognizer down and start listening again;
    // called after every result or error so recognition never stops.
    void startover()
    {
        _speech.Destroy();
        _speech = SpeechRecognizer.CreateSpeechRecognizer(_context);
        _speech.SetRecognitionListener(this);
        _speechIntent = new Intent(RecognizerIntent.ActionRecognizeSpeech);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputCompleteSilenceLengthMillis, 1000);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputPossiblyCompleteSilenceLengthMillis, 1000);
        _speechIntent.PutExtra(RecognizerIntent.ExtraSpeechInputMinimumLengthMillis, 1500);
        StartListening();
    }

    public void StartListening()
    {
        _speech.StartListening(_speechIntent);
    }

    public void StopListening()
    {
        _speech.StopListening();
    }

    public void OnBeginningOfSpeech() { }

    public void OnBufferReceived(byte[] buffer) { }

    public void OnEndOfSpeech() { }

    public void OnError([GeneratedEnum] SpeechRecognizerError error)
    {
        Words = error.ToString();
        startover();
    }

    public void OnEvent(int eventType, Bundle @params) { }

    public void OnPartialResults(Bundle partialResults) { }

    public void OnReadyForSpeech(Bundle @params) { }

    public void OnResults(Bundle results)
    {
        var matches = results.GetStringArrayList(SpeechRecognizer.ResultsRecognition);
        if (matches == null)
            Words = "Null";
        else if (matches.Count != 0)
            Words = matches[0];
        else
            Words = "";
        // do anything you want with the result here
        startover();
    }

    public void OnRmsChanged(float rmsdB) { }

    public void OnInit([GeneratedEnum] OperationResult status)
    {
        if (status == OperationResult.Error)
            txtspeech.SetLanguage(Java.Util.Locale.Default);
    }
}
Call it from your Activity:
void StartRecording()
{
    if (!PackageManager.HasSystemFeature(PackageManager.FeatureMicrophone))
    {
        // no microphone, no recording: disable the button and show an alert
        Toast.MakeText(this, "NO MICROPHONE", ToastLength.Short).Show();
    }
    else
    {
        // you can pass any object you want your recognizer to hold on to (here, the Activity)
        CustomRecognizer voice = new CustomRecognizer(this);
        voice.StartListening();
    }
}
Don't forget to request permission to use the microphone!
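A minimal sketch of that permission check, assuming your project references the AndroidX (or support-library) ActivityCompat/ContextCompat helpers; RequestRecordAudioId is an arbitrary request code I made up for this example, everything else is the standard Android API. You also need `<uses-permission android:name="android.permission.RECORD_AUDIO" />` in AndroidManifest.xml.

```csharp
// Sketch only: on Android 6.0+ RECORD_AUDIO must also be granted at runtime.
const int RequestRecordAudioId = 101; // arbitrary request code for this example

void EnsureMicrophonePermission()
{
    if (ContextCompat.CheckSelfPermission(this, Manifest.Permission.RecordAudio)
        != Permission.Granted)
    {
        ActivityCompat.RequestPermissions(
            this,
            new string[] { Manifest.Permission.RecordAudio },
            RequestRecordAudioId);
    }
}
```

Call this before StartRecording(); the result arrives in OnRequestPermissionsResult with the same request code.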
Explanation:

- This removes the annoying "tap to start recording" step.
- It records from the moment you call StartListening() and never stops, because startover() (which calls StartListening() again) runs every time a recording finishes.
- It is still a pretty ugly workaround: while a result is being processed, the recognizer receives no audio input until StartListening() is called again (there is no way around that gap).
- Google's recognizer is not great for voice commands, because the language model targets "[language] sentences": you cannot restrict the vocabulary, and Google will always try to produce a "good sentence".
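One common mitigation for the "good sentence" problem (not part of the original code; the command list and MatchCommand helper below are hypothetical) is to match the recognized text against your own command list inside OnResults instead of trusting the sentence as-is:

```csharp
// Hypothetical helper: map a free-form recognition result to a known command.
// "lights on" / "lights off" / "stop" are example commands, not part of any API.
static readonly string[] Commands = { "lights on", "lights off", "stop" };

static string MatchCommand(string recognized)
{
    if (string.IsNullOrEmpty(recognized))
        return null;
    var lower = recognized.ToLowerInvariant();
    foreach (var cmd in Commands)
        if (lower.Contains(cmd))
            return cmd; // first known phrase found in the sentence
    return null; // the "good sentence" contained no known command: ignore it
}
```

Call MatchCommand(Words) after OnResults fires; any result that does not contain a known phrase is simply ignored.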
For better results and a better user experience I really recommend the Google Cloud Speech API (but it has to be online, and it is not cheap). A second suggestion is CMUSphinx / PocketSphinx, which is open source and can run offline, but you have to do everything by hand.

Advantages of PocketSphinx:

- You can create your own dictionary
- Works in offline mode
- You can train the acoustic model yourself (your voice, etc.), so you can tune it to your environment and pronunciation
- You can get real-time results by reading "PartialResult"

Disadvantage of PocketSphinx: you have to do everything manually, from setting up the acoustic model, dictionary, language model, thresholds, and so on (overkill if you just want something simple).