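How do I integrate a custom speech recognition engine with Android's android.speech module?
Yes: how to integrate a custom ASR engine with the android.speech module
You can combine your own speech recognition service with Android's android.speech module. The core idea is to implement a RecognitionService subclass that talks to your server API, so that SpeechRecognizer can route recognition requests to your service. The concrete steps and key code follow.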
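1. Implement a custom RecognitionService
RecognitionService is the base class for speech recognition services on Android. Subclass it and implement its abstract methods (onStartListening, onStopListening, onCancel) to capture audio, upload it to your API, and return the recognition result.
Key implementation code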
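    import android.content.Intent;
    import android.media.AudioFormat;
    import android.media.AudioRecord;
    import android.media.MediaRecorder;
    import android.os.Bundle;
    import android.os.RemoteException;
    import android.speech.RecognitionService;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;
    import android.util.Log;

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import okhttp3.MediaType;
    import okhttp3.MultipartBody;
    import okhttp3.OkHttpClient;
    import okhttp3.Request;
    import okhttp3.RequestBody;
    import okhttp3.Response;

    // Note: MediaRecorder has no WAV output format or PCM encoder, so this version
    // captures raw 16-bit PCM with AudioRecord and prepends a WAV header itself.
    public class CustomASRService extends RecognitionService {
        private static final String TAG = "CustomASRService";
        private static final int SAMPLE_RATE = 16000; // match the sample rate your API expects
        private static final String API_URL = "http://192.168.1.100/ASR/demoSpeechToText";

        private final OkHttpClient client = new OkHttpClient();
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private volatile boolean recording;
        private volatile boolean cancelled;

        @Override
        protected void onStartListening(Intent recognizerIntent, Callback callback) {
            // Take the language from the client's intent, falling back to zh-CN.
            String extra = recognizerIntent.getStringExtra(RecognizerIntent.EXTRA_LANGUAGE);
            String language = extra != null ? extra : "zh-CN";

            int bufferSize = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
            if (bufferSize <= 0) { // negative values are error codes
                reportError(callback, SpeechRecognizer.ERROR_AUDIO);
                return;
            }
            AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
            if (recorder.getState() != AudioRecord.STATE_INITIALIZED) {
                recorder.release();
                reportError(callback, SpeechRecognizer.ERROR_AUDIO);
                return;
            }

            cancelled = false;
            recording = true;
            recorder.startRecording();
            // Capture and upload off the main thread; this callback runs on the main thread.
            executor.execute(() -> {
                ByteArrayOutputStream pcm = new ByteArrayOutputStream();
                byte[] buffer = new byte[bufferSize];
                int n;
                while (recording && (n = recorder.read(buffer, 0, buffer.length)) >= 0) {
                    pcm.write(buffer, 0, n);
                }
                recorder.stop();
                recorder.release();
                if (!cancelled) {
                    uploadAudio(toWav(pcm.toByteArray()), language, callback);
                }
            });
        }

        @Override
        protected void onStopListening(Callback callback) {
            recording = false; // the capture loop exits and the audio is uploaded
        }

        @Override
        protected void onCancel(Callback callback) {
            cancelled = true;  // discard the audio; nothing is reported after a cancel
            recording = false;
        }

        /** Prepends a canonical 44-byte WAV header for 16-bit mono PCM. */
        private static byte[] toWav(byte[] pcm) {
            ByteBuffer wav = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
            wav.put("RIFF".getBytes(StandardCharsets.US_ASCII)).putInt(36 + pcm.length);
            wav.put("WAVEfmt ".getBytes(StandardCharsets.US_ASCII)).putInt(16);
            wav.putShort((short) 1).putShort((short) 1);      // format 1 = PCM, 1 channel
            wav.putInt(SAMPLE_RATE).putInt(SAMPLE_RATE * 2);  // sample rate, byte rate
            wav.putShort((short) 2).putShort((short) 16);     // block align, bits per sample
            wav.put("data".getBytes(StandardCharsets.US_ASCII)).putInt(pcm.length).put(pcm);
            return wav.array();
        }

        // Uses OkHttp for the upload; swap in whichever HTTP library you prefer.
        private void uploadAudio(byte[] wav, String language, Callback callback) {
            RequestBody body = new MultipartBody.Builder()
                    .setType(MultipartBody.FORM)
                    .addFormDataPart("audio", "temp_asr.wav",
                            RequestBody.create(MediaType.parse("audio/wav"), wav))
                    .addFormDataPart("language", language)
                    .build();
            Request request = new Request.Builder().url(API_URL).post(body).build();
            try (Response response = client.newCall(request).execute()) {
                if (cancelled) {
                    return;
                }
                if (!response.isSuccessful() || response.body() == null) {
                    reportError(callback, SpeechRecognizer.ERROR_SERVER);
                    return;
                }
                // Build the result bundle and hand it back to the client.
                ArrayList<String> matches = new ArrayList<>();
                matches.add(response.body().string());
                Bundle results = new Bundle();
                results.putStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION, matches);
                callback.endOfSpeech();
                callback.results(results);
            } catch (IOException e) {
                reportError(callback, SpeechRecognizer.ERROR_NETWORK);
            } catch (RemoteException e) {
                Log.w(TAG, "Client disconnected before results were delivered", e);
            }
        }

        // Callback.error() takes a SpeechRecognizer.ERROR_* code and may throw RemoteException.
        private void reportError(Callback callback, int errorCode) {
            try {
                callback.error(errorCode);
            } catch (RemoteException e) {
                Log.w(TAG, "Client disconnected before the error was delivered", e);
            }
        }
    }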
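Before uploading, it can help to sanity-check that the audio bytes really describe the format your API expects. The following is a minimal sketch in plain Java, assuming a canonical 44-byte PCM WAV header with the fmt chunk first (files with extra chunks before it would need a real parser). WavHeaderCheck and matches are hypothetical names introduced here for illustration; they are not part of Android or of the code above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks the fields of a canonical 44-byte WAV header.
final class WavHeaderCheck {
    private WavHeaderCheck() {}

    /** Returns true if the header describes PCM audio with the expected parameters. */
    static boolean matches(byte[] wav, int sampleRate, int channels, int bitsPerSample) {
        if (wav == null || wav.length < 44) {
            return false; // too short to contain a canonical header
        }
        String riff = new String(wav, 0, 4, StandardCharsets.US_ASCII);
        String waveFmt = new String(wav, 8, 8, StandardCharsets.US_ASCII);
        ByteBuffer b = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        return riff.equals("RIFF")
                && waveFmt.equals("WAVEfmt ")
                && b.getShort(20) == 1              // audio format 1 = uncompressed PCM
                && b.getShort(22) == channels       // channel count
                && b.getInt(24) == sampleRate       // samples per second
                && b.getShort(34) == bitsPerSample; // bit depth
    }
}
```

For the 16 kHz, mono, 16-bit audio used in this answer, the check would be WavHeaderCheck.matches(wavBytes, 16000, 1, 16); if it returns false, report an audio error to the client instead of uploading.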
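2. Register the service in AndroidManifest.xml
Declare your custom service in the manifest and give it an intent filter for the android.speech.RecognitionService action, so that the system can discover it as a speech recognition service: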
    <manifest ...>
        <uses-permission android:name="android.permission.RECORD_AUDIO" />
        <uses-permission android:name="android.permission.INTERNET" />
        <!-- Only relevant on old Android versions, and only if the service writes audio
             to external storage; getExternalFilesDir() needs no permission from API 19 on -->
        <uses-permission
            android:name="android.permission.WRITE_EXTERNAL_STORAGE"
            android:maxSdkVersion="18" />

        <application ...>
            <service
                android:name=".CustomASRService"
                android:exported="true">
                <intent-filter>
                    <action android:name="android.speech.RecognitionService" />
                </intent-filter>
            </service>
        </application>
    </manifest>
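3. Call the custom ASR service from your app
You can now drive your own service through SpeechRecognizer, the same way you would use the system recognizer. The one difference is that you must pass the service's ComponentName to createSpeechRecognizer; the one-argument overload binds to the device's default recognizer, not to CustomASRService:
    // Fragment of an Activity. Needs imports for android.Manifest, android.content.ComponentName,
    // android.content.Intent, android.content.pm.PackageManager, android.os.Bundle,
    // android.speech.*, androidx.core.app.ActivityCompat, androidx.core.content.ContextCompat
    // and java.util.ArrayList.
    private SpeechRecognizer speechRecognizer;
    private Intent recognizerIntent;

    private void initSpeechRecognizer() {
        // Bind explicitly to the custom service declared in this app's manifest.
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(
                this, new ComponentName(this, CustomASRService.class));
        speechRecognizer.setRecognitionListener(new RecognitionListener() {
            @Override
            public void onResults(Bundle results) {
                ArrayList<String> matches =
                        results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                if (matches != null && !matches.isEmpty()) {
                    String recognizedText = matches.get(0);
                    // Handle the recognized text
                }
            }

            @Override
            public void onError(int error) {
                // Handle failures such as network errors or a failed recording;
                // error is one of the SpeechRecognizer.ERROR_* codes
            }

            // The remaining RecognitionListener methods must be implemented too; fill in as needed.
            @Override public void onReadyForSpeech(Bundle params) {}
            @Override public void onBeginningOfSpeech() {}
            @Override public void onRmsChanged(float rmsdB) {}
            @Override public void onBufferReceived(byte[] buffer) {}
            @Override public void onEndOfSpeech() {}
            @Override public void onPartialResults(Bundle partialResults) {}
            @Override public void onEvent(int eventType, Bundle params) {}
        });

        recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "zh-CN");
    }

    // Start recognition
    public void startRecognition() {
        if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
                != PackageManager.PERMISSION_GRANTED) {
            ActivityCompat.requestPermissions(
                    this, new String[]{Manifest.permission.RECORD_AUDIO}, 100);
            return;
        }
        speechRecognizer.startListening(recognizerIntent);
    }

    // Stop recognition
    public void stopRecognition() {
        speechRecognizer.stopListening();
    }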
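Key points to watch
- Audio format: make sure the sample rate, bit depth, and channel count of the recorded audio exactly match what your API requires, otherwise recognition will fail. Note that MediaRecorder has no WAV output format; to produce WAV, capture raw PCM with AudioRecord and add the WAV header yourself.
- Asynchronous processing: the RecognitionService callbacks run on the main thread, so network requests and blocking audio reads must be moved to a thread pool or coroutines to avoid ANRs.
- Permissions: on Android 6.0 and later, RECORD_AUDIO must be requested at runtime. If your API is served over plain HTTP, as in the example URL, Android 9 (API 28) and later block cleartext traffic by default, so either switch to HTTPS or allow the host in a network security configuration.
- Error callbacks: in every failure scenario (recording failure, network error, error response from the API), call Callback.error() with the matching SpeechRecognizer.ERROR_* code so the client is notified and can react.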
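To illustrate the last point, here is a small sketch of one possible policy for choosing the error code passed to Callback.error(). It is plain Java so that it can be run outside Android: the constants mirror the values of the same-named android.speech.SpeechRecognizer fields, while AsrErrorMapper, fromException, and the mapping itself are assumptions made for this example, not something the framework prescribes.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical helper: maps failures inside the service to SpeechRecognizer error codes.
final class AsrErrorMapper {
    // Copies of the android.speech.SpeechRecognizer constants with the same names,
    // duplicated here so the sketch compiles without the Android SDK.
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_CLIENT = 5;

    private AsrErrorMapper() {}

    /** Picks an error code for an exception (an assumed policy, adjust to taste). */
    static int fromException(Exception e) {
        // SocketTimeoutException is an IOException, so it must be tested first.
        if (e instanceof SocketTimeoutException) {
            return ERROR_NETWORK_TIMEOUT;
        }
        if (e instanceof IOException) {
            return ERROR_NETWORK;
        }
        // Assumption: recorder misuse surfaces as IllegalStateException.
        if (e instanceof IllegalStateException) {
            return ERROR_AUDIO;
        }
        return ERROR_CLIENT;
    }
}
```

Inside the service, a catch block could then call callback.error(AsrErrorMapper.fromException(e)), wrapped in a try/catch for RemoteException, so that each failure path reports exactly one error to the client.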
The question originates from Stack Exchange; asked by AKA.




