Can a Bash script call the Android TTS engine to generate audio files?

阿华AIGC实验室

2026-5-15

Yes, you can build a Bash script that uses Android's TTS engine to generate audio files, and your idea of using ADB to bridge the Android app and command-line tools like lame is sound. Here's a step-by-step breakdown:

Core Approach

Since Android doesn't ship with a native CLI tool for TTS, we'll use a tiny custom Android app to handle the TTS synthesis, then stream the synthesized audio from the device to lame (or another audio encoder) on your computer via ADB. The script will handle pushing the input text, launching the app, and converting the output to your desired audio format.

Step 1: Build a Minimal Android TTS App

Since you're familiar with Android development, this part should be straightforward. We'll create an app that reads the input text from a file on the device, uses TextToSpeech to synthesize it to an audio file (stock engines normally write a WAV file, that is, PCM samples behind a RIFF header), and saves it to a temporary location for ADB to access.

Here's the key code for the main Activity (adjust package names as needed):

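package com.example.ttscli;

import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import android.speech.tts.UtteranceProgressListener;
import android.util.Log;
import androidx.appcompat.app.AppCompatActivity;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class TtsCliActivity extends AppCompatActivity implements TextToSpeech.OnInitListener {
    private static final String TAG = "TTSCli";
    private static final String UTTERANCE_ID = "tts-cli-utterance";

    private TextToSpeech tts;
    private String inputFilePath;
    // The app must be able to create this file. If it cannot, switch to an
    // app-specific directory (for example getExternalFilesDir(null)).
    private final String outputFilePath = "/data/local/tmp/tts_output.pcm";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        inputFilePath = getIntent().getStringExtra("input");
        if (inputFilePath == null) {
            Log.e(TAG, "Missing 'input' extra");
            finish();
            return;
        }
        tts = new TextToSpeech(this, this);
    }

    @Override
    public void onInit(int status) {
        if (status != TextToSpeech.SUCCESS) {
            Log.e(TAG, "TTS engine initialization failed");
            cleanupAndExit();
            return;
        }

        // Set your desired language (adjust Locale as needed)
        int langResult = tts.setLanguage(Locale.US);
        if (langResult == TextToSpeech.LANG_MISSING_DATA || langResult == TextToSpeech.LANG_NOT_SUPPORTED) {
            Log.e(TAG, "Selected language not supported");
            cleanupAndExit();
            return;
        }

        // synthesizeToFile expects the text itself, not a path, so load the file first
        String text;
        try {
            text = readInputText();
        } catch (IOException e) {
            Log.e(TAG, "Could not read input file: " + inputFilePath, e);
            cleanupAndExit();
            return;
        }

        // Exit when the engine reports completion instead of sleeping for a fixed time.
        // Callbacks arrive on a background thread, so hop back to the UI thread.
        tts.setOnUtteranceProgressListener(new UtteranceProgressListener() {
            @Override
            public void onStart(String utteranceId) { }

            @Override
            public void onDone(String utteranceId) {
                Log.i(TAG, "Synthesis finished");
                runOnUiThread(() -> cleanupAndExit());
            }

            @Override
            public void onError(String utteranceId) {
                Log.e(TAG, "Synthesis failed");
                runOnUiThread(() -> cleanupAndExit());
            }
        });

        // Synthesize to a file (API 21+). Text longer than
        // TextToSpeech.getMaxSpeechInputLength() characters is rejected.
        int synthResult = tts.synthesizeToFile(text, null, new File(outputFilePath), UTTERANCE_ID);
        if (synthResult == TextToSpeech.ERROR) {
            Log.e(TAG, "Failed to queue synthesis request");
            cleanupAndExit();
        }
    }

    // Read the whole input file as UTF-8 text
    private String readInputText() throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(inputFilePath), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    private void cleanupAndExit() {
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
        finish();
    }
}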

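No storage permission is needed for this setup. android.permission.READ_EXTERNAL_STORAGE governs shared storage only and has no effect on /data/local/tmp, so add the following line to your AndroidManifest.xml only if you move the files to shared storage:

<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />

(Note: Access to /data/local/tmp is controlled by file ownership and SELinux rather than by manifest permissions. On recent Android versions an app can typically read files that ADB pushed there but may not be able to create files in that directory. If synthesis fails for that reason, write the output to an app-specific directory such as the one returned by getExternalFilesDir(null) and change the output path in the script to match. Apps targeting Android 11 (API 30) or higher should also declare a <queries> element for the android.intent.action.TTS_SERVICE intent action so that the installed TTS engine is visible to the app.)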

Step 2: Write the Bash Script (`android-tts.sh`)

This script handles the command-line logic: validating inputs, pushing the text to the device, launching the TTS app, streaming the synthesized audio back, converting it to MP3, and cleaning up.

#!/bin/bash
set -euo pipefail

# Validate input arguments
if [ $# -ne 2 ]; then
    echo "Usage: $0 <text-file> <audio-file>" >&2
    echo "Example: $0 speech.txt output.mp3" >&2
    exit 1
fi

TEXT_FILE="$1"
AUDIO_FILE="$2"
DEVICE_INPUT="/data/local/tmp/tts_input.txt"
DEVICE_OUTPUT="/data/local/tmp/tts_output.pcm"

# Fail early if the text file is missing or unreadable
if [ ! -r "$TEXT_FILE" ]; then
    echo "Error: Cannot read text file: $TEXT_FILE" >&2
    exit 1
fi

# Check that a single authorized device is connected
if [ "$(adb get-state 2>/dev/null)" != "device" ]; then
    echo "Error: No Android device connected via ADB. Ensure USB debugging is enabled." >&2
    exit 1
fi

# Push input text file to device
echo "Pushing input text to device..."
adb push "$TEXT_FILE" "$DEVICE_INPUT"

# Launch TTS app to generate the audio file
echo "Generating audio with Android TTS..."
adb shell "am start -n com.example.ttscli/.TtsCliActivity --es input '$DEVICE_INPUT'"

# Wait for synthesis to complete (adjust sleep time based on text length)
sleep 3

# Convert to MP3 using lame.
# synthesizeToFile normally writes a WAV file (despite the .pcm name used here),
# so lame reads the sample rate and channel count from the header and the
# raw-input flags (-r, -s) are not needed.
# adb exec-out streams the bytes unmodified; plain "adb shell" can alter
# binary data on some setups.
echo "Converting audio to MP3..."
adb exec-out "cat '$DEVICE_OUTPUT'" | lame -b 128 - "$AUDIO_FILE"

# Clean up temporary files on device
echo "Cleaning up temporary files..."
adb shell "rm -f '$DEVICE_INPUT' '$DEVICE_OUTPUT'"

echo "Success! Audio saved to: $AUDIO_FILE"
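Because the encoder flags depend on whether the device wrote a WAV file or headerless PCM, it can help to inspect a local copy before converting. The helper below is a minimal sketch: is_wav_file is a hypothetical name, not part of adb or lame, and it only checks the RIFF/WAVE magic bytes at the start of the file.

```shell
# is_wav_file <path>
# Returns 0 if the file starts with a RIFF....WAVE header, 1 otherwise.
is_wav_file() {
    local file="$1" header
    # Hex dump of the first 12 bytes with all whitespace removed (24 hex digits)
    header="$(head -c 12 "$file" 2>/dev/null | od -An -tx1 | tr -d ' \n')"
    case "$header" in
        # Bytes 0-3 are "RIFF" (52494646); bytes 8-11 are "WAVE" (57415645)
        52494646????????57415645) return 0 ;;
        *) return 1 ;;
    esac
}
```

One way to use it is to fetch the file with adb pull into a temporary file, then pass the raw-input flags to lame only when is_wav_file returns non-zero.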

Step 3: Make It Work

Install the app: Build and install your TTS CLI app on your connected Android device (debug build is fine).
Set up dependencies: Ensure lame is installed on your computer (e.g., brew install lame on macOS, sudo apt install lame on Ubuntu).
Make the script executable: Run chmod +x android-tts.sh.
Test it: Run ./android-tts.sh your-text-file.txt output.mp3.
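The dependency check can also live in the script, so that a missing tool is reported before anything is pushed to the device. This is a minimal sketch; missing_commands is a hypothetical helper name introduced here for illustration.

```shell
# missing_commands <cmd>...
# Prints the name of every command that is not found on PATH, one per line.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_commands() {
    local cmd status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return "$status"
}
```

Near the top of android-tts.sh, a call such as missing_commands adb lame could be followed by an error message and exit 1 when it returns non-zero.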

Key Notes & Improvements

Adjust timing: The fixed sleep in the script is a quick-and-dirty way to wait for synthesis, and it fails silently when the text takes longer than expected. For longer texts, you could modify the Android app to send a broadcast or write a log line when synthesis finishes, then have the script wait for that signal via ADB.
TTS customization: In the Android code, you can adjust parameters like speech rate (tts.setSpeechRate(1.0f)), pitch, or select a different TTS engine.
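Alternative encoders: If you prefer WAV or OGG, replace lame with ffmpeg. With WAV input, ffmpeg -i - "$AUDIO_FILE" is enough, because ffmpeg picks the output format from the file extension. Headerless PCM would instead need explicit input flags such as -f s16le -ar 16000 -ac 1, with the sample rate matching what the engine actually produced.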
Error handling: The script includes basic error checking, but you could expand it to handle cases where the TTS app fails to generate audio.
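As a simpler alternative to the broadcast idea, the script can poll the device until the output file stops growing. This is a heuristic substitute, not a real completion signal: an engine that pauses mid-synthesis could fool it. The sketch below uses a hypothetical helper, wait_until_stable, which runs any command once per second and succeeds when two consecutive runs print the same non-empty output.

```shell
# wait_until_stable <timeout-seconds> <command> [args...]
# Runs the command about once per second. Returns 0 as soon as two consecutive
# runs print identical, non-empty output; returns 1 if the timeout expires first.
wait_until_stable() {
    local timeout="$1" previous="" current elapsed=0
    shift
    while [ "$elapsed" -lt "$timeout" ]; do
        # A failing command (for example, file not created yet) counts as empty output
        current="$("$@" 2>/dev/null || true)"
        if [ -n "$current" ] && [ "$current" = "$previous" ]; then
            return 0
        fi
        previous="$current"
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In android-tts.sh this could replace the fixed sleep with something like wait_until_stable 60 adb shell stat -c %s "$DEVICE_OUTPUT", assuming the device's stat supports the -c option.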

The question in this post comes from Stack Exchange; it was asked by ka3ak.