How to Use Client Libraries on a Website to Call the Google Cloud Text-to-Speech API for Text-to-Speech

Ahua AIGC Lab (阿华AIGC实验室)

2026-4-27

How to Use Google Cloud Text-to-Speech API in Your Website Without Web Speech API

You don’t need to install client libraries through your web host’s control panel the way you would for a local app. Instead, use a secure backend proxy (the production-safe approach) or call the REST API directly (for testing only, because of the security risks). Here’s how to do both, with practical examples:

First, Lay the Google Cloud Groundwork

Before touching your website, complete these setup steps in the Google Cloud Console:

Create a new project (or use an existing one)
Enable the Cloud Text-to-Speech API for your project
Create a service account, generate a JSON key file, and download it (keep this file locked down—never share it publicly)
Assign the Text-to-Speech Editor role (or equivalent) to your service account

Option 1: Backend Proxy (Recommended for Production)

This is the right way to go because it keeps your Google Cloud credentials hidden from the frontend. You’ll build a simple backend endpoint that your frontend calls; the backend uses the Google Cloud client library to talk to the TTS API, then sends the audio back to your site.

Example for Node.js Backend

If your host supports Node.js (check your hosting plan, since not every shared host does):

Install the client library and Express (the example endpoint below depends on both): Use SSH access to your host (or your control panel’s built-in terminal, like cPanel’s Terminal) to run:
```
npm install express @google-cloud/text-to-speech
```

Write your endpoint (using Express as an example):

const express = require('express');
const { TextToSpeechClient } = require('@google-cloud/text-to-speech');
const app = express();
app.use(express.json());

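// With no options, the client reads the key file path from the
// GOOGLE_APPLICATION_CREDENTIALS environment variable, so nothing is hardcoded here.
// Set that variable to the key's location, e.g. ./secure-directory/your-service-account-key.json
const client = new TextToSpeechClient();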

app.post('/convert-text-to-speech', async (req, res) => {
  const { text } = req.body;
  if (!text) return res.status(400).send('No text provided');

  const request = {
    input: { text: text },
    voice: { languageCode: 'en-US', ssmlGender: 'NEUTRAL' },
    audioConfig: { audioEncoding: 'MP3' },
  };

  try {
    const [response] = await client.synthesizeSpeech(request);
    res.set('Content-Type', 'audio/mpeg');
    res.send(response.audioContent);
  } catch (err) {
    console.error('API Error:', err);
    res.status(500).send('Failed to generate audio');
  }
});

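// Many hosts assign the port through the PORT environment variable; 3000 is a local fallback
const port = process.env.PORT || 3000;
app.listen(port, () => console.log(`TTS proxy running on port ${port}`));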
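
The endpoint above only checks that some text was sent. A minimal sketch of stricter validation follows, assuming a cap of 5000 bytes per request (confirm the number against the current Text-to-Speech quotas before relying on it). `validateTtsText` and `MAX_INPUT_BYTES` are hypothetical names for illustration, not part of the client library.

```javascript
// Hypothetical helper, not part of @google-cloud/text-to-speech.
// Assumed per-request input cap; confirm against the current API quotas.
const MAX_INPUT_BYTES = 5000;

function validateTtsText(text, maxBytes = MAX_INPUT_BYTES) {
  if (typeof text !== 'string') {
    return { ok: false, error: 'Text must be a string' };
  }
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return { ok: false, error: 'No text provided' };
  }
  // Measure UTF-8 bytes rather than characters:
  // non-ASCII text takes more than one byte per character.
  const byteLength = new TextEncoder().encode(trimmed).length;
  if (byteLength > maxBytes) {
    return { ok: false, error: `Text is ${byteLength} bytes; limit is ${maxBytes}` };
  }
  return { ok: true, text: trimmed };
}
```

Inside the route handler this could replace the simple `!text` check: call `validateTtsText(req.body.text)`, return a 400 with `error` when `ok` is false, and otherwise pass `text` on to `synthesizeSpeech`.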

Deploy the backend: Upload the files to your host via FTP/Git, then start the server following your host’s Node.js deployment docs.

Frontend code to trigger conversion:

const textInput = document.getElementById('text-input');
const convertBtn = document.getElementById('convert-btn');
const audioPlayer = document.getElementById('audio-player');

convertBtn.addEventListener('click', async () => {
  const text = textInput.value.trim();
  if (!text) return;

  try {
    const response = await fetch('/convert-text-to-speech', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: text })
    });

    if (!response.ok) throw new Error('Request failed');
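    const audioBlob = await response.blob();
    // Free the previous clip's object URL so repeated conversions don't leak memory
    if (audioPlayer.src.startsWith('blob:')) URL.revokeObjectURL(audioPlayer.src);
    audioPlayer.src = URL.createObjectURL(audioBlob);
    // play() returns a promise; awaiting it lets the catch block report playback errors
    await audioPlayer.play();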
  } catch (err) {
    alert('Oops, failed to generate audio: ' + err.message);
  }
});

Example for PHP Backend

If your host uses PHP:

Install the client library: Run this via SSH/control panel terminal:
```
composer require google/cloud-text-to-speech
```

Write your endpoint (e.g., convert.php):

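<?php
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\TextToSpeech\V1\AudioConfig;
use Google\Cloud\TextToSpeech\V1\AudioEncoding;
use Google\Cloud\TextToSpeech\V1\Client\TextToSpeechClient;
use Google\Cloud\TextToSpeech\V1\SsmlVoiceGender;
use Google\Cloud\TextToSpeech\V1\SynthesisInput;
use Google\Cloud\TextToSpeech\V1\SynthesizeSpeechRequest;
use Google\Cloud\TextToSpeech\V1\VoiceSelectionParams;

// Point the client at the service account key. Keep this directory outside the
// web root, or block web access to it, so the key can never be downloaded.
putenv('GOOGLE_APPLICATION_CREDENTIALS=' . __DIR__ . '/secure-directory/your-service-account-key.json');

// Read the JSON body and make sure it carries a non-empty "text" string
$input = json_decode(file_get_contents('php://input'), true);
$text = is_array($input) ? ($input['text'] ?? null) : null;
if (!is_string($text) || trim($text) === '') {
  http_response_code(400);
  header('Content-Type: application/json');
  echo json_encode(['error' => 'No text provided']);
  exit;
}

try {
  $client = new TextToSpeechClient();
  $synthesisInput = new SynthesisInput();
  $synthesisInput->setText($text);

  $voice = new VoiceSelectionParams();
  $voice->setLanguageCode('en-US');
  $voice->setSsmlGender(SsmlVoiceGender::NEUTRAL);

  $audioConfig = new AudioConfig();
  $audioConfig->setAudioEncoding(AudioEncoding::MP3);

  // Current releases of the library take a single request object. Older 1.x
  // releases used synthesizeSpeech($synthesisInput, $voice, $audioConfig).
  $request = (new SynthesizeSpeechRequest())
    ->setInput($synthesisInput)
    ->setVoice($voice)
    ->setAudioConfig($audioConfig);
  $response = $client->synthesizeSpeech($request);
  $audioContent = $response->getAudioContent();

  // Send audio back as MP3
  header('Content-Type: audio/mpeg');
  echo $audioContent;
} catch (Exception $e) {
  // Log the details server-side; don't leak internal error messages to visitors
  error_log('TTS error: ' . $e->getMessage());
  http_response_code(500);
  header('Content-Type: application/json');
  echo json_encode(['error' => 'Failed to convert text']);
}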

The frontend code is the same as in the Node.js example; just point your fetch call at convert.php.
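
Since both backends accept the same JSON body, switching between them only changes the URL. A small sketch of that idea follows; `buildTtsRequest` is a hypothetical helper introduced here for illustration, and the two endpoint paths are the ones used in the examples above.

```javascript
// Hypothetical helper: builds the arguments for fetch() so the same
// frontend code can target either backend by changing only the endpoint.
function buildTtsRequest(endpoint, text) {
  const trimmed = String(text).trim();
  if (!trimmed) throw new Error('No text provided');
  return [
    endpoint,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: trimmed }),
    },
  ];
}

// Both backends in this article read a JSON body with a "text" field.
const nodeArgs = buildTtsRequest('/convert-text-to-speech', 'Hello');
const phpArgs = buildTtsRequest('/convert.php', 'Hello');
```

In the click handler, the call would then look like `fetch(...buildTtsRequest('/convert.php', textInput.value))`, with the rest of the handler unchanged.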

Option 2: Direct REST API Call (Testing Only!)

You can call the TTS REST API directly from your frontend, but never do this in production: anything the browser sends, including a short-lived access token, is visible to every visitor and can be stolen. For testing purposes:

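Generate a short-lived access token using your service account key. With the gcloud CLI on your own machine, for example, activate the service account with gcloud auth activate-service-account and then run gcloud auth print-access-token; a quick script works too.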

Use fetch to call the API:

async function convertTextForTesting(text) {
  const accessToken = 'YOUR_SHORT_LIVED_TOKEN';
  const response = await fetch('https://texttospeech.googleapis.com/v1/text:synthesize', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      input: { text: text },
      voice: { languageCode: 'en-US', ssmlGender: 'NEUTRAL' },
      audioConfig: { audioEncoding: 'MP3' }
    })
  });
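  // Fail early on HTTP errors; an error response has no audioContent field to decode
  if (!response.ok) throw new Error(`TTS request failed with status ${response.status}`);
  const data = await response.json();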
  const audioBlob = new Blob([Uint8Array.from(atob(data.audioContent), c => c.charCodeAt(0))], { type: 'audio/mpeg' });
  return URL.createObjectURL(audioBlob);
}
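
The REST response carries the audio as a base64 string in `audioContent`, which is why the function above decodes it before building the Blob. The decoding step on its own can be sketched as follows; `decodeAudioContent` is a hypothetical name, and the sample string is a made-up three-byte input rather than real API output.

```javascript
// Decode a base64 "audioContent" string into raw bytes.
// atob() is available in browsers and as a global in recent Node.js versions.
function decodeAudioContent(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "SUQz" is the base64 encoding of the ASCII bytes "ID3",
// the tag that begins many MP3 files.
const sample = decodeAudioContent('SUQz');
console.log(Array.from(sample)); // [ 73, 68, 51 ]
```

The resulting `Uint8Array` can be passed straight to `new Blob([bytes], { type: 'audio/mpeg' })`.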

Key Host Setup Tips

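Installing libraries: Most hosts let you run npm or composer commands over SSH. If you don’t have SSH access, install the dependencies on your own machine and upload the resulting node_modules or vendor directory through your control panel’s File Manager (running the package manager on the host is easier when it is available).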
Credential security: Store your service account key in a directory that’s not accessible via the web. Many hosts let you set environment variables in the control panel to load the key without hardcoding it.
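
As one way to enforce the credential advice in a Node.js backend, the sketch below reads the key path from the environment and refuses a location inside the public web root. `resolveKeyPath` and its parameters are hypothetical names; `GOOGLE_APPLICATION_CREDENTIALS` is the variable the Google client libraries read by default, and the example directories are assumptions about a typical host layout.

```javascript
const path = require('node:path');

// Hypothetical helper: resolves the key file path from the environment and
// rejects a path that sits inside the public web root.
function resolveKeyPath(env, webRoot) {
  const keyPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!keyPath) {
    throw new Error('GOOGLE_APPLICATION_CREDENTIALS is not set');
  }
  const resolvedKey = path.resolve(keyPath);
  const resolvedRoot = path.resolve(webRoot);
  // If the relative path does not climb out of webRoot with "..",
  // the key file is somewhere under the web root.
  const relative = path.relative(resolvedRoot, resolvedKey);
  const outsideRoot =
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (!outsideRoot) {
    throw new Error('Key file must live outside the web root');
  }
  return resolvedKey;
}

// Example layout (assumed): the site is served from public_html, the key sits beside it.
const keyFile = resolveKeyPath(
  { GOOGLE_APPLICATION_CREDENTIALS: '/home/site/keys/key.json' },
  '/home/site/public_html'
);
```

Running a check like this at startup, with `process.env` as the first argument, makes a misconfigured deployment fail immediately instead of on the first request.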

The question behind this article came from Stack Exchange; it was asked by VKK.