Can a readable stream be used as microphone input in Puppeteer?

Ahua AIGC Lab

2026-5-9

How to Route a Readable Stream as Microphone Input in Puppeteer

Yes, you can route a readable stream as microphone input in Puppeteer, though it requires combining Puppeteer's browser control with the Web Audio API. Here is how to do it:

Core Idea

To make this work, we need to:

Create a custom MediaStream in the browser that feeds audio data from your Node.js readable stream.
Override the browser's default getUserMedia behavior so pages use our custom stream instead of a real microphone.

Step-by-Step Implementation

1. Set Up Puppeteer with Required Flags

First, launch Puppeteer with flags that bypass real media device checks and permission prompts, so the script can run unattended:

const browser = await puppeteer.launch({
  headless: false, // Audio APIs are often restricted in headless mode
  args: [
    '--use-fake-ui-for-media-stream', // Skip permission popups
    '--use-fake-device-for-media-stream' // Use fake media devices by default
  ]
});

2. Inject Custom Audio Stream Logic into the Page

We'll build the custom media stream inside the page and override getUserMedia to return it for audio requests. The stream has to be created in the browser context: a MediaStream cannot be passed through page.exposeFunction or page.evaluate, and a stream's id is not a deviceId that getUserMedia can look up.

const puppeteer = require('puppeteer');
const { Readable } = require('stream');

async function run() {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--use-fake-ui-for-media-stream', '--use-fake-device-for-media-stream']
  });
  const page = await browser.newPage();

  // Example source: a readable stream of sine wave samples (replace with your actual stream).
  // This demo does not feed it to the page yet; step 3 covers the transfer.
  const audioStream = Readable.from(generateSineWavePCM(44100, 5));

  // Runs in every new document before the page's own scripts
  await page.evaluateOnNewDocument(() => {
    if (!navigator.mediaDevices) return; // Not available in insecure contexts

    // Keep a bound reference: the unbound method loses its `this` and fails when called
    const originalGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
    let customStream = null;

    function buildCustomAudioStream() {
      const audioContext = new AudioContext({ sampleRate: 44100 });
      const mediaStreamDest = audioContext.createMediaStreamDestination();
      // ScriptProcessorNode is deprecated; AudioWorklet is the modern replacement
      const scriptProcessor = audioContext.createScriptProcessor(4096, 1, 1);
      let sampleIndex = 0; // Running sample counter keeps the phase continuous across buffers

      // In production, replace this with logic to receive PCM chunks from your Node.js stream
      scriptProcessor.onaudioprocess = (e) => {
        const outputData = e.outputBuffer.getChannelData(0);
        for (let i = 0; i < outputData.length; i++) {
          // Generate a 440Hz sine wave for demonstration
          outputData[i] = Math.sin(2 * Math.PI * 440 * sampleIndex / audioContext.sampleRate);
          sampleIndex++;
        }
      };

      scriptProcessor.connect(mediaStreamDest);
      // Not awaited: under Chrome's autoplay policy the context may stay suspended until
      // a user gesture. The --autoplay-policy=no-user-gesture-required flag may help.
      audioContext.resume().catch(() => {});

      return mediaStreamDest.stream;
    }

    // Override getUserMedia to return our custom stream for audio requests
    navigator.mediaDevices.getUserMedia = async (constraints) => {
      if (constraints && constraints.audio) {
        // Build the stream once and reuse it. A request for audio plus video
        // receives only the audio track in this sketch.
        if (!customStream) customStream = buildCustomAudioStream();
        return customStream;
      }
      // Fall back to the original behavior for video-only requests
      return originalGetUserMedia(constraints);
    };
  });

  // Navigate to a page that uses microphone input (e.g., a voice recorder tool)
  await page.goto('https://example.com/voice-recorder');

  // Keep the browser open for testing; uncomment to close when done
  // await browser.close();
}

// Helper: Generate samples for a 440Hz sine wave (5 seconds duration)
function generateSineWavePCM(sampleRate, duration) {
  const numSamples = sampleRate * duration;
  const pcmData = [];
  for (let i = 0; i < numSamples; i++) {
    // Float samples in the range -1 to 1, the format Web Audio buffers expect
    pcmData.push(Math.sin(2 * Math.PI * 440 * i / sampleRate));
  }
  return pcmData;
}

run().catch(console.error);

3. Adapt for Your Real Readable Stream

The example generates a sine wave inside the page. To feed it from your actual stream, change the following:

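Transfer Data Chunks: Move PCM chunks from your Node.js stream to the browser with page.exposeFunction (the page calls into Node.js to pull the next chunk), page.evaluate (Node.js pushes a chunk as an argument), or a WebSocket. Arguments passed through Puppeteer must be serializable, so send plain arrays of numbers rather than typed arrays.
Match Audio Format: Web Audio buffers hold 32-bit float samples in the range -1 to 1, so convert 16-bit integer PCM before writing it. The stream's sample rate must also match the AudioContext sample rate (typically 44100 or 48000 Hz); otherwise resample, or the audio plays at the wrong speed and pitch.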
Stream Handling: Replace the sine wave generation in onaudioprocess with logic to write incoming PCM chunks to the output buffer.
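
As a rough illustration of the format conversion and buffering described above, here is a minimal sketch. The names int16ToFloat32 and SampleQueue are hypothetical helpers, not part of Puppeteer or the Web Audio API, and the sketch assumes 16-bit little-endian mono PCM. The conversion would run in Node.js (it uses Buffer), while a copy of the queue class would need to be defined inside the page.

```javascript
// Convert a Buffer of 16-bit little-endian signed PCM into Float32 samples in [-1, 1).
function int16ToFloat32(buffer) {
  const sampleCount = Math.floor(buffer.length / 2);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = buffer.readInt16LE(i * 2) / 32768;
  }
  return out;
}

// FIFO of float samples. fill() copies queued samples into an output buffer
// and pads with silence when the queue runs dry (an underrun).
class SampleQueue {
  constructor() {
    this.chunks = []; // queued Float32Array chunks, oldest first
    this.offset = 0;  // read position inside the oldest chunk
  }

  // Accepts a plain array or a typed array of float samples.
  push(samples) {
    if (samples.length > 0) {
      this.chunks.push(Float32Array.from(samples));
    }
  }

  // Fills `output` (a Float32Array) and returns how many real samples were written.
  fill(output) {
    let written = 0;
    while (written < output.length && this.chunks.length > 0) {
      const head = this.chunks[0];
      const n = Math.min(head.length - this.offset, output.length - written);
      output.set(head.subarray(this.offset, this.offset + n), written);
      written += n;
      this.offset += n;
      if (this.offset === head.length) {
        this.chunks.shift();
        this.offset = 0;
      }
    }
    output.fill(0, written); // silence for whatever could not be filled
    return written;
  }
}
```

With something like this in place, the onaudioprocess handler could call queue.fill(e.outputBuffer.getChannelData(0)) instead of generating a sine wave, and Node.js could push each converted chunk with page.evaluate, passing Array.from(samples) as the argument to a page-side function that calls queue.push.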

Key Notes

Headless Limitations: Audio and media capture can behave differently in headless mode depending on the Chrome version, so start with headless: false for testing and debugging.
Resource Cleanup: Remember to close the AudioContext and stream when your script finishes to avoid memory leaks.
Permission Bypass: The --use-fake-ui-for-media-stream flag is essential to skip browser permission prompts during automation.
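
The cleanup note above could be handled along these lines. This is only a sketch: shutdown is a hypothetical helper, and window.__customAudio is a hypothetical global. It assumes the injected script stored an object of the form { context, stream } there when it built the custom stream, which the example code does not do by default.

```javascript
// Tear down in order: stop the Node.js source, release in-page audio, close the browser.
async function shutdown({ audioStream, page, browser }) {
  if (audioStream && !audioStream.destroyed) {
    audioStream.destroy(); // stop reading and release the underlying source
  }
  try {
    await page.evaluate(async () => {
      // Hypothetical global; assumed to be set by the injected script
      const audio = window.__customAudio;
      if (!audio) return;
      audio.stream.getTracks().forEach((track) => track.stop());
      await audio.context.close();
    });
  } finally {
    // Runs even if the in-page cleanup throws, e.g. because the page is already gone
    await browser.close();
  }
}
```

Calling await shutdown({ audioStream, page, browser }) at the end of run() would replace the commented-out browser.close() call.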

The question originally comes from Stack Exchange; it was asked by user12870226.