How can I replay user events recorded by a Chrome extension with Puppeteer and generate HAR files?
Hey there! Let's walk through how to implement replay for your recorded user actions and generate HAR files for each request during the process. I'll break this down into actionable steps based on your setup.
First, let's map out the big picture:
- Fetch all recorded events for a specific `recording_id` from MongoDB, sorted by their `sequence` number to maintain execution order
- Replay each action (click, input, etc.) in a controlled browser environment
- Capture all network requests during replay and generate HAR files (either per-request or a full session HAR)
Start by pulling the events tied to a recording and sorting them correctly. Here's a quick Node.js example using the MongoDB driver:
    const { MongoClient } = require('mongodb');

    async function getSortedRecordingEvents(recordingId) {
      const client = await MongoClient.connect('your-mongodb-connection-string');
      try {
        const db = client.db('your-database-name');
        // Fetch events sorted by sequence to ensure correct execution order
        return await db.collection('your-events-collection')
          .find({ recording_id: recordingId })
          .sort({ sequence: 1 })
          .toArray();
      } finally {
        // Close the connection even if the query throws
        await client.close();
      }
    }

    // Don't forget to fetch the starting URL tied to the recording
    async function getRecordingStartUrl(recordingId) {
      const client = await MongoClient.connect('your-mongodb-connection-string');
      try {
        const db = client.db('your-database-name');
        // If _id is stored as an ObjectId, convert recordingId before querying
        const recording = await db.collection('your-recordings-collection')
          .findOne({ _id: recordingId });
        if (!recording) {
          throw new Error(`No recording found for id ${recordingId}`);
        }
        return recording.start_url;
      } finally {
        await client.close();
      }
    }
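Sorting alone does not reveal gaps or duplicates in `sequence`, and either one can silently change what gets replayed. Here is a minimal sketch of a sanity check, assuming each event document carries a numeric `sequence` field as in the query above. `checkSequence` is a hypothetical helper name, and whether your recorder is supposed to produce contiguous numbers is an assumption you should verify:

```javascript
// Hypothetical helper: reports duplicate or missing sequence numbers
// in an array of events that is already sorted by `sequence`.
function checkSequence(events) {
  const problems = [];
  for (let i = 1; i < events.length; i++) {
    const prev = events[i - 1].sequence;
    const curr = events[i].sequence;
    if (curr === prev) {
      problems.push(`duplicate sequence ${curr}`);
    } else if (curr !== prev + 1) {
      problems.push(`gap between ${prev} and ${curr}`);
    }
  }
  return problems;
}

// Example with made-up events: one number is missing between 2 and 4
console.log(checkSequence([{ sequence: 1 }, { sequence: 2 }, { sequence: 4 }]));
```

You could call this right after fetching the events and abort the replay (or just log a warning) when the returned array is not empty.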
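For reliable browser automation and network capture, I recommend Playwright or Puppeteer. Both are built for this use case, and Playwright's actions additionally auto-wait for elements to become actionable. Let's use Playwright for this example; note that the `recordHar` option used here is a Playwright feature, and Puppeteer has no built-in HAR recorder.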
3.1 Initialize Browser & Load Starting URL
    const { chromium } = require('playwright');

    async function replayRecording(recordingId) {
      // Fetch events and start URL in parallel
      const [events, startUrl] = await Promise.all([
        getSortedRecordingEvents(recordingId),
        getRecordingStartUrl(recordingId)
      ]);

      // Launch browser (set headless: true for production)
      const browser = await chromium.launch({ headless: false });

      // Enable HAR recording at the context level (we'll tweak this later for per-request HARs)
      const context = await browser.newContext({
        recordHar: {
          path: `full-session-${recordingId}.har`, // Full session HAR
          omitContent: false // Set to true if you don't need request/response bodies
        }
      });
      const page = await context.newPage();

      // Load the starting URL and wait for network to settle
      await page.goto(startUrl, { waitUntil: 'networkidle' });
3.2 Execute Each Recorded Event
Loop through the sorted events and run the corresponding action:
      for (const event of events) {
        switch (event.command) {
          case 'click':
            // Wait for the target element to exist before clicking to avoid errors
            await page.waitForSelector(event.target);
            await page.click(event.target);
            break;
          case 'input':
            await page.waitForSelector(event.target);
            // Make sure your recorded input events include a `value` field with the typed text
            await page.fill(event.target, event.value);
            break;
          // Add cases for other commands (e.g., hover, submit) as needed
          default:
            console.warn(`Skipping unsupported command: ${event.command}`);
        }
        // Add a small delay to mimic real user pacing
        await page.waitForTimeout(500);
      }

      // Cleanup: the HAR file is only written when the context is closed
      await context.close();
      await browser.close();
    }
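As more commands are added, the `switch` can be swapped for a lookup table so that each command lives in its own small function. The following is only a sketch of that idea: `handlers` and `runEvents` are hypothetical names, the `page` argument is assumed to expose Playwright's `waitForSelector`, `click`, and `fill`, and the events are assumed to have the same `command`, `target`, and `value` fields used in the loop.

```javascript
// Hypothetical command-to-handler map. A Map avoids accidental matches
// on inherited object properties such as "toString".
const handlers = new Map([
  ['click', async (page, event) => {
    await page.waitForSelector(event.target);
    await page.click(event.target);
  }],
  ['input', async (page, event) => {
    await page.waitForSelector(event.target);
    // Fall back to an empty string if the recorder stored no value
    await page.fill(event.target, event.value ?? '');
  }],
]);

// Runs events in order and returns the commands that had no handler.
async function runEvents(page, events) {
  const skipped = [];
  for (const event of events) {
    const handler = handlers.get(event.command);
    if (!handler) {
      skipped.push(event.command);
      continue;
    }
    await handler(page, event);
  }
  return skipped;
}
```

Returning the skipped commands instead of only logging them makes it easier to see, after a run, which event types the recorder emits that the replayer does not yet cover.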
If you need a separate HAR file for each individual request (instead of a full session), you can manually capture request/response details and write them to files:
    // Add this inside the replayRecording function, right after creating the page
    const fs = require('fs');
    let requestCounter = 0;

    // HAR stores headers as an array of { name, value } pairs
    const toPairs = (obj) => Object.entries(obj).map(([name, value]) => ({ name, value }));

    page.on('response', async (response) => {
      const request = response.request();
      // timing.startTime is epoch milliseconds; responseStart is relative to it (-1 if unavailable)
      const timing = request.timing();
      const wait = Math.max(timing.responseStart, 0);

      // body() can throw (for example on redirect responses), so fall back to size 0
      let bodySize = 0;
      try {
        bodySize = (await response.body()).length;
      } catch (err) {
        bodySize = 0;
      }

      // Build a minimal HAR entry for this request
      const harEntry = {
        startedDateTime: new Date(timing.startTime).toISOString(),
        time: wait,
        request: {
          method: request.method(),
          url: request.url(),
          headers: toPairs(request.headers()),
          queryString: Array.from(new URL(request.url()).searchParams, ([name, value]) => ({ name, value }))
        },
        response: {
          status: response.status(),
          statusText: response.statusText(),
          headers: toPairs(response.headers()),
          content: {
            size: bodySize,
            mimeType: response.headers()['content-type'] || ''
          }
        },
        timings: { wait, blocked: -1, dns: -1, connect: -1, send: 0, receive: 0, ssl: -1 }
      };

      // Wrap the entry in a HAR structure
      const harFileContent = {
        log: {
          version: '1.2',
          creator: { name: 'Action Replay Tool', version: '1.0' },
          entries: [harEntry]
        }
      };

      // Write to a unique file
      fs.writeFileSync(`request-${++requestCounter}-${Date.now()}.har`, JSON.stringify(harFileContent, null, 2));
    });
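An alternative to hand-building entries is to let Playwright write the full-session HAR and split it afterwards, since every element of `log.entries` is already a complete entry. Here is a rough sketch, assuming the session file follows the standard HAR 1.2 layout with a `log.entries` array. `splitHar` and `writeSplitHar` are hypothetical helpers, and the session file is only complete once the browser context has been closed:

```javascript
const fs = require('fs');

// Hypothetical helper: turns one HAR object into an array of HAR objects,
// each holding a single entry plus the original log metadata (version, creator, ...).
function splitHar(har) {
  const { entries = [], ...meta } = har.log;
  return entries.map((entry) => ({ log: { ...meta, entries: [entry] } }));
}

// Writes one file per entry; `sessionPath` and `outDir` are supplied by the caller.
function writeSplitHar(sessionPath, outDir) {
  const har = JSON.parse(fs.readFileSync(sessionPath, 'utf8'));
  fs.mkdirSync(outDir, { recursive: true });
  splitHar(har).forEach((single, i) => {
    fs.writeFileSync(`${outDir}/request-${i + 1}.har`, JSON.stringify(single, null, 2));
  });
}
```

This keeps the timing and header data exactly as Playwright recorded it, at the cost of producing the per-request files only after the replay has finished.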
- Element Stability: CSS selectors like `button.btn-sm` can break if the page structure changes. Consider adding `data-testid` attributes to key elements during recording for more reliable targeting.
- Wait Strategies: Always use `waitForSelector` or a network-idle wait (`page.waitForLoadState('networkidle')` in Playwright, `page.waitForNetworkIdle()` in Puppeteer) before actions. Never assume elements are immediately available.
- Input Event Data: Double-check that your recorded input events store the `value` of what was typed (your example cut off, but this is critical for replay).
- HAR Content: If you don't need request/response bodies, set `omitContent: true` to reduce file size and improve performance.
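The selector tip can be sketched as a small fallback rule: prefer a recorded test id when there is one, otherwise use the recorded CSS selector. This assumes the recorder stores an optional `testId` field on each event. That field name is hypothetical and is not part of the schema shown in the question; `selectorFor` is likewise just an illustrative helper.

```javascript
// Hypothetical helper: builds the selector to use for an event.
// Assumes an optional `testId` field alongside the recorded `target`.
function selectorFor(event) {
  if (event.testId) {
    // Escape backslashes and double quotes so the attribute value stays valid
    const escaped = String(event.testId).replace(/["\\]/g, '\\$&');
    return `[data-testid="${escaped}"]`;
  }
  return event.target;
}

// Example with made-up events
console.log(selectorFor({ testId: 'save-button', target: 'button.btn-sm' }));
console.log(selectorFor({ target: 'button.btn-sm' }));
```

In the replay loop, you would then pass `selectorFor(event)` wherever `event.target` is currently used.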
This question originates from Stack Exchange; it was asked by Atul Singh.




