How do I loop over multiple links with page.goto(url) in Puppeteer?

阿华AIGC实验室

2026-5-13

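Sure, here is how to adjust your code so it loops over these sequentially numbered links.

First, restructure the original code so the browser is launched and closed outside the loop. Restarting the browser on every iteration is slow and unnecessary. Then build each URL inside the loop. Two common ways to do this follow:

Approach 1: loop a fixed number of times (for example, c-0 through c-9)

If you know how many links you need to visit, use a plain for loop: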

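const puppeteer = require("puppeteer");

(async () => {
  // Launch the browser once and reuse the same page for every link
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Visit c-0 through c-9; adjust the loop bounds as needed
  for (let i = 0; i < 10; i++) {
    try {
      // "url" is the placeholder from the question; substitute the real
      // base URL, including the protocol (https://...)
      const targetUrl = `url/c-${i}`;
      await page.goto(targetUrl);

      // Put a timeout on the wait so one slow page cannot hang the script
      await page.waitForSelector(".box-chap", { timeout: 5000 });

      const element = await page.$(".box-chap");
      const content = await page.evaluate(el => el.textContent, element);
      console.log(`${content} chapter ${i}`);
    } catch (error) {
      // A failure on one link is logged and the loop moves on to the next
      console.error(`Error while processing c-${i}:`, error.message);
    }
  }

  // Close the browser after all links have been processed
  await browser.close();
})();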
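
One way to make the loop body above easier to reuse and test is to move it into a function that receives the page as a parameter. The following is only a sketch: the names scrapeText and scrapeAll are hypothetical and not part of the original answer, and it uses page.$eval in place of page.$ followed by page.evaluate, which reads the same text in a single call.

```javascript
// Hypothetical helper: visit one URL and return the text content of the
// first element matching `selector`.
async function scrapeText(page, url, selector, timeout = 5000) {
  await page.goto(url);
  // Rejects if the element does not appear within `timeout` milliseconds.
  await page.waitForSelector(selector, { timeout });
  // $eval runs the callback in the page against the first matching element.
  return page.$eval(selector, (el) => el.textContent);
}

// Hypothetical helper: visit every URL in order and record either the
// scraped text or the error message, so one bad page does not stop the run.
async function scrapeAll(page, urls, selector) {
  const results = [];
  for (const url of urls) {
    try {
      results.push({ url, text: await scrapeText(page, url, selector) });
    } catch (error) {
      results.push({ url, error: error.message });
    }
  }
  return results;
}
```

With a real Puppeteer page this could be called as scrapeAll(page, urls, ".box-chap"), where urls is built with something like Array.from({ length: 10 }, (_, i) => `url/c-${i}`) and "url" is still the placeholder from the question.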

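Approach 2: loop automatically until the page no longer exists

If you don't know how many iterations you need and want to keep going until the target page can't be found (for example, the .box-chap element never appears), use a while loop: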

const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  let currentIndex = 0;
  let continueLoop = true;

  while (continueLoop) {
    try {
      // "url" is the placeholder from the question; substitute the real
      // base URL, including the protocol (https://...)
      const targetUrl = `url/c-${currentIndex}`;
      await page.goto(targetUrl);

      // Wait for the target element; a timeout is treated as "page does not exist"
      await page.waitForSelector(".box-chap", { timeout: 5000 });

      const element = await page.$(".box-chap");
      const content = await page.evaluate(el => el.textContent, element);
      console.log(`${content} chapter ${currentIndex}`);

      // Content retrieved successfully, move on to the next link
      currentIndex++;
    } catch (error) {
      console.log(`c-${currentIndex} does not exist or failed to load; stopping the loop`);
      continueLoop = false;
    }
  }

  await browser.close();
})();
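
Approach 2 has no upper bound on the number of pages, and it waits out the full selector timeout even when the server has already reported that the page is missing. The sketch below addresses both points by adding a page cap and checking the HTTP status that page.goto resolves with. It is an illustration under assumptions: crawlUntilMissing, makeUrl, and maxPages are hypothetical names, and it assumes the site serves a missing chapter with a non-2xx status, which not every site does.

```javascript
// Hypothetical helper (not from the original answer). `makeUrl` maps an
// index to a URL; `maxPages` is a safety cap so the loop always ends.
async function crawlUntilMissing(page, makeUrl, selector, options = {}) {
  const { maxPages = 1000, timeout = 5000 } = options;
  const chapters = [];
  let reason = "maxPages reached";

  for (let index = 0; index < maxPages; index++) {
    try {
      // page.goto resolves with the main response (or null in some cases).
      const response = await page.goto(makeUrl(index));
      if (response && !response.ok()) {
        // Assumption: a missing chapter is served with a non-2xx status.
        reason = `HTTP ${response.status()} at index ${index}`;
        break;
      }
      await page.waitForSelector(selector, { timeout });
      const text = await page.$eval(selector, (el) => el.textContent);
      chapters.push({ index, text });
    } catch (error) {
      // Navigation error or selector timeout: treat as the end of the series.
      reason = `${error.message} at index ${index}`;
      break;
    }
  }
  return { chapters, reason };
}
```

The returned reason string makes it possible to tell a genuine end of the series (for example an HTTP 404) apart from a timeout that may only indicate a slow page.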

A few extra tips:

If the site loads slowly, increase the waitForSelector timeout (for example to 10000; the unit is milliseconds).
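If the site has anti-scraping measures, add a delay after each visit to reduce the risk of being blocked, for example await new Promise(resolve => setTimeout(resolve, 1000)); (waits 1 second). Older examples use await page.waitForTimeout(1000); for this, but that method was deprecated and is no longer available in recent Puppeteer releases.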
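Some sites detect headless browsers. Note that the launch option args: ['--no-sandbox', '--disable-setuid-sandbox'] does not hide headless mode; these flags only disable Chromium's sandbox, which is sometimes needed when running as root or inside a container. To check whether headless detection is the cause of a failure, try launching with headless: false and compare the results.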
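
The delay and timeout tips above can be combined in a small helper. This is a minimal sketch and not part of the original answer: sleep and visitPolitely are hypothetical names, and the delay is a plain setTimeout wrapped in a Promise, so it does not depend on any particular Puppeteer version.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: visit one URL with a configurable selector timeout,
// then pause before returning so consecutive visits are spaced out.
async function visitPolitely(page, url, selector, options = {}) {
  const { timeout = 10000, delayMs = 1000 } = options;
  try {
    await page.goto(url);
    await page.waitForSelector(selector, { timeout });
    return await page.$eval(selector, (el) => el.textContent);
  } finally {
    // Pause whether or not the visit succeeded.
    await sleep(delayMs);
  }
}
```

Because the pause sits in a finally block, a failed page is followed by the same delay as a successful one, so a run of errors does not turn into a burst of rapid requests.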

The question in this post originally comes from Stack Exchange; it was asked by Tobi.
