
How to reuse a WebDriver instance to run GetInformation() in a loop and handle missing data

Solution to Your Web Scraping Loop & Driver Reuse Issue

Hey there! Let's get your code sorted so that it reuses a single browser instance, calls GetInformation() in a loop, and triggers AnotherFunction() when data can't be fetched. Here's what needs fixing, followed by the revised code:

Key Issues in Your Original Code

  • You're creating a new ChromeDriver every time you call GetDriver(), so a new browser window opens on each call. We need to reuse one instance.
  • The Main method only runs GetInformation() once, with no loop logic.
  • There is no error handling for when elements can't be found (your "cannot fetch the target data" scenario), so AnotherFunction() never gets called.
  • The CheckGame() method creates new drivers unnecessarily and has no way to exit the loop.

Revised Code with Explanations

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Text;

namespace Parser {
    class Program {
        // Private field holding the single WebDriver instance so it can be reused
        private static ChromeDriver _driver;

        // Data fields are unchanged
        public static string CommandName1;
        public static string CommandName2;
        public static string Command1Goals;
        public static string Command2Goals;
        public static string TimeOfGame;
        public static int TotalGoals;
        public static bool GameIsRun = true;

        static void Main(string[] args) {
            // Create the browser instance once
            _driver = GetDriver();

            try {
                // Call GetInformation repeatedly until GameIsRun becomes false
                while (GameIsRun) {
                    try {
                        GetInformation();
                    } catch (NoSuchElementException) {
                        // An element was not found (data could not be fetched): run AnotherFunction
                        Console.WriteLine("Failed to fetch data, running AnotherFunction...");
                        AnotherFunction();
                        // Pause 2 seconds before trying again
                        System.Threading.Thread.Sleep(2000);
                    } catch (Exception ex) {
                        // Any other unexpected exception
                        Console.WriteLine($"Unexpected error: {ex.Message}");
                        GameIsRun = false; // Stop the loop on an unknown error
                    }

                    // Optional: let the user press Q to leave the loop
                    if (Console.KeyAvailable && Console.ReadKey(true).Key == ConsoleKey.Q) {
                        GameIsRun = false;
                        Console.WriteLine("\nStopping the loop...");
                    }
                }
            } finally {
                // Always close the browser, even if AnotherFunction or the key check throws
                _driver.Quit();
            }
        }

        public static ChromeDriver GetDriver() {
            return new ChromeDriver();
        }

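        // Uses the class-level _driver field directly, so no driver needs to be passed in
        static void GetInformation() {
            // Optimization: navigate only if the current page is not the target URL
            if (_driver.Url != "https://m.favorit.com.ua/uk/live/events/13931514/") {
                _driver.Navigate().GoToUrl("https://m.favorit.com.ua/uk/live/events/13931514/");
            }
            
            System.Threading.Thread.Sleep(3000); // Wait for the page to load

            // Fetch the elements; FindElement throws NoSuchElementException when one is missing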
            CommandName1 = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[2]/span[1]")).GetAttribute("innerHTML");
            CommandName2 = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[2]/span[2]")).GetAttribute("innerHTML");
            Command1Goals = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[4]/div[1]/div[1]")).GetAttribute("innerHTML");
            Command2Goals = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[4]/div[1]/div[2]")).GetAttribute("innerHTML");
            TimeOfGame = _driver.FindElement(By.XPath("html/body/div/div/div//div/div[@class='headroom-wrapper']/div//div[3]/div")).GetAttribute("innerHTML");
            
            TotalGoals = Convert.ToInt32(Command1Goals) + Convert.ToInt32(Command2Goals);
            
            Console.OutputEncoding = Encoding.UTF8;
            Console.WriteLine("\nTime: " + TimeOfGame + " \t\t|Total Goals: " + TotalGoals);
            Console.WriteLine("Team 1: " + CommandName1 + " \t|Goals: " + Command1Goals);
            Console.WriteLine("Team 2: " + CommandName2 + " \t|Goals: " + Command2Goals);
            
            System.Threading.Thread.Sleep(5000); // Interval between polls
        }

        public static void AnotherFunction() {
            // Add your own logic here, e.g. logging or trying a different way to fetch the data
            Console.WriteLine("Executing AnotherFunction...");
        }
    }
}

What Changed & Why

  1. Reused WebDriver Instance: We added a private static _driver field to hold the single browser instance, initialized once in Main, so no more multiple browser windows popping up.
  2. Loop Logic: The Main method now runs a while loop that calls GetInformation() repeatedly until GameIsRun is false.
  3. Error Handling: We wrapped GetInformation() in a try-catch block. If FindElement() can't locate an element (throwing NoSuchElementException), we trigger AnotherFunction().
  4. Clean Exit: We added a way to exit the loop by pressing Q, and ensure the browser is properly closed with _driver.Quit() when the program ends.
  5. Minor Optimization: We check if we're already on the target URL before navigating, to avoid unnecessary page loads.
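
One case the revised code handles bluntly is a score cell that is present but not numeric (for example empty, or a placeholder before kickoff). Convert.ToInt32 then throws FormatException, which lands in the generic catch and stops the loop. Below is a minimal sketch of a more forgiving parse using int.TryParse. The ScoreParser class and TryGetTotalGoals method are hypothetical names introduced here for illustration, not part of your original code:

```csharp
using System;

// Hypothetical helper (not in the original code): adds up the two score
// strings without throwing when one of them is not a number.
static class ScoreParser {
    // Returns true and sets total when both inputs parse as integers;
    // returns false and leaves total at 0 otherwise.
    public static bool TryGetTotalGoals(string goals1, string goals2, out int total) {
        total = 0;
        // int.TryParse returns false for null, empty, or non-numeric text
        if (int.TryParse(goals1, out int g1) && int.TryParse(goals2, out int g2)) {
            total = g1 + g2;
            return true;
        }
        return false;
    }
}
```

In GetInformation(), a call to ScoreParser.TryGetTotalGoals(Command1Goals, Command2Goals, out TotalGoals) could stand in for the Convert.ToInt32 line. When it returns false, you might call AnotherFunction() or simply skip the printout for that round, depending on what "missing data" should mean for you.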

Additional Tips

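  • Instead of fixed Thread.Sleep() calls, consider using explicit waits (WebDriverWait, from the OpenQA.Selenium.Support.UI namespace in the Selenium.Support package). They poll until the element appears, which is more reliable than fixed sleep times. Note that when the timeout expires, Until() throws WebDriverTimeoutException rather than NoSuchElementException, so catch that exception as well if AnotherFunction() should still run. For example:
    var wait = new WebDriverWait(_driver, TimeSpan.FromSeconds(10));
    CommandName1 = wait.Until(d => d.FindElement(By.XPath("your-xpath"))).GetAttribute("innerHTML");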
    
  • If the game ends, you might want to set GameIsRun = false inside GetInformation() based on the TimeOfGame value (e.g., if it shows "Full Time").
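
The end-of-game check from the last tip could look roughly like this. The marker text is an assumption: "Full Time" is only the example used above, and the label the site actually shows when a match ends would need to be checked on the page. GameStatus and IsGameOver are hypothetical names:

```csharp
using System;

// Hypothetical helper (not in the original code): decides whether the
// text scraped into TimeOfGame indicates that the match has finished.
static class GameStatus {
    // Assumed end-of-game labels; replace with what the site really shows.
    private static readonly string[] EndMarkers = { "Full Time" };

    public static bool IsGameOver(string timeOfGame) {
        if (string.IsNullOrWhiteSpace(timeOfGame)) {
            return false; // no text means we cannot tell, so keep polling
        }
        foreach (string marker in EndMarkers) {
            // Case-insensitive substring match, since innerHTML may include extra markup
            if (timeOfGame.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

Inside GetInformation(), once TimeOfGame has been read, a line such as if (GameStatus.IsGameOver(TimeOfGame)) GameIsRun = false; would let the while loop in Main finish on its own.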

The question originally comes from Stack Exchange; asked by The Vee.
