How to Reuse a WebDriver Instance to Run GetInformation() in a Loop and Handle Missing Data
Solution to Your Web Scraping Loop & Driver Reuse Issue
Hey there! Let's get your code sorted so it does what you need: reuse a single browser instance, call GetInformation() in a loop, and run AnotherFunction() when the data can't be fetched. Here's what needs fixing, followed by the revised code:
Key Issues in Your Original Code
- You're creating a new `ChromeDriver` every time you call `GetDriver()`, so a new browser window opens on each call. We need to reuse one instance.
- The `Main` method only runs `GetInformation()` once, with no loop logic.
- There is no error handling for when elements can't be found (your "target data cannot be fetched" scenario), so `AnotherFunction()` never gets called.
- The `CheckGame()` method creates new drivers unnecessarily and has no way to exit the loop.
Revised Code with Explanations
```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Text;
using System.Threading;

namespace Parser
{
    class Program
    {
        // Page to scrape, kept in one place so the check and the navigation agree
        private const string TargetUrl = "https://m.favorit.com.ua/uk/live/events/13931514/";

        // Private field holding the single WebDriver instance so it can be reused
        private static ChromeDriver _driver;

        // Data fields are unchanged
        public static string CommandName1;
        public static string CommandName2;
        public static string Command1Goals;
        public static string Command2Goals;
        public static string TimeOfGame;
        public static int TotalGoals;
        public static bool GameIsRun = true;

        static void Main(string[] args)
        {
            Console.OutputEncoding = Encoding.UTF8;

            // Initialize the browser instance once
            _driver = GetDriver();

            try
            {
                // Keep calling GetInformation until GameIsRun becomes false
                while (GameIsRun)
                {
                    try
                    {
                        GetInformation();
                    }
                    catch (NoSuchElementException)
                    {
                        // An element was not found (the data could not be fetched): run AnotherFunction
                        Console.WriteLine("Failed to fetch data, running AnotherFunction...");
                        AnotherFunction();
                        // Pause for 2 seconds before trying again
                        Thread.Sleep(2000);
                    }
                    catch (Exception ex)
                    {
                        // Catch any other unexpected exception and stop the loop
                        Console.WriteLine($"Unexpected error: {ex.Message}");
                        GameIsRun = false;
                    }

                    // Optional: let the user press Q to leave the loop
                    if (Console.KeyAvailable && Console.ReadKey(true).Key == ConsoleKey.Q)
                    {
                        GameIsRun = false;
                        Console.WriteLine("\nStopping the loop...");
                    }
                }
            }
            finally
            {
                // Close the browser when the program ends, even if something above throws
                _driver.Quit();
            }
        }

        public static ChromeDriver GetDriver()
        {
            return new ChromeDriver();
        }

        // Uses the class-level _driver field, so no driver needs to be passed in
        static void GetInformation()
        {
            // Optimization: navigate only if we are not already on the target URL
            if (_driver.Url != TargetUrl)
            {
                _driver.Navigate().GoToUrl(TargetUrl);
            }
            Thread.Sleep(3000); // Wait for the page to load

            // FindElement throws NoSuchElementException when an element is missing
            CommandName1 = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[2]/span[1]")).GetAttribute("innerHTML");
            CommandName2 = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[2]/span[2]")).GetAttribute("innerHTML");
            Command1Goals = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[4]/div[1]/div[1]")).GetAttribute("innerHTML");
            Command2Goals = _driver.FindElement(By.XPath(".//*[@id='react-root']/div/div[2]/div/div[1]/div/div[1]/div/div[1]/div/header/div[4]/div[1]/div[2]")).GetAttribute("innerHTML");
            TimeOfGame = _driver.FindElement(By.XPath("html/body/div/div/div//div/div[@class='headroom-wrapper']/div//div[3]/div")).GetAttribute("innerHTML");

            // Convert.ToInt32 throws FormatException if the text is not an integer
            TotalGoals = Convert.ToInt32(Command1Goals) + Convert.ToInt32(Command2Goals);

            Console.WriteLine("\nTime: " + TimeOfGame + " \t\t|Total Goals: " + TotalGoals);
            Console.WriteLine("Team 1: " + CommandName1 + " \t|Goals: " + Command1Goals);
            Console.WriteLine("Team 2: " + CommandName2 + " \t|Goals: " + Command2Goals);

            Thread.Sleep(5000); // Interval between loop iterations
        }

        public static void AnotherFunction()
        {
            // Add whatever logic you need here, such as logging or retrying another way
            Console.WriteLine("Executing AnotherFunction...");
        }
    }
}
```
What Changed & Why
- Reused WebDriver Instance: We added a private static `_driver` field to hold the single browser instance, initialized once in `Main`, so no more multiple browser windows popping up.
- Loop Logic: The `Main` method now runs a `while` loop that calls `GetInformation()` repeatedly until `GameIsRun` is false.
- Error Handling: We wrapped `GetInformation()` in a `try-catch` block. If `FindElement()` can't locate an element (throwing `NoSuchElementException`), we trigger `AnotherFunction()`.
- Clean Exit: We added a way to exit the loop by pressing Q, and ensure the browser is properly closed with `_driver.Quit()` when the program ends.
- Minor Optimization: We check if we're already on the target URL before navigating, to avoid unnecessary page loads.
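One case the error handling above does not cover: `Convert.ToInt32` throws `FormatException` when a score string is not a plain integer, and that lands in the general `catch` and stops the loop. Below is a minimal sketch of a more forgiving parse, assuming the score elements might contain surrounding whitespace or a placeholder such as "-". The names `ScoreParser` and `TryParseTotal` are hypothetical and not part of the original code.

```csharp
using System;
using System.Globalization;

static class ScoreParser
{
    // Returns true and sets total to the sum when both strings hold
    // unsigned integers; returns false (total = 0) otherwise.
    public static bool TryParseTotal(string goals1, string goals2, out int total)
    {
        total = 0;
        if (!TryParseGoals(goals1, out int g1) || !TryParseGoals(goals2, out int g2))
        {
            return false;
        }
        total = g1 + g2;
        return true;
    }

    // Trims whitespace, then accepts digits only (no sign, no separators).
    private static bool TryParseGoals(string text, out int goals)
    {
        goals = 0;
        return text != null
            && int.TryParse(text.Trim(), NumberStyles.None, CultureInfo.InvariantCulture, out goals);
    }
}
```

In `GetInformation()`, a `false` result could be treated the same way as a missing element, for instance by calling `AnotherFunction()` and retrying, rather than letting the exception end the loop.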
Additional Tips
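- Instead of fixed `Thread.Sleep()` calls, consider using explicit waits (such as `WebDriverWait`, from the `OpenQA.Selenium.Support.UI` namespace). They poll until the element appears instead of pausing for a fixed time, which is more reliable. For example:

  ```csharp
  var wait = new WebDriverWait(_driver, TimeSpan.FromSeconds(10));
  CommandName1 = wait.Until(d => d.FindElement(By.XPath("your-xpath"))).GetAttribute("innerHTML");
  ```

  Note that when the wait times out, `Until` throws `WebDriverTimeoutException` rather than `NoSuchElementException`, so the `catch` in `Main` would need to handle that type as well for `AnotherFunction()` to run.
- If the game ends, you might want to set `GameIsRun = false` inside `GetInformation()` based on the `TimeOfGame` value (e.g., if it shows "Full Time").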
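The last tip could be sketched roughly as follows. This is only an illustration: the `GameStatus` class and its `EndMarkers` list are hypothetical, and the actual label the site shows when a match finishes is an assumption (the page is served in Ukrainian, so the real text will likely differ and should be checked in the browser first).

```csharp
using System;

static class GameStatus
{
    // Hypothetical end-of-game labels; replace with the text the site really shows.
    private static readonly string[] EndMarkers = { "Full Time", "FT", "Finished" };

    // Returns true when the time label matches one of the end markers,
    // ignoring case and surrounding whitespace.
    public static bool IsGameOver(string timeOfGame)
    {
        if (string.IsNullOrWhiteSpace(timeOfGame))
        {
            return false; // No label yet: assume the game is still running
        }

        string label = timeOfGame.Trim();
        foreach (string marker in EndMarkers)
        {
            if (label.Equals(marker, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

At the end of `GetInformation()`, this could be used as `GameIsRun = !GameStatus.IsGameOver(TimeOfGame);` so the loop in `Main` stops on its own once the match is over.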
The question originates from Stack Exchange; asked by The Vee.




