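Extracting a table from a website by date with Python Selenium: help with date selection and year/month selection
Solving year/month selection in a Dynarch calendar widget with Selenium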
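Hey, I just looked at your code and your question. You already have the first step working: the calendar widget opens. You're right, though, that with this kind of Dynarch-style calendar you have to switch the year and month before you can pick a date outside the current month. Here's how to do it in detail, in two ways: open the year/month pickers directly (more efficient), or step through with the arrow buttons (better suited to batch runs).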
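Method 1: Click the year/month title and pick directly (recommended)
Calendars like this usually let you click the year/month title at the top to pop up a list of years and months, so you can pick the target value directly instead of clicking the arrows over and over. Here's the modified code: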
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

test_url = "https://www.fpi.nsdl.co.in/web/Reports/Archive.aspx"
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service(r"C:\Users\prash\Music\fii\chromedriver.exe"))
driver.get(test_url)
wait = WebDriverWait(driver, 10)

# Target date (change as needed; the month text must match what the
# calendar displays, e.g. "October" vs "Oct")
target_year = "2023"
target_month = "October"
expected_to_date = "17"

# Wait until the calendar button is clickable; steadier than a fixed time.sleep
wait.until(EC.element_to_be_clickable((By.ID, "imgtxtDate"))).click()
time.sleep(0.5)

# Click the year title to open the year selection panel
year_element = wait.until(
    EC.element_to_be_clickable((By.CLASS_NAME, "DynarchCalendar-year")))
year_element.click()
# Pick the target year
wait.until(EC.element_to_be_clickable(
    (By.XPATH, f"//td[@class='DynarchCalendar-yeartable']/a[text()='{target_year}']"))).click()
time.sleep(0.5)

# Click the month title to open the month selection panel
month_element = wait.until(
    EC.element_to_be_clickable((By.CLASS_NAME, "DynarchCalendar-month")))
month_element.click()
# Pick the target month
wait.until(EC.element_to_be_clickable(
    (By.XPATH, f"//td[@class='DynarchCalendar-monthtable']/a[text()='{target_month}']"))).click()
time.sleep(0.5)

# Pick the target day
date_element = wait.until(EC.element_to_be_clickable(
    (By.XPATH, f"//td[not(contains(@class,'DynarchCalendar-firstcol'))]/a[text()='{expected_to_date}']")))
date_element.click()

# Further steps...
# driver.quit()
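Since the goal is to pull the table for many dates, it may help to derive the three strings above from a real date instead of typing them by hand. Here is a minimal sketch; `calendar_targets` and its `abbreviate` flag are names I made up for illustration, and whether the widget shows "October" or "Oct" is something you still have to check on the page.

```python
from datetime import date

# Fixed English month names, so the result does not depend on the OS locale.
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]


def calendar_targets(d, abbreviate=False):
    """Return (year, month, day) as the strings used to locate calendar cells.

    abbreviate=True yields three-letter month names such as "Oct"
    (an assumption about how the widget might label months).
    """
    month = MONTH_NAMES[d.month - 1]
    if abbreviate:
        month = month[:3]
    # str(d.day) has no leading zero, matching a day cell labelled "5" rather than "05"
    return str(d.year), month, str(d.day)


target_year, target_month, expected_to_date = calendar_targets(date(2023, 10, 17))
```

The returned values can be dropped into the XPath expressions in place of the hard-coded strings.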
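Method 2: Switch year and month with the arrow buttons
If in some situation the selection panel can't be opened directly, loop on the arrows until you reach the target year and month: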
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

test_url = "https://www.fpi.nsdl.co.in/web/Reports/Archive.aspx"
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service(r"C:\Users\prash\Music\fii\chromedriver.exe"))
driver.get(test_url)
time.sleep(1)

target_year = "2023"
target_month = "October"
expected_to_date = "17"

# Open the calendar
driver.find_element(By.ID, "imgtxtDate").click()
time.sleep(1)

# Switch the year
current_year = driver.find_element(By.CLASS_NAME, "DynarchCalendar-year").text.strip()
while current_year != target_year:
    # Decide which direction to step
    if int(target_year) < int(current_year):
        # Click the previous-year arrow
        driver.find_element(By.XPATH, "//td[@class='DynarchCalendar-yearnav']/a[@title='Previous year']").click()
    else:
        # Click the next-year arrow
        driver.find_element(By.XPATH, "//td[@class='DynarchCalendar-yearnav']/a[@title='Next year']").click()
    time.sleep(0.5)
    current_year = driver.find_element(By.CLASS_NAME, "DynarchCalendar-year").text.strip()

# Switch the month
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]
current_month = driver.find_element(By.CLASS_NAME, "DynarchCalendar-month").text.strip()
while current_month != target_month:
    if month_order.index(target_month) < month_order.index(current_month):
        # Click the previous-month arrow
        driver.find_element(By.XPATH, "//td[@class='DynarchCalendar-monthnav']/a[@title='Previous month']").click()
    else:
        # Click the next-month arrow
        driver.find_element(By.XPATH, "//td[@class='DynarchCalendar-monthnav']/a[@title='Next month']").click()
    time.sleep(0.5)
    current_month = driver.find_element(By.CLASS_NAME, "DynarchCalendar-month").text.strip()

# Pick the day
from_day = driver.find_element(
    By.XPATH, f"//td[not(contains(@class,'DynarchCalendar-firstcol'))]/a[text()='{expected_to_date}']")
from_day.click()
time.sleep(2)
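If you'd rather not re-read the title after every click, you could work out up front how many arrow clicks are needed. The sketch below is only the arithmetic, with no browser involved; `month_offset` and `arrow_clicks` are hypothetical helper names, and it assumes the month arrows roll over into the neighbouring year, which I haven't confirmed for this widget.

```python
MONTH_ORDER = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]


def month_offset(current_year, current_month, target_year, target_month):
    """Signed number of months from the displayed month to the target month."""
    current = int(current_year) * 12 + MONTH_ORDER.index(current_month)
    target = int(target_year) * 12 + MONTH_ORDER.index(target_month)
    return target - current


def arrow_clicks(offset):
    """Map a signed offset to (arrow title, click count).

    The titles match the @title values used in the XPath expressions above.
    """
    title = "Next month" if offset > 0 else "Previous month"
    return title, abs(offset)


# Calendar shows March 2025, we want October 2023: 17 clicks backwards.
title, clicks = arrow_clicks(month_offset("2025", "March", "2023", "October"))
```

You would then click the arrow with that title `clicks` times, and still read the title once at the end to confirm you landed on the right month.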
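Notes
- Confirm the text format: use the browser developer tools (F12) to check whether the month/year text in the calendar is the full name (e.g. "October") or an abbreviation (e.g. "Oct"); otherwise the element won't be found.
- Prefer explicit waits: in Method 1 I used WebDriverWait, which is more reliable than time.sleep and avoids element-not-found errors caused by a slow page load.
- Verify the element locators: if an element can't be found at runtime, check that the XPath or class name matches the actual page structure, since the site may update the widget's styling.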
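One way to make the month comparison less sensitive to the full-name vs abbreviation issue is to compare month numbers instead of raw text. This is a rough sketch under the assumption that the labels are English; `month_number` is a name I invented for the example.

```python
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# The first three letters are unique for English month names.
_MONTH_BY_PREFIX = {name[:3].lower(): i for i, name in enumerate(MONTH_NAMES, start=1)}


def month_number(label):
    """Return 1-12 for labels such as "October", "Oct" or " oct. "."""
    key = label.strip().lower()[:3]
    try:
        return _MONTH_BY_PREFIX[key]
    except KeyError:
        raise ValueError(f"unrecognised month label: {label!r}") from None


# Both spellings resolve to the same month, so the comparison no longer
# depends on which form the calendar happens to display.
same_month = month_number("Oct") == month_number("October")
```

In Method 2 you could compare `month_number(current_month)` with `month_number(target_month)` in the loop condition. The XPath in Method 1 still needs the exact text shown on the page.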
The question in this post comes from Stack Exchange; it was asked by Prashanth G.




