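How do I use Selenium to locate a Print element by CSS selector when it only appears after its parent DIV is displayed?
Problem: Selenium cannot locate the Print option while looping through patient folders
I am using Selenium 4.27.1 to loop through patient folders on a hospital portal, with the goal of downloading every PDF inside each folder. Taking Visit #269977 as an example: clicking the Options link under the folder opens an action menu, and I need to click the Print option in that menu.
For a single folder I can locate the element by XPATH, but when looping through folders I do not know the list item index in advance, so that XPATH no longer works. I wanted to use a CSS selector instead, but I can never locate the visible Print element; every approach I tried ends in an "unable to locate element" error. Below are the DOM structure after clicking Options and the code I wrote. I am looking for a working way to locate and click the Print option, whether by CSS selector or some other approach.
My code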
#selenium version == 4.27.1
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time

# Initial Login:
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
browser = webdriver.Chrome(options=chrome_options)
browser.get("https://...")
enterEmail = browser.find_element(By.XPATH, "//*[@id='email']")
enterEmail.send_keys("email address")
enterPwd = browser.find_element(By.XPATH, "//*[@id='password']")
enterPwd.send_keys("password")
time.sleep(2)
signIn = browser.find_element(By.XPATH, "//*[@id='login_btn']")
signIn.click()
time.sleep(2)

# Create Dictionary of patient information to loop through:
patientDict = {
    '25-003049': ['John Doe', '267087'],
    '25-003050': ['John Doe', '269226'],
    '25-003051': ['John Doe', '275687']
}

# Create List of patients that errored during download for manual review:
errorList = []

for key in patientDict:
    try:
        # Declare Variables for patientDict keys and values:
        case_number = key
        patient_name = patientDict[key][0]
        visit_id = patientDict[key][1]

        browser.get("https://...")
        searchMenuOption = browser.find_element(By.CLASS_NAME, "search")
        searchMenuOption.click()
        searchName = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/div[1]/input")
        searchName.send_keys(patient_name)
        clickMagGlass = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/div[1]/button")
        clickMagGlass.click()
        clickName = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/ul/li")
        clickName.click()

        items = browser.find_elements(By.XPATH, "//*['/html/body/div[1]/div[4]/div[3]/div[2]/ul/li' and contains(text()," + visit_id + ")]")
        for item in items:
            x = item.find_elements(By.XPATH, ".//descendant::a")
            for y in x:
                y.click()
                # Can find absolute XPATH with list item and href indices:
                clickPrint = y.find_elements(By.XPATH, "//*[@id='documentlist']/li[5]/div[5]/a[5]")
                # It will find it now by CSS Selector based on Alohci's suggested update in the comments, but shows as not interactable if kept in a loop:
                clickPrint = y.find_elements(By.CSS_SELECTOR, ".tool.print")
                for printDialog in clickPrint:
                    printDialog.click()
                # I've also tried it without the loop, but it returns an unable to locate element:
                clickPrint = y.find_element(By.CSS_SELECTOR, ".tool.print")
                clickPrint.click()
    except:
        errorList.append(patient_name)
DOM structure after clicking Options
<li class="full ui-selectee ui-droppable" data-ref-id="0" data-parent-id="" data-doc-id="406494" data-type="folder" data-uploaded="1688654514" style="">
    <div class="progress center-block open-progress">Opening<span>in progress</span></div>
    <h1 class="column title">
        <a href="#" class="folder " style="opacity: 0;"></a><span class="filename">Visit Jul. 6th, 2023</span>
    </h1>
    <span class="column options">Visit #269977
        <a href="#" class="tool-toggle">Options</a>
    </span>
    <span class="column create-date">Jul 06, 2023 09:41:54AM</span>
    <div class="column content-icon">
        <span class="folder-legend">17</span>
        <span class="type">Folders</span>
    </div>
    <div class="column favorite-icon">
    </div>
    <div class="input with-btn big-border rename hidden">
        <input placeholder="New folder name" value="Visit Jul. 6th, 2023" type="text" name="new_folder_name406494">
        <button class="btn-primary">Ok</button>
    </div>
    <div class="tools open" style="display: block;">
        <a href="#" class="tool rename">Rename</a>
        <a href="#" class="tool comment">Comment</a>
        <a href="#" class="tool delete">Delete</a>
        <a href="#" class="tool export">Export</a>
        <a href="#" class="tool print">Print</a>
        <a href="#" class="no-tool-icon"></a>
    </div>
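Solution
1. Fix the element lookup logic and work from the current folder node
The original code loops over every descendant link and clicks each one, which can trigger unrelated elements (such as the folder link itself) and muddles the lookup. The correct flow is: locate the current folder's Options link, click it to open the menu, then look up the Print link inside that same folder node.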
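# Imports needed for explicit waits (place at the top of the script)
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Replaces the body of the original `for item in items:` loop
for item in items:
    # Locate the Options link of the current folder
    options_btn = item.find_element(By.CSS_SELECTOR, ".tool-toggle")
    options_btn.click()

    # Wait until the Print link is visible (more reliable than time.sleep).
    # Passing `item` instead of the driver scopes the lookup to this folder.
    print_btn = WebDriverWait(item, 10).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, ".tools.open .tool.print"))
    )

    # Click the Print link
    print_btn.click()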
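2. Fix the XPATH that matches the Visit ID
The original XPATH is malformed. The quoted path inside the predicate is only a non-empty string literal, which always evaluates to true, so the expression matches any element whose own text contains the visit ID. In the DOM above that is the span with class "column options", not the folder's li. Testing text() directly on the li does not work either, because the "Visit #..." text lives inside that span. Select the li through its options span instead:
items = browser.find_elements(By.XPATH, f"//li[.//span[contains(@class, 'options') and contains(., 'Visit #{visit_id}')]]")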
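3. Replace time.sleep with explicit waits
A fixed time.sleep() is fragile: if the page takes longer to load than the delay, the next step fails, and if it loads faster, the script waits for nothing. An explicit wait polls until the element meets a condition or a timeout expires, which makes the script considerably more stable.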
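To make the difference from a fixed sleep concrete, here is a minimal, Selenium-free sketch of the polling idea behind an explicit wait. The names `wait_until` and `FakeMenu` are hypothetical helpers invented for this illustration; this is not Selenium's own implementation, only the general pattern it follows.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears; raises TimeoutError if the
    timeout elapses first. (Hypothetical helper, for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakeMenu:
    """Hypothetical stand-in for a menu that becomes visible after a few checks."""

    def __init__(self, visible_after_checks):
        self.checks = 0
        self.visible_after_checks = visible_after_checks

    def is_displayed(self):
        self.checks += 1
        return self.checks >= self.visible_after_checks


menu = FakeMenu(visible_after_checks=3)
# Returns on the third check instead of sleeping for a fixed, guessed delay.
wait_until(menu.is_displayed, timeout=2.0, poll_interval=0.01)
print(menu.checks)
```

In the script itself the same role is played by `WebDriverWait`. For example, the `time.sleep(2)` before the sign-in click could be replaced with `WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.ID, "login_btn"))).click()`.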
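4. Avoid document-wide element lookups
Always search relative to the current folder's item node, so that the Print link you click belongs to the current folder and is not confused with the same link in other folders.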
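The effect of scoping can be shown without a browser. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in for the page DOM. The markup is a simplified, hypothetical two-folder listing modelled on the structure above; the second folder and its IDs are invented for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the folder list.
# The second <li> and its IDs are made up for this illustration.
LISTING = """
<ul id="documentlist">
  <li data-doc-id="406494">
    <span class="column options">Visit #269977 <a href="#" class="tool-toggle">Options</a></span>
    <div class="tools open"><a href="#" class="tool print">Print</a></div>
  </li>
  <li data-doc-id="406495">
    <span class="column options">Visit #269978 <a href="#" class="tool-toggle">Options</a></span>
    <div class="tools"><a href="#" class="tool print">Print</a></div>
  </li>
</ul>
"""

root = ET.fromstring(LISTING)
PRINT_LINK = ".//a[@class='tool print']"

# Document-wide search: returns the Print link of every folder.
all_prints = root.findall(PRINT_LINK)

# Scoped search: start from one folder node and look only beneath it.
folder = root.find("./li[@data-doc-id='406494']")
scoped = folder.findall(PRINT_LINK)

print(len(all_prints), len(scoped))
```

The same distinction applies in Selenium: `item.find_element(By.CSS_SELECTOR, ".tool.print")` only searches beneath `item`, whereas an XPATH passed to `item.find_element` must start with `.` (for example `.//a`) to stay scoped, because an expression starting with `//` is evaluated from the document root.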
This question comes from Stack Exchange; asked by fitted_sheet.




