
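How can I use Selenium to locate a Print element by CSS selector when it only appears after its parent DIV is displayed?

Problem: Selenium cannot locate the Print option while iterating over patient folders

I am using Selenium 4.27.1 to iterate over patient folders on a hospital portal, with the goal of downloading every PDF file inside each folder. Taking Visit #269977 as an example, clicking the Options link under the folder opens an action menu, and I need to click the Print option in that menu.

For a single folder I can locate the element by XPATH, but when iterating over folders the list-item index is not known in advance, so the absolute XPATH no longer works. I wanted to use a CSS selector instead, but I cannot locate the visible Print element; every approach I have tried ends in an "unable to locate element" error. Below are my code and the DOM structure after clicking Options. I am looking for a workable way (a CSS selector or any alternative) to click the Print option.

My code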

#selenium version == 4.27.1
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time

# Initial Login:
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
browser = webdriver.Chrome(options=chrome_options)
browser.get("https://...")
enterEmail = browser.find_element(By.XPATH, "//*[@id='email']")
enterEmail.send_keys("email address")
enterPwd = browser.find_element(By.XPATH, "//*[@id='password']")
enterPwd.send_keys("password")
time.sleep(2)
signIn = browser.find_element(By.XPATH, "//*[@id='login_btn']")
signIn.click()
time.sleep(2)

# Create Dictionary of patient information to loop through:
patientDict = {
'25-003049': ['John Doe', '267087'],
'25-003050': ['John Doe', '269226'],
'25-003051': ['John Doe', '275687']
}

# Create List of patients that errored during download for manual review:
errorList = []

for key in patientDict:

    try:
        # Declare Variables for patientDict keys and values:
        case_number = key
        patient_name = patientDict[key][0]
        visit_id = patientDict[key][1]

        browser.get("https://...")
        searchMenuOption = browser.find_element(By.CLASS_NAME, "search")
        searchMenuOption.click()
        searchName = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/div[1]/input")
        searchName.send_keys(patient_name)
        clickMagGlass = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/div[1]/button")
        clickMagGlass.click()
        clickName = browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[2]/div[1]/ul/li")
        clickName.click()
        items = browser.find_elements(By.XPATH, "//*['/html/body/div[1]/div[4]/div[3]/div[2]/ul/li' and contains(text()," + visit_id + ")]")

        for item in items:
            x = item.find_elements(By.XPATH, ".//descendant::a")
            for y in x:
                y.click()

                # Can find absolute XPATH with list item and href indices:
                clickPrint = y.find_elements(By.XPATH, "//*[@id='documentlist']/li[5]/div[5]/a[5]")

                # It will find it now by CSS Selector based on Alohci's suggested update in the comments, but shows as not interactable if kept in a loop:
                clickPrint = y.find_elements(By.CSS_SELECTOR, ".tool.print")

                for printDialog in clickPrint:
                    printDialog.click()

                 # I've also tried it without the loop, but it returns an unable to locate element:
                clickPrint = y.find_element(By.CSS_SELECTOR, ".tool.print")
                clickPrint.click()

    except:
        errorList.append(patient_name)

DOM structure after clicking Options

<li class="full ui-selectee ui-droppable" data-ref-id="0" data-parent-id="" data-doc-id="406494" data-type="folder" data-uploaded="1688654514" style="">
   <div class="progress center-block open-progress">Opening<span>in progress</span></div>
   <h1 class="column title">
      <a href="#" class="folder " style="opacity: 0;"></a><span class="filename">Visit Jul. 6th, 2023</span>
   </h1>
   <span class="column options">Visit #269977 <a href="#" class="tool-toggle">Options</a>
   </span>
   <span class="column create-date">Jul 06, 2023 09:41:54AM</span>
   <div class="column content-icon">
      <span class="folder-legend">17</span>&nbsp;<span class="type">Folders</span>
   </div>
   <div class="column favorite-icon">
   </div>
   <div class="input with-btn big-border rename hidden">
      <input placeholder="New folder name" value="Visit Jul. 6th, 2023" type="text" name="new_folder_name406494">
      <button class="btn-primary">Ok</button>
   </div>
   <div class="tools open" style="display: block;">
       <a href="#" class="tool rename">Rename</a>
       <a href="#" class="tool comment">Comment</a>
       <a href="#" class="tool delete">Delete</a>
       <a href="#" class="tool export">Export</a>
       <a href="#" class="tool print">Print</a>
       <a href="#" class="no-tool-icon"></a>
   </div>

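Solution

1. Fix the element-location logic and work from the current folder node

The original code iterates over every child link and clicks each one, which triggers unrelated elements (such as the folder link itself) and confuses the lookup. The correct flow is: locate the current folder's Options link, click it to open the menu, then look for the Print link inside the current node.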

# Imports needed for explicit waits (place these at the top of the script)
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Replaces the body of the original "for item in items" loop
for item in items:
    # Locate the Options link of the current folder
    options_btn = item.find_element(By.CSS_SELECTOR, ".tool-toggle")
    options_btn.click()
    # Wait until the Print link is visible; passing item (not browser) to
    # WebDriverWait keeps the lookup scoped to the current folder
    print_btn = WebDriverWait(item, 10).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, ".tools.open .tool.print"))
    )
    # Click the Print link
    print_btn.click()

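2. Fix the XPath that matches the Visit ID

The original XPath is malformed: the quoted path inside the predicate is just a non-empty string, which always evaluates to true, so it does nothing to restrict the match to list items. In the DOM above, the visit number sits in the <span class="column options"> child of the <li>, not in the <li>'s own text, so match the <li> through that span:

items = browser.find_elements(By.XPATH, f"//li[span[contains(@class, 'options') and contains(., 'Visit #{visit_id}')]]")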
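
Where the visit number actually lives in the markup decides whether a text-based predicate on the <li> can match. The sketch below checks this offline with Python's standard xml.etree.ElementTree on a trimmed copy of the DOM from the question (the closing </li> is added and most children are dropped, so treat the fragment as an assumption about the real page). In ElementTree, an element's .text attribute is the text that precedes its first child, which is the same node an XPath 1.0 contains(text(), ...) test ends up inspecting.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the folder <li> from the DOM shown in the question.
# Assumption: the real page matches this structure; closing </li> added here.
FRAGMENT = """
<li class="full ui-selectee ui-droppable" data-doc-id="406494">
   <h1 class="column title"><span class="filename">Visit Jul. 6th, 2023</span></h1>
   <span class="column options">Visit #269977 <a href="#" class="tool-toggle">Options</a></span>
</li>
"""

li = ET.fromstring(FRAGMENT)

# Text directly inside <li> before its first child element: whitespace only.
own_text = (li.text or "").strip()

# The visit number is in the direct child <span class="column options">.
options_span = next(
    span for span in li.findall("span")
    if "options" in span.get("class", "").split()
)
span_text = (options_span.text or "").strip()

print(repr(own_text))   # ''
print(repr(span_text))  # 'Visit #269977'
```

Because the <li>'s own text is empty, a predicate such as contains(text(), 'Visit #269977') evaluated on the <li> has nothing to match against, while a predicate on the child span, or on the <li>'s full string value via contains(., ...), does.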

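3. Use explicit waits instead of time.sleep

Fixed time.sleep() calls are unreliable: when the page loads slowly, the pause can end before the element is ready and the next step fails. An explicit wait keeps checking until the element meets the condition (or a timeout expires) before continuing, which makes the script noticeably more stable.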
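
To make the difference concrete, here is a minimal pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: wait_until and menu_is_visible are hypothetical names used only for illustration, and the real WebDriverWait additionally ignores NoSuchElementException between polls by default and raises Selenium's own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the condition is still falsy once timeout
    seconds have elapsed. Hypothetical helper, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for "the menu is visible": becomes true on the third check.
attempts = {"count": 0}


def menu_is_visible():
    attempts["count"] += 1
    return attempts["count"] >= 3


started = time.monotonic()
wait_until(menu_is_visible, timeout=5.0, poll=0.05)
elapsed = time.monotonic() - started

print(attempts["count"])  # 3
```

The call returns as soon as the condition holds, so a fast page costs almost nothing and a slow page gets up to the full timeout. A fixed time.sleep(2) always costs two seconds and still fails when the page needs three.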

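4. Avoid document-wide lookups

Always do relative lookups from the current folder's item node, so that you act on the Print link belonging to that folder and do not mix it up with elements from other folders. With XPath this means starting the expression with .// because an expression that starts with // searches the whole document even when it is called on an element.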


The question originates from Stack Exchange; asked by fitted_sheet.
