如何用Python读取Excel合并单元格的属性值（解决Ref和Amount属性值无法获取的问题）

阿华AIGC实验室

2026-4-13

我看了你的问题、代码和Excel的情况，马上就找到问题根源了——你没考虑到属性名本身是跨列合并的情况！比如"Ref"占了A、B两列，你原来的代码找到"Ref"所在的A列后，直接去取B列的值，但实际对应的值在C列，自然拿不到正确结果。下面给你详细分析和修改后的解决方案：

问题核心原因

对于"Ref"、"Amount"这类跨列合并的属性名，它们占了多列，对应的值在合并单元格最后一列的下一列，但你原代码直接用col_idx +1取值列，只适用于属性名占1列的情况。
你写的get_merged_cell_value函数是用来获取合并单元格的值，但没用到关键的地方——判断属性名本身的合并范围，导致值的列数计算错误。

修改后的完整代码

import os
import openpyxl
from openpyxl.utils import column_index_from_string

file_path = r"D:\file\input\example.xlsx"  
if os.path.exists(file_path):
    print("File exists!")
else:
    print("File not found! Check the path.")
    exit()

target_sheet = "Output Approval Templete"

# Define the properties to extract
properties = [ 
    "Approval Memo of",
    "Name of the Applicant",
    "Name of Territory",
    "Total Family Expenses",
    "Ref",
    "Amount",
    "Total Amount"
]

# 新增：获取单元格所在的合并范围（如果是合并单元格）
def get_cell_merged_range(sheet, row, col):
    for merged_range in sheet.merged_cells.ranges:
        min_row, min_col, max_row, max_col = merged_range.bounds
        if min_row <= row <= max_row and min_col <= col <= max_col:
            return (min_row, min_col, max_row, max_col)
    return None  # 不是合并单元格

# Function to get the actual value from a cell (including merged cells)
def get_merged_cell_value(sheet, row, col):
    merged_range = get_cell_merged_range(sheet, row, col)
    if merged_range:
        min_row, min_col, _, _ = merged_range
        return sheet.cell(min_row, min_col).value
    return sheet.cell(row, col).value

# Function to format numeric values properly
def format_value(value):
    if isinstance(value, float) and value > 1e10:  # Large numbers like NID
        return str(int(value))  # Convert to integer and string to avoid scientific notation
    elif isinstance(value, (int, float)):  # General number formatting
        return str(value)
    elif value is None:
        return "N/A"  # Handle missing values
    return str(value).strip()

try:
    # Load the workbook
    wb = openpyxl.load_workbook(file_path, data_only=True)

    if target_sheet not in wb.sheetnames:
        print(f"Sheet '{target_sheet}' not found in the file.")
    else:
        ws = wb[target_sheet]
        extracted_data = {}

        # Iterate over rows to extract data
        for row in ws.iter_rows():
            for cell in row:
                # 跳过空值或非字符串单元格，减少无效判断
                if not cell.value or not isinstance(cell.value, str):
                    continue
                prop_name = cell.value.strip()
                if prop_name in properties:
                    row_idx = cell.row
                    col_idx = cell.column

                    # 检查当前属性名是否在合并单元格中
                    merged_range = get_cell_merged_range(ws, row_idx, col_idx)
                    if merged_range:
                        # 合并单元格：取合并范围的最后一列，+1得到值的列
                        _, _, _, max_col = merged_range
                        value_col = max_col + 1
                    else:
                        # 非合并单元格：取当前列的下一列
                        value_col = col_idx + 1

                    # 确保值的列在工作表范围内
                    if value_col <= ws.max_column:
                        value = get_merged_cell_value(ws, row_idx, value_col)
                        extracted_data[prop_name] = format_value(value)

        # Print extracted values
        for key, value in extracted_data.items():
            print(f"{key} : {value}")

except Exception as e:
    print(f"Error loading workbook: {e}")

关键修改点说明

新增get_cell_merged_range函数：用来判断属性名所在单元格是否属于合并范围，返回合并区域的起止行/列，这样就能知道属性名占了多少列。
优化值列的计算逻辑：
- 如果属性名在合并单元格里，取合并范围的最后一列max_col，值的列就是max_col +1（比如"Ref"占A、B列，max_col=2，值列就是3即C列）
- 如果不是合并单元格，保持原逻辑取当前列+1
跳过无效单元格：新增判断跳过空值或非字符串单元格，减少不必要的循环判断

你运行修改后的代码，就能正确获取到"Ref"和"Amount"的对应值了！

备注：内容来源于stack exchange，提问作者Enamul Haque