Extracting All Tables from a Specific Section of a Word Document with python-docx

阿华AIGC实验室

2026-4-8

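No problem! Here is a step-by-step walkthrough of using python-docx to extract every table inside a specific section of a Word document (for example, the Section 2 you mentioned). "Section" here means the content under a heading, not a Word page-layout section. Let's outline the approach first, then move on to working code:

Step 1: Install the dependency

First make sure python-docx is installed. If it isn't, run this command in a terminal: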

pip install python-docx

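Core implementation (with detailed comments)

This function locates the target section and collects only the tables inside it. It stops as soon as it reaches the next level-1 heading, so it won't pick up content from other sections: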

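from docx import Document
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P
from docx.table import Table
from docx.text.paragraph import Paragraph

def extract_tables_from_section(doc_path, target_section):
    doc = Document(doc_path)
    tables_in_section = []
    in_target_section = False

    # Walk the direct children of the document body in document order
    # (paragraphs, tables, and so on)
    for element in doc.element.body:
        # Paragraphs tell us whether we are entering or leaving the target section
        if isinstance(element, CT_P):
            # Wrap the raw XML element in python-docx's Paragraph proxy
            para = Paragraph(element, doc)
            # Assumes level-1 section titles use Word's default "Heading 1" style.
            # If your document uses a custom heading style, replace the name below.
            if para.style.name == 'Heading 1':
                # Found the target section title: start collecting
                if target_section in para.text:
                    in_target_section = True
                # Already inside the target section: the next level-1 heading ends it
                elif in_target_section:
                    break
        # A table inside the target section: wrap it and collect it
        elif isinstance(element, CT_Tbl) and in_target_section:
            tables_in_section.append(Table(element, doc))

    return tables_in_section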

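# Usage example: test it against your test_doc.docx
if __name__ == "__main__":
    doc_path = "test_doc.docx"
    target_section = "Section 2"
    extracted_tables = extract_tables_from_section(doc_path, target_section)

    # Print each extracted table as Markdown for a quick look at the result
    for table_num, table in enumerate(extracted_tables, 1):
        print(f"=== Table {table_num} ===")
        for row_index, row in enumerate(table.rows):
            # Take each cell's text and trim surrounding whitespace
            cell_contents = [cell.text.strip() for cell in row.cells]
            # Print the row as a Markdown table row
            print("| " + " | ".join(cell_contents) + " |")
            # Markdown needs a separator line after the header row
            if row_index == 0:
                print("| " + " | ".join(["---"] * len(cell_contents)) + " |")
        print()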
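
The loop above prints cell text verbatim, so a cell that contains a "|" character or a line break would break the Markdown row. Below is a minimal sketch of a more defensive renderer. It assumes the rows have already been pulled out as plain lists of strings, for example with [[cell.text for cell in row.cells] for row in table.rows]. The names escape_cell and rows_to_markdown are hypothetical helpers for illustration and are not part of python-docx.

```python
def escape_cell(text: str) -> str:
    """Make cell text safe to place inside a single Markdown table cell."""
    # Collapse whitespace runs (including line breaks) into single spaces,
    # then escape the column delimiter.
    return " ".join(text.split()).replace("|", "\\|")


def rows_to_markdown(rows: list[list[str]]) -> str:
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    lines = []
    for index, row in enumerate(rows):
        # Pad short rows so every line has the same number of columns.
        cells = [escape_cell(c) for c in row] + [""] * (width - len(row))
        lines.append("| " + " | ".join(cells) + " |")
        if index == 0:
            # Separator line that Markdown requires after the header row.
            lines.append("| " + " | ".join(["---"] * width) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(rows_to_markdown([["Name", "Note"], ["a|b", "line1\nline2"]]))
```

Working on plain string lists keeps the renderer independent of python-docx, so it can be tried out without a .docx file on hand.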

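A few details to watch out for

Match the style name exactly: the code assumes Word's built-in "Heading 1" style. If the section titles in your document use a custom style (say, one named "一级标题"), change the 'Heading 1' string that para.style.name is compared against to the actual style name.
Matching the title text: the code uses a substring test, so "Section 2" also matches a title such as "Section 20". If your Section 2 title carries a number (for example "2. Section 2") or extra spaces, adjust the matching logic: use para.text.strip() == "Section 2" for an exact match, or target_section.lower() in para.text.lower() to ignore case.
Next steps: each extracted table is a python-docx Table instance, so you can keep working with it through the API, for example to modify cell contents or export the data to Excel.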
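
The style and title-matching notes above can be sketched as one small helper. This is only one possible approach under stated assumptions: the set of heading style names is a guess to adapt to your document, the numbering pattern covers only simple prefixes such as "2.", "2.1" or "2)", and the names HEADING_STYLES, normalize_title and is_target_heading are hypothetical, not part of python-docx.

```python
import re

# Assumed style names that mark a top-level section; adjust to your document.
HEADING_STYLES = {"Heading 1", "一级标题"}


def normalize_title(text: str) -> str:
    """Drop a leading number like '2.' or '2.1)', collapse spaces, ignore case."""
    text = re.sub(r"^\s*\d+(\.\d+)*[.)]?\s+", "", text)
    return " ".join(text.split()).casefold()


def is_target_heading(style_name: str, text: str, target: str) -> bool:
    """True if the paragraph is a top-level heading whose title equals target."""
    if style_name not in HEADING_STYLES:
        return False
    # Exact comparison after normalization, so "Section 20" does not
    # match a target of "Section 2" the way a substring test would.
    return normalize_title(text) == normalize_title(target)


if __name__ == "__main__":
    print(is_target_heading("Heading 1", "2. Section 2", "Section 2"))
    print(is_target_heading("Heading 1", "Section 20", "Section 2"))
```

Inside the extraction function, a check like is_target_heading(para.style.name, para.text, target_section) could take the place of the style comparison and the substring test. The next-heading stop condition would still need its own style check.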

Content sourced from Stack Exchange.
