如何使用Python将多个Excel文件及指定工作表的数据合并为单个表格或独立Excel文件？求最佳实践及建议

阿华AIGC实验室

2026-4-27

Python Excel Data Consolidation: Best Practices for Your Two Needs

Great question! Combining Excel data with Python is a task we all run into regularly, and there are reliable, efficient ways to handle both of your requirements. Let’s dive into step-by-step solutions, plus some pro tips to avoid common pitfalls.

1. Merge Multiple Excel Files into a Single Table

The go-to tool here is pandas—it’s designed for exactly this kind of tabular data manipulation. Here’s a robust workflow:

Step-by-Step Implementation

First, install the necessary libraries if you haven’t already:
```
pip install pandas openpyxl
```

Use this script to combine all your files into one table:

import pandas as pd
import glob

# Replace this with the path to your Excel files (supports wildcards like *.xlsx)
target_files = glob.glob("./your-excel-files/*.xlsx")

# Store each file's data in a list
data_frames = []
for file in target_files:
    # Read the first sheet by default; specify sheet_name="YourSheet" if needed
    df = pd.read_excel(file)
    # Optional: Add a column to track where each row came from (super helpful for debugging)
    df["source_file"] = file.split("/")[-1]  # Keep only the filename, not full path
    data_frames.append(df)

# Combine all DataFrames into one
# Use join="inner" if you only want columns present in ALL files; "outer" keeps all columns
combined_data = pd.concat(data_frames, ignore_index=True, join="outer")

# Save the result to a new Excel file
combined_data.to_excel("./combined_single_table.xlsx", index=False)

Key Notes

Handle mismatched columns: If your files have different column sets, join="outer" will retain all columns and fill missing values with NaN. Use join="inner" to only keep columns common to every file.
Large files? For files too big to fit in memory, use pd.read_excel(..., chunksize=10000) to process in batches and append incrementally to the output file.

Error resilience: Wrap the file reading in a try-except block to skip problematic files instead of crashing the whole script:

for file in target_files:
    try:
        df = pd.read_excel(file)
        df["source_file"] = file.split("/")[-1]
        data_frames.append(df)
    except Exception as e:
        print(f"Skipping {file}: {str(e)}")

2. Consolidate Specific Sheets into One Excel File (With Multiple Sheets)

If you need to keep each source sheet as a separate tab in the output file, pandas paired with ExcelWriter is still your best bet. This lets you write multiple DataFrames to different sheets in the same file.

Step-by-Step Implementation

Use this script to map files to their target sheets and consolidate:

import pandas as pd

# Define a dictionary mapping your Excel files to the specific sheets you want
# Format: {"path/to/file.xlsx": "SheetName"}
file_sheet_mapping = {
    "./data/sales_2023.xlsx": "Q4_Sales",
    "./data/inventory.xlsx": "Current_Stock",
    "./data/expenses.xlsx": "Office_Costs"
}

# Create an ExcelWriter object to handle multiple sheets
with pd.ExcelWriter("./consolidated_multisheet.xlsx", engine="openpyxl") as writer:
    for file_path, sheet_name in file_sheet_mapping.items():
        # Read the specified sheet
        df = pd.read_excel(file_path, sheet_name=sheet_name)
        # Customize the output sheet name (e.g., include the original filename for clarity)
        output_sheet_name = f"{file_path.split('/')[-1].replace('.xlsx', '')}_{sheet_name}"
        # Trim sheet names to 31 characters (Excel's limit)
        output_sheet_name = output_sheet_name[:31]
        # Write the data to the new sheet
        df.to_excel(writer, sheet_name=output_sheet_name, index=False)

Key Notes

Engine choice: Use engine="openpyxl" for .xlsx files (the modern format). For older .xls files, use engine="xlrd" (note: xlrd no longer supports .xlsx in recent versions).

Append to existing files: If you need to add sheets to an existing Excel file, load the workbook first and pass it to ExcelWriter:

from openpyxl import load_workbook

existing_wb = load_workbook("./existing_file.xlsx")
with pd.ExcelWriter("./existing_file.xlsx", engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
    writer.book = existing_wb
    # Add your sheet-writing code here

Extra Pro Tips

Preserve formatting: If you need to keep cell styles, formulas, or macros, use libraries like openpyxl (for reading/writing .xlsx) or xlwings (for deeper Excel integration, including macros). These are more verbose but give you fine-grained control.
Data validation: Before merging, check for consistent data types (e.g., ensure date columns are parsed correctly with pd.read_excel(..., parse_dates=["DateColumn"])) to avoid messy data.
Automate the process: Wrap your script in a function or use a tool like schedule to run it automatically at set intervals if you need to refresh the consolidated data regularly.

内容的提问来源于stack exchange，提问作者Jose Fernando Garzon Rodriguez