如何用Pandas读取含嵌套列的Excel文件并解决列名异常问题

阿华AIGC实验室

2026-5-7

解决Pandas读取Excel列名显示Unnamed的问题及转JSON方法

Hey there, let’s figure out why you’re getting those annoying Unnamed: column names and fix it, plus walk through how to convert your cleaned data to JSON afterward.

一、Fixing the Unnamed Column Issue

The most common reason you’re seeing Unnamed: columns is that your Excel’s header isn’t a single, clean row—maybe it’s split across multiple rows, uses merged cells, or skiprows=6 isn’t targeting the right starting point for your actual column names. Here are a few ways to fix this:

1. First, map out your Excel’s structure

Before diving into fixing, let’s see exactly what’s in those top rows. Skip setting headers entirely to inspect the raw data:

import pandas as pd
# Read the entire file without setting headers
raw_df = pd.read_excel('filename.xlsx')
# Print the first 10 rows to spot where your actual column names live
print(raw_df.head(10))

Once you find which rows make up your header (e.g., rows 7 and 8 if you skipped 6, which are index 6 and 7 in zero-based numbering), use the header parameter to specify that range:

# Use a list to tell Pandas which rows form the header
df = pd.read_excel('filename.xlsx', header=[6, 7])
# If you end up with multi-level headers, you can merge them into single names
df.columns = ['_'.join(str(col).strip() for col in col_tuple) for col_tuple in df.columns]
# Clean up any leftover "Unnamed" parts
df.columns = [col.replace('_Unnamed', '') for col in df.columns]

2. Manually define column names

If your header is super messy (like scattered across non-consecutive rows), skip past all header rows and set the column names yourself:

# Skip all header rows (adjust skiprows to start at your first data row)
df = pd.read_excel('filename.xlsx', skiprows=7)
# Create a list of your actual column names, in the same order as the Excel columns
custom_cols = ['Customer ID', 'Order Date', 'Product Name', ...]  # Replace with your real names
df.columns = custom_cols

3. Handle merged cells with openpyxl

Merged cells in Excel often throw Pandas off. Use the openpyxl engine (install it first with pip install openpyxl) to read the file, then fill in the missing column names by forward-filling:

df = pd.read_excel('filename.xlsx', skiprows=6, engine='openpyxl')
# Fill Unnamed columns with the last non-empty column name to the left
df.columns = df.columns.fillna(method='ffill')

二、Converting Your Data to JSON

Once you’ve got a clean DataFrame with all the right column names, converting to JSON is straightforward. Here are the most useful formats:

1. Column-oriented JSON (default)

This format groups data by column names, great if you want to keep the structure close to your DataFrame:

# indent=4 makes the JSON human-readable
df.to_json('output_columns.json', orient='columns', indent=4)

2. Record-oriented JSON

Each row becomes a separate JSON object—perfect for APIs or datasets where you want individual entries:

df.to_json('output_records.json', orient='records', indent=4)

3. Index-included JSON

If you need to keep the DataFrame’s index as part of the JSON output:

df.to_json('output_with_index.json', orient='index', indent=4)

内容的提问来源于stack exchange，提问作者Hamad

火山引擎最新活动

方舟 Coding Plan

HOT

模型自由，工具不限，免费解锁 ArkClaw，7*24 小时在线的专属智能伙伴

查看详情

一键部署 OpenClaw

分钟级部署，云服务器包月低至￥9.9，与 CodingPlan 组合购买仅需19.8元

查看详情

Seedance2.0 体验中心上线

注册即享免费500万Tokens，抢先领略新一代AI视频技术跃迁

查看详情

新用户特惠专场

大模型19元起，Al应用9.9元畅享，新人首购爆款尽享优惠

查看详情

ArkClaw 专属智能伙伴