You need to enable JavaScript to run this app.
最新活动
大模型
产品
解决方案
定价
生态与合作
支持与服务
开发者
了解我们

如何用Pandas读取含嵌套列的Excel文件并解决列名异常问题

解决Pandas读取Excel列名显示Unnamed的问题及转JSON方法

Hey there, let’s figure out why you’re getting those annoying Unnamed: column names and fix it, plus walk through how to convert your cleaned data to JSON afterward.

一、Fixing the Unnamed Column Issue

The most common reason you’re seeing Unnamed: columns is that your Excel’s header isn’t a single, clean row—maybe it’s split across multiple rows, uses merged cells, or skiprows=6 isn’t targeting the right starting point for your actual column names. Here are a few ways to fix this:

1. First, map out your Excel’s structure

Before diving into fixing, let’s see exactly what’s in those top rows. Skip setting headers entirely to inspect the raw data:

import pandas as pd
# Read the entire file without setting headers
raw_df = pd.read_excel('filename.xlsx')
# Print the first 10 rows to spot where your actual column names live
print(raw_df.head(10))

Once you find which rows make up your header (e.g., rows 7 and 8 if you skipped 6, which are index 6 and 7 in zero-based numbering), use the header parameter to specify that range:

# Use a list to tell Pandas which rows form the header
df = pd.read_excel('filename.xlsx', header=[6, 7])
# If you end up with multi-level headers, you can merge them into single names
df.columns = ['_'.join(str(col).strip() for col in col_tuple) for col_tuple in df.columns]
# Clean up any leftover "Unnamed" parts
df.columns = [col.replace('_Unnamed', '') for col in df.columns]

2. Manually define column names

If your header is super messy (like scattered across non-consecutive rows), skip past all header rows and set the column names yourself:

# Skip all header rows (adjust skiprows to start at your first data row)
df = pd.read_excel('filename.xlsx', skiprows=7)
# Create a list of your actual column names, in the same order as the Excel columns
custom_cols = ['Customer ID', 'Order Date', 'Product Name', ...]  # Replace with your real names
df.columns = custom_cols

3. Handle merged cells with openpyxl

Merged cells in Excel often throw Pandas off. Use the openpyxl engine (install it first with pip install openpyxl) to read the file, then fill in the missing column names by forward-filling:

df = pd.read_excel('filename.xlsx', skiprows=6, engine='openpyxl')
# Fill Unnamed columns with the last non-empty column name to the left
df.columns = df.columns.fillna(method='ffill')

二、Converting Your Data to JSON

Once you’ve got a clean DataFrame with all the right column names, converting to JSON is straightforward. Here are the most useful formats:

1. Column-oriented JSON (default)

This format groups data by column names, great if you want to keep the structure close to your DataFrame:

# indent=4 makes the JSON human-readable
df.to_json('output_columns.json', orient='columns', indent=4)

2. Record-oriented JSON

Each row becomes a separate JSON object—perfect for APIs or datasets where you want individual entries:

df.to_json('output_records.json', orient='records', indent=4)

3. Index-included JSON

If you need to keep the DataFrame’s index as part of the JSON output:

df.to_json('output_with_index.json', orient='index', indent=4)

内容的提问来源于stack exchange,提问作者Hamad

火山引擎 最新活动