You need to enable JavaScript to run this app.
最新活动
大模型
产品
解决方案
定价
生态与合作
支持与服务
开发者
了解我们

使用Pandas合并Excel文件时0和1被转为True/False的问题求助

Fixing 0/1 to Boolean Conversion When Merging Excel Sheets with Pandas

Hey there! It’s super frustrating when your numeric 0s and 1s get turned into True/False during the merge process—let’s get this sorted out.

Why This Happens

Pandas has an automatic type inference system that sometimes mistakes columns full of 0s and 1s for boolean values, especially if your Excel sheets have cells formatted in a way that hints at boolean data (even if you see 0/1 displayed). Your hunch about pd.ExcelFile is on point—this is where the type detection happens when parsing sheets.

Solutions to Fix the Conversion

Here are a few targeted fixes you can apply directly in your code:

1. Force Specific Data Types When Reading Sheets

Tell pandas exactly what data type to use for columns that have 0s/1s, so it doesn’t auto-convert to booleans.

If you know the exact column names (e.g., a column named Status), specify its dtype as int when parsing:

import pandas as pd
import glob

xlsx_files = glob.glob("*.xlsx")
combined_df = pd.DataFrame()

for workbook in xlsx_files:
    excel_file = pd.ExcelFile(workbook)
    for sheet_name in excel_file.sheet_names:
        # Specify dtype for the problematic column(s)
        df = excel_file.parse(sheet_name, dtype={"Status": int})
        combined_df = pd.concat([combined_df, df], ignore_index=True)

2. Preserve All Raw Types Temporarily

If you’re not sure which columns have the issue, use dtype=object to keep all data in its raw format during reading. You can convert other columns to their correct types later:

for workbook in xlsx_files:
    excel_file = pd.ExcelFile(workbook)
    for sheet_name in excel_file.sheet_names:
        # Keep all columns as object type to avoid auto-conversion
        df = excel_file.parse(sheet_name, dtype=object)
        # Convert the 0/1 column back to int if needed
        df["Status"] = df["Status"].astype(int)
        combined_df = pd.concat([combined_df, df], ignore_index=True)

3. Use Converters for Flexible Handling

If some cells have string-formatted 0s/1s (e.g., "0" instead of numeric 0), use a converter function to standardize them to integers:

def convert_flag(value):
    # Handle both numeric and string 0/1 values
    return int(value) if str(value) in ["0", "1", 0, 1] else value

for workbook in xlsx_files:
    excel_file = pd.ExcelFile(workbook)
    for sheet_name in excel_file.sheet_names:
        df = excel_file.parse(sheet_name, converters={"Status": convert_flag})
        combined_df = pd.concat([combined_df, df], ignore_index=True)

4. Check Excel Cell Formatting (Quick Pre-Fix)

Double-check your Excel files—sometimes cells with 0/1 are actually formatted as boolean values (Excel just displays them as 0/1). To fix this:

  • Select the column in Excel
  • Go to Home > Number Format and choose "Number" instead of "Boolean"

Post-Merge Fix (If You Already Have Boolean Columns)

If you’ve already merged the data and need to convert True/False back to 1/0, run this on the affected column:

combined_df["Status"] = combined_df["Status"].astype(int)

内容的提问来源于stack exchange,提问作者Edward Mordechay

火山引擎 最新活动