使用Pandas合并Excel文件时0和1被转为True/False的问题求助
Hey there! It’s super frustrating when your numeric 0s and 1s get turned into True/False during the merge process—let’s get this sorted out.
Why This Happens
Pandas has an automatic type inference system that sometimes mistakes columns full of 0s and 1s for boolean values, especially if your Excel sheets have cells formatted in a way that hints at boolean data (even if you see 0/1 displayed). Your hunch about pd.ExcelFile is on point—this is where the type detection happens when parsing sheets.
Solutions to Fix the Conversion
Here are a few targeted fixes you can apply directly in your code:
1. Force Specific Data Types When Reading Sheets
Tell pandas exactly what data type to use for columns that have 0s/1s, so it doesn’t auto-convert to booleans.
If you know the exact column names (e.g., a column named Status), specify its dtype as int when parsing:
import pandas as pd import glob xlsx_files = glob.glob("*.xlsx") combined_df = pd.DataFrame() for workbook in xlsx_files: excel_file = pd.ExcelFile(workbook) for sheet_name in excel_file.sheet_names: # Specify dtype for the problematic column(s) df = excel_file.parse(sheet_name, dtype={"Status": int}) combined_df = pd.concat([combined_df, df], ignore_index=True)
2. Preserve All Raw Types Temporarily
If you’re not sure which columns have the issue, use dtype=object to keep all data in its raw format during reading. You can convert other columns to their correct types later:
for workbook in xlsx_files: excel_file = pd.ExcelFile(workbook) for sheet_name in excel_file.sheet_names: # Keep all columns as object type to avoid auto-conversion df = excel_file.parse(sheet_name, dtype=object) # Convert the 0/1 column back to int if needed df["Status"] = df["Status"].astype(int) combined_df = pd.concat([combined_df, df], ignore_index=True)
3. Use Converters for Flexible Handling
If some cells have string-formatted 0s/1s (e.g., "0" instead of numeric 0), use a converter function to standardize them to integers:
def convert_flag(value): # Handle both numeric and string 0/1 values return int(value) if str(value) in ["0", "1", 0, 1] else value for workbook in xlsx_files: excel_file = pd.ExcelFile(workbook) for sheet_name in excel_file.sheet_names: df = excel_file.parse(sheet_name, converters={"Status": convert_flag}) combined_df = pd.concat([combined_df, df], ignore_index=True)
4. Check Excel Cell Formatting (Quick Pre-Fix)
Double-check your Excel files—sometimes cells with 0/1 are actually formatted as boolean values (Excel just displays them as 0/1). To fix this:
- Select the column in Excel
- Go to Home > Number Format and choose "Number" instead of "Boolean"
Post-Merge Fix (If You Already Have Boolean Columns)
If you’ve already merged the data and need to convert True/False back to 1/0, run this on the affected column:
combined_df["Status"] = combined_df["Status"].astype(int)
内容的提问来源于stack exchange,提问作者Edward Mordechay




