
Technical help request: day/month reversal errors in date fields when reading Excel files with Pandas

Fixing Pandas Excel Date Parsing Issue: Day/Month Swapping After Identical Day-Month Values

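Oh man, I’ve run into this exact frustrating bug before! The core issue is that the dates are ambiguous: when they are stored in the Excel file as text (like 05/10/2021) rather than as real date cells, Pandas has to guess whether each one is day-first or month-first. The swap often seems to start right after a date whose day and month are identical (like 10/10/2021), but that value reads the same either way, so it is probably just where the problem becomes visible. The likelier cause is that values with a day of 12 or less get read month-first, while values with a day above 12 can only be read day-first.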

Here are the most reliable fixes to resolve this:

1. Specify the Exact Date Format Directly

Tell Pandas exactly what format your dates are in, so it doesn’t have to guess. For example, if your original dates follow the DD/MM/YYYY HH:MM:SS format, use this code:

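import pandas as pd

# pandas >= 2.0: pass the format via date_format (date_parser is deprecated there)
df = pd.read_excel(
    "report_file.xls",
    parse_dates=['operation_date'],
    date_format='%d/%m/%Y %H:%M:%S',
)
# pandas < 2.0: replace date_format with
# date_parser=lambda x: pd.to_datetime(x, format='%d/%m/%Y %H:%M:%S')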

If you’re unsure of the exact format, first read the column as a string and check a few sample values to confirm the pattern, then adjust the format parameter accordingly.
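One way to run that check, as a rough sketch: try a few candidate formats against some sample values and count how many parse under each. The sample strings and the `count_matches` helper are made up for illustration; in practice the values would come from your own column read as strings.

```python
import pandas as pd

# Candidate formats to test; adjust this list to whatever you suspect.
CANDIDATE_FORMATS = [
    '%d/%m/%Y %H:%M:%S',
    '%m/%d/%Y %H:%M:%S',
    '%Y-%m-%d %H:%M:%S',
]

def count_matches(values: pd.Series, formats=CANDIDATE_FORMATS) -> dict:
    """Return, for each format, how many values parse successfully."""
    return {
        fmt: int(pd.to_datetime(values, format=fmt, errors='coerce').notna().sum())
        for fmt in formats
    }

# Hypothetical sample values standing in for the operation_date column.
sample = pd.Series([
    "13/10/2021 08:15:00",
    "10/10/2021 09:30:00",
    "09/10/2021 10:45:00",
])

match_counts = count_matches(sample)
print(match_counts)
```

A format that parses every value is the best candidate. Here the month-first format fails on the 13/10 row, which is the hint that the data is day-first.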

2. Read as String First, Then Convert

If the first method doesn’t work, bypass the automatic date parsing entirely by reading the column as a string, then convert it to datetime explicitly:

import pandas as pd

# Read the column as string to avoid initial misparsing
df = pd.read_excel("report_file.xls", dtype={'operation_date': str})

# Convert to datetime with the correct format
df['operation_date'] = pd.to_datetime(df['operation_date'], format='%d/%m/%Y %H:%M:%S')

This approach gives you full control over the conversion logic, eliminating any guesswork from the Excel parser.
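If some rows might not follow the expected format, it can help to see which ones fail instead of having the whole conversion raise an error. Here is a minimal sketch of that idea, assuming the column has already been read as strings. The `to_datetime_with_report` helper and the sample values are hypothetical.

```python
import pandas as pd

def to_datetime_with_report(raw: pd.Series, fmt: str):
    """Convert strings to datetimes and also return the raw values that failed."""
    parsed = pd.to_datetime(raw, format=fmt, errors='coerce')
    # Rows that were present in the input but came back as NaT did not match fmt.
    failed = raw[parsed.isna() & raw.notna()]
    return parsed, failed

# Hypothetical column contents; the last value uses a different layout on purpose.
raw = pd.Series([
    "13/10/2021 08:15:00",
    "10/10/2021 09:30:00",
    "2021-10-09 10:45:00",
])

parsed, failed = to_datetime_with_report(raw, '%d/%m/%Y %H:%M:%S')
print(parsed)
print(failed)
```

Rows listed in `failed` can then be fixed by hand or converted with a second format, rather than being silently misread.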

3. Standardize Cell Formats in Excel First

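Sometimes the problem originates in the Excel file itself: some date cells might have inconsistent formatting (e.g., some use MM/DD, others use DD/MM), or some might be stored as text while others are real dates. Open your Excel file, select the operation_date column, set a uniform date format (like yyyy-mm-dd hh:mm:ss), save the file, then try reading it with Pandas again. Keep in mind that a display format only affects cells that hold real date values; cells that hold text need to be converted to dates first (for example with Excel's Text to Columns feature). Once the column is consistent, parsing becomes much more reliable.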
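To find out whether the column mixes text cells with real date cells before touching the file, you could count the Python types that come back when the column is read without any date parsing. This is only a sketch: the `summarize_cell_types` helper is hypothetical, and the sample Series stands in for a column that (by assumption) was read with something like `pd.read_excel("report_file.xls", dtype=object)`.

```python
from datetime import datetime

import pandas as pd

def summarize_cell_types(column: pd.Series) -> pd.Series:
    """Count how many values of each Python type the column holds."""
    return column.map(lambda value: type(value).__name__).value_counts()

# Stand-in data: two text cells and one real date cell (an assumption for illustration).
sample = pd.Series(
    ["13/10/2021 08:15:00", "10/10/2021 09:30:00", datetime(2021, 10, 9, 10, 45)],
    dtype=object,
)

type_counts = summarize_cell_types(sample)
print(type_counts)
```

More than one type in the result suggests the column is stored inconsistently, and the text cells are the ones to fix.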

Why This Happens

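Pandas relies on third-party libraries (such as xlrd for .xls files and openpyxl for .xlsx files) to read Excel files. Cells stored as real Excel dates arrive as datetime values and are not ambiguous; the swapping typically shows up when dates are stored as text and Pandas has to infer the format. Without an explicit format, older Pandas versions parse each string independently and prefer month-first, so 05/10/2021 becomes 10 May, while 13/10/2021 has no valid month-first reading and falls back to day-first. A value like 10/10 reads the same either way, so it is likely just the point where the inconsistency becomes noticeable rather than its cause. By specifying the format explicitly, you override this automatic inference.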
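The ambiguity is easy to reproduce without any Excel file. The sketch below parses made-up date strings under both explicit formats to show which values can be read two ways and which cannot.

```python
import pandas as pd

ambiguous = "05/10/2021"    # valid as 5 October or as 10 May
unambiguous = "13/10/2021"  # only valid day-first; there is no month 13
identical = "10/10/2021"    # same result under either reading

day_first = pd.to_datetime(ambiguous, format='%d/%m/%Y')
month_first = pd.to_datetime(ambiguous, format='%m/%d/%Y')

same_both = (
    pd.to_datetime(identical, format='%d/%m/%Y')
    == pd.to_datetime(identical, format='%m/%d/%Y')
)

# A month-first reading of 13/10/2021 is rejected outright.
try:
    pd.to_datetime(unambiguous, format='%m/%d/%Y')
    month_first_ok = True
except ValueError:
    month_first_ok = False

print(day_first.date(), month_first.date(), same_both, month_first_ok)
```

Only the explicit format decides between the two readings of the first value, which is why passing it removes the guesswork.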

The question comes from Stack Exchange; asked by pavlinbl4.
