Pandas: Convert a Float Column with NaNs to an Integer Column (Preserving NaNs)

Hey there! Let's work through this problem you're facing: converting a float column with NaN values to integers without losing the NaNs. The errors you're hitting are totally expected, so let's break down why they happen and what we can do instead.

Why Your Initial Attempts Failed

  • Using astype('int'): The standard NumPy-backed integer dtypes have no way to represent a missing value, so pandas raises a ValueError as soon as the column contains a NaN.
  • Using astype(pd.Int64Dtype()): Pandas' nullable integer dtype (Int64, also available as the string alias 'Int64') can hold missing values, which it displays as <NA>. However, the cast requires every non-missing float to be exactly equal to an integer (5.0 works, 5.3 doesn't). The TypeError appears because pandas refuses to silently drop the fractional part, which would be accidental data loss.
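
To see both failures side by side, here is a minimal sketch. The helper name try_cast is made up for this illustration, and the data mirrors the val1 column used later in this answer:

```python
import numpy as np
import pandas as pd

s = pd.Series([5.3, np.nan, 2.0, 1.2, 5.0], name="val1")

def try_cast(series, dtype):
    """Return the converted Series, or the exception that astype raised."""
    try:
        return series.astype(dtype)
    except (ValueError, TypeError) as exc:
        return exc

# NaN cannot be stored in a NumPy integer column -> ValueError
plain = try_cast(s, "int")

# 5.3 and 1.2 are not whole numbers -> TypeError
nullable = try_cast(s, "Int64")

# Whole-number floats plus NaN convert cleanly to the nullable dtype
whole = try_cast(pd.Series([5.0, np.nan, 2.0]), "Int64")

print(type(plain).__name__, type(nullable).__name__, whole.dtype)
```

The third call succeeds because every non-missing value is already integer-like, which is the precondition the solutions below establish.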

Solutions to Fix This

Option 1: Clean Non-Integer Floats First (Round/Truncate)

If your dataset has non-integer floats (like 5.3 or 1.2), you'll need to convert those to whole numbers first before using the nullable integer type. Choose rounding or truncation based on your needs:

import pandas as pd
import numpy as np

# Your original data
df = pd.DataFrame({'val1': [5.3, np.nan, 2.0, 1.2, 5.0]})

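# Option A: Round to the nearest integer (exact .5 ties go to the even neighbor)
df['val1'] = df['val1'].round().astype(pd.Int64Dtype())

# Option B: Round down with np.floor. To drop the decimals instead, use
# np.trunc; the two differ for negative values (-1.7 -> -2 vs. -1).
# df['val1'] = np.floor(df['val1']).astype(pd.Int64Dtype())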

After running this, your DataFrame will look like this:

   val1
0     5
1  <NA>
2     2
3     1
4     5
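
The choice between rounding, flooring, and truncating only matters for some values, so it can help to compare them on a small sample first. This is a sketch with made-up values (including a negative number and exact .5 ties), not the data from the question:

```python
import numpy as np
import pandas as pd

s = pd.Series([2.5, 3.5, -1.7, np.nan])

# Series.round follows NumPy rounding: exact ties go to the even neighbor
rounded = s.round().astype("Int64")       # 2, 4, -2, <NA>

# np.floor always moves toward negative infinity
floored = np.floor(s).astype("Int64")     # 2, 3, -2, <NA>

# np.trunc simply drops the fractional part (moves toward zero)
truncated = np.trunc(s).astype("Int64")   # 2, 3, -1, <NA>

print(pd.DataFrame({"round": rounded, "floor": floored, "trunc": truncated}))
```

In all three cases the NaN survives as <NA>, because the float column is made integer-like before the cast to Int64.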

Option 2: Use pd.to_numeric for Safe Conversion

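pd.to_numeric does not produce a nullable integer column on its own. With downcast='integer' it only targets NumPy integer dtypes, so a column that contains NaN stays float64, and errors='ignore' is deprecated as of pandas 2.2. It is still useful when the column holds strings or mixed objects: parse with errors='coerce' (entries that can't be parsed become NaN), then round and cast to Int64 as in Option 1:

df['val1'] = pd.to_numeric(df['val1'], errors='coerce').round().astype('Int64')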
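
It is worth checking what pd.to_numeric actually returns before relying on it. The sketch below uses small made-up Series (the raw text column is hypothetical) to show that downcast='integer' has no effect once a NaN is present, and that parsing followed by a cast to Int64 does keep the missing value:

```python
import numpy as np
import pandas as pd

with_nan = pd.Series([5.0, np.nan, 2.0])
no_nan = pd.Series([5.0, 2.0])

# Downcasting targets NumPy integer dtypes, which cannot hold NaN,
# so the column with a missing value stays float64.
kept_float = pd.to_numeric(with_nan, downcast="integer")

# Without NaN the same call does shrink to a NumPy integer dtype.
shrunk = pd.to_numeric(no_nan, downcast="integer")

# Hypothetical raw text column: unparseable entries become NaN,
# then the integer-like floats are cast to the nullable dtype.
raw = pd.Series(["5", "n/a", "2.0"])
as_int = pd.to_numeric(raw, errors="coerce").astype("Int64")

print(kept_float.dtype, shrunk.dtype, as_int.dtype)
```

If the parsed values might contain fractions, add a rounding step before the cast, since astype('Int64') raises a TypeError on non-integer floats.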

Option 3: Auto-Infer Nullable Types with convert_dtypes

Pandas' convert_dtypes method infers the best nullable dtype for each column. A float column whose values are all whole numbers becomes Int64 (missing values become <NA>), while a column containing any non-integer float becomes the nullable Float64 rather than being rounded. That makes it a good fit for mixed datasets:

df = df.convert_dtypes()
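
As a quick illustration of that inference, here is a sketch with two made-up columns (the names whole and fractional are only for this example), assuming a pandas version recent enough to have the nullable Float64 dtype:

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({
    "whole": [5.0, np.nan, 2.0],        # every non-missing value is integer-like
    "fractional": [5.3, np.nan, 1.2],   # contains real fractions
})

converted = df_demo.convert_dtypes()

# Integer-like floats are promoted to Int64; fractional ones stay floating
# point, but in the nullable Float64 flavor.
print(converted.dtypes)
print(converted)
```

Note that convert_dtypes never rounds: a single fractional value is enough to keep the whole column as a float type, so combine it with Option 1 if you need integers regardless.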

The question in this post comes from Stack Exchange; it was asked by Christian O.
