In Python, how do I fill the NaN values of a target DataFrame from another DataFrame by matching on the No. column?

How to Fill NaN Values in df1's Column A with Matching Values from df2 Using the "No." Column

Ah, I get it: your original code didn't work because fillna(), when given another Series, aligns values on the row index, not on the No. column you care about. Here are three straightforward ways to fix that:
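
To see the index alignment in action, here is a small sketch on made-up frames (the values and the name naive are illustrative, not taken from the question). df2 lists the same No. values in a different row order, so filling by row position pulls in the wrong values:

```python
import numpy as np
import pandas as pd

# Hypothetical data: df2 holds the same No. values as df1, but in a different row order
df1 = pd.DataFrame({"No.": [123, 453, 146], "A": [1.0, np.nan, np.nan]})
df2 = pd.DataFrame({"No.": [146, 123, 453], "A": [2.0, 1.0, 3.0]})

# fillna with a Series aligns on the row index (0, 1, 2), not on No.
naive = df1["A"].fillna(df2["A"])
print(naive.tolist())  # [1.0, 1.0, 3.0], whereas matching on No. would give [1.0, 3.0, 2.0]
```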

Method 1: Use a Mapping Dictionary (Simplest)

First, create a dictionary that maps No. values from df2 to their corresponding A values. Then use this mapping to fill the NaNs in df1's A column:

# Create a No. -> A mapping from df2
a_mapping = df2.set_index('No.')['A'].to_dict()

# Fill NaNs in df1's A column using the mapping
df1['A'] = df1['A'].fillna(df1['No.'].map(a_mapping))

This works because df1['No.'].map(a_mapping) looks up each No. in df1 against the dictionary, returning the matching A value from df2. fillna() then replaces only the NaN entries in df1's A column with these matched values.
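
Here is the same idea end to end, as a minimal sketch. The input frames are assumptions, picked so that the result matches the table at the end of this answer; the intermediate name looked_up is only there to make the lookup step visible:

```python
import numpy as np
import pandas as pd

# Hypothetical inputs (not from the question)
df1 = pd.DataFrame({
    "No.": [123, 453, 146, 175, 643],
    "col1": [2, 4, 7, 2, 0],
    "col2": [5, 3, 9, 4, 0],
    "col3": [2, 1, 4, 3, 0],
    "A": [1.0, np.nan, 2.0, np.nan, 2.0],
})
df2 = pd.DataFrame({"No.": [175, 453, 999], "A": [3.0, 3.0, 7.0]})

a_mapping = df2.set_index("No.")["A"].to_dict()

# map() yields NaN for every No. that is absent from df2
looked_up = df1["No."].map(a_mapping)

# Only the NaN slots in A are replaced; existing values are left alone
df1["A"] = df1["A"].fillna(looked_up)
print(df1)
```

A No. that appears in neither place simply stays NaN, and a No. that exists only in df2 (999 here) is ignored.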

Method 2: Use Left Merge

If you prefer working with merges, perform a left join between df1 and df2 (keeping all rows from df1), then combine the A columns:

# Left join df1 with df2's No. and A columns (avoid duplicate column names)
merged = df1.merge(df2[['No.', 'A']], on='No.', how='left', suffixes=('', '_df2'))

# Fill NaNs in the original A column with values from df2
merged['A'] = merged['A'].fillna(merged['A_df2'])

# Drop the temporary column (it lives in merged, not in df1) and keep the result
df1 = merged.drop(columns=['A_df2'])

This ensures you only pull in the A values from df2 that match the No. values in df1, then merge them into your original dataframe.
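
One caveat with a merge: if df2 lists the same No. more than once, the left join duplicates the matching df1 rows. A minimal sketch, on made-up data, of guarding against that with merge's validate argument (the names df2_dupes, grew and duplicate_keys are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical data: No. 453 appears twice in the lookup frame
df1 = pd.DataFrame({"No.": [123, 453], "A": [1.0, np.nan]})
df2_dupes = pd.DataFrame({"No.": [453, 453], "A": [3.0, 4.0]})

# Without validation the left join silently returns more rows than df1 has
grew = df1.merge(df2_dupes, on="No.", how="left", suffixes=("", "_df2"))
print(len(df1), len(grew))

# validate="many_to_one" requires No. to be unique in the right-hand frame
try:
    df1.merge(df2_dupes, on="No.", how="left",
              suffixes=("", "_df2"), validate="many_to_one")
    duplicate_keys = False
except pd.errors.MergeError:
    duplicate_keys = True
print(duplicate_keys)
```

If the check fails, you have to decide which df2 row wins first, for example with drop_duplicates on No.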

Method 3: Use combine_first (Index-Based Matching)

If you set No. as the index for both dataframes, combine_first will automatically fill NaNs in df1 with matching values from df2:

# Set No. as the index for both dataframes
df1_indexed = df1.set_index('No.')
df2_indexed = df2.set_index('No.')

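# Fill NaNs in df1's A column from df2 (assumes No. is unique in each frame).
# combine_first returns the union of both indexes, so assigning the result back
# to the column realigns it to df1's own rows and row order
df1_indexed['A'] = df1_indexed['A'].combine_first(df2_indexed['A'])
df1_updated = df1_indexed.reset_index()

# If you want to overwrite the original df1:
df1 = df1_updated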

combine_first prioritizes values from df1, only replacing NaNs with the corresponding values from df2 (now aligned by the No. index).
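
One thing to watch for: the result of combine_first covers the union of both indexes, so No. values that exist only in df2 show up as extra rows and the row order can differ from df1. A small sketch on made-up Series (left, right and restored are illustrative names) that shows this and how reindexing gets back to df1's rows:

```python
import numpy as np
import pandas as pd

# Hypothetical data: No. 999 exists only on the right-hand side
left = pd.Series([1.0, np.nan, np.nan], index=pd.Index([123, 453, 146], name="No."))
right = pd.Series([3.0, 2.0, 9.0], index=pd.Index([453, 146, 999], name="No."))

combined = left.combine_first(right)
print(combined.index.tolist())  # includes 999, which left never had

# Reindexing on left's index drops the extra row and restores the original order
restored = combined.reindex(left.index)
print(restored.tolist())
```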

After running any of these methods, your df1 will look exactly like the desired output:

   No.  col1  col2  col3    A
0  123     2     5     2  1.0
1  453     4     3     1  3.0
2  146     7     9     4  2.0
3  175     2     4     3  3.0
4  643     0     0     0  2.0

The question comes from Stack Exchange; asked by K saman.
