使用Pandas实现Excel列匹配并插入数据的技术咨询
Got it, let's work through this problem. You want to populate the Result column in your first Excel file by matching the svc_no values with a second Excel file. Here's a straightforward way to do this with Pandas:
Step-by-Step Solution
1. Import Pandas and Load Your Files
First, we'll read both Excel files into Pandas DataFrames. Make sure to replace the file paths with your actual file names.
import pandas as pd # Load the main file (where you need to fill the Result column) df_main = pd.read_excel("your_main_file.xlsx") # Load the source file (contains the matching Result data for svc_no) df_source = pd.read_excel("your_source_file.xlsx")
2. Match Data and Fill the Result Column
You have two common approaches here—choose the one that fits your needs:
Option 1: Use merge (Best for Matching Multiple Columns)
This method is flexible if you ever need to pull more than just the Result column later. We'll do a left join to keep all rows from your main file, even if there's no matching svc_no in the source.
# Merge the two DataFrames on svc_no, keeping only the Result column from the source df_merged = pd.merge( df_main, df_source[["svc_no", "Result"]], # Adjust "Result" to your source column name if needed on="svc_no", how="left", suffixes=("", "_source") ) # Populate the Result column in the main DataFrame df_main["Result"] = df_merged["Result_source"].fillna("") # Use fillna("N/A") if you want a placeholder # Save the updated file df_main.to_excel("main_file_with_results.xlsx", index=False)
Option 2: Use map (Simpler for Single Column Matching)
If you only need to pull the Result column, this is a more concise approach. We'll create a dictionary mapping svc_no to its corresponding Result value.
# Create a mapping dictionary: svc_no -> Result result_mapping = df_source.set_index("svc_no")["Result"].to_dict() # Fill the Result column in the main DataFrame df_main["Result"] = df_main["svc_no"].map(result_mapping).fillna("") # Save the updated file df_main.to_excel("main_file_with_results.xlsx", index=False)
3. Important Notes to Avoid Issues
- Data Type Consistency: Ensure
svc_nois the same data type in both files (e.g., both strings or both integers). If not, convert them:df_main["svc_no"] = df_main["svc_no"].astype(str) df_source["svc_no"] = df_source["svc_no"].astype(str) - Duplicate
svc_no: If your source file has duplicatesvc_noentries, remove them first to avoid unexpected matches:df_source = df_source.drop_duplicates(subset="svc_no") - Missing Matches: The
fillna("")ensures anysvc_nowithout a match in the source will have an emptyResultcell. You can replace the empty string with something like"No Match"if preferred.
内容的提问来源于stack exchange,提问作者Ganzo Palumi




