How to add a TIME_TYPE category column to a Pandas DataFrame using multiple conditions
Hey there! Let's tackle this problem step by step. You've got a Pandas DataFrame with YEAR and MONTH columns, and need to add a TIME_TYPE column that tags rows as either "History" or "Forecast" based on your specific rules. Here are two clean, easy-to-implement solutions:
Solution 1: Using numpy.where (One-Liner Conditional Assignment)
This method is great if you prefer a concise, single-line approach for nested conditions. First, make sure you've imported both Pandas and NumPy:
    import pandas as pd
    import numpy as np
Let's start with a sample DataFrame that matches your structure:
    # Sample data matching your format
    df = pd.DataFrame({
        'YEAR': [2016, 2016, 2020, 2021, 2021, 2021],
        'MONTH': [1, 2, 5, 6, 7, 10]
    })
Now add the TIME_TYPE column using nested np.where to check your two conditions:
    df['TIME_TYPE'] = np.where(
        # First condition: YEAR ≤ 2021 AND MONTH < 7 → "History"
        (df['YEAR'] <= 2021) & (df['MONTH'] < 7),
        'History',
        # Second condition: YEAR ≥ 2021 AND MONTH ≥ 7 → "Forecast"
        np.where(
            (df['YEAR'] >= 2021) & (df['MONTH'] >= 7),
            'Forecast',
            None  # Handle edge cases that don't fit either rule
        )
    )
Solution 2: Using DataFrame.loc (Step-by-Step Assignment)
If you prefer more explicit, readable code, you can assign each label separately with df.loc and a boolean mask. Each assignment is still vectorized: it updates every matching row at once rather than looping over rows.
    # Initialize the TIME_TYPE column (optional; without it, rows that match
    # neither rule end up as NaN instead of None)
    df['TIME_TYPE'] = None

    # Assign "History" to rows meeting the first condition
    df.loc[(df['YEAR'] <= 2021) & (df['MONTH'] < 7), 'TIME_TYPE'] = 'History'

    # Assign "Forecast" to rows meeting the second condition
    df.loc[(df['YEAR'] >= 2021) & (df['MONTH'] >= 7), 'TIME_TYPE'] = 'Forecast'
Output Result
Either method will give you the exact DataFrame structure you want. Running print(df) will output:
       YEAR  MONTH TIME_TYPE
    0  2016      1   History
    1  2016      2   History
    2  2020      5   History
    3  2021      6   History
    4  2021      7  Forecast
    5  2021     10  Forecast
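If you want to convince yourself that the two approaches agree, a quick sketch like the one below builds the column both ways on the sample data and compares the results. The helper names add_with_where and add_with_loc are made up for this illustration; they are not part of Pandas or NumPy.

```python
import numpy as np
import pandas as pd


def add_with_where(df):
    """Hypothetical helper: label rows with nested np.where."""
    out = df.copy()
    history = (out['YEAR'] <= 2021) & (out['MONTH'] < 7)
    forecast = (out['YEAR'] >= 2021) & (out['MONTH'] >= 7)
    out['TIME_TYPE'] = np.where(
        history, 'History', np.where(forecast, 'Forecast', None)
    )
    return out


def add_with_loc(df):
    """Hypothetical helper: label rows with two df.loc assignments."""
    out = df.copy()
    out['TIME_TYPE'] = None
    out.loc[(out['YEAR'] <= 2021) & (out['MONTH'] < 7), 'TIME_TYPE'] = 'History'
    out.loc[(out['YEAR'] >= 2021) & (out['MONTH'] >= 7), 'TIME_TYPE'] = 'Forecast'
    return out


# Same sample data as in the answer; every row matches one of the two rules
df = pd.DataFrame({
    'YEAR': [2016, 2016, 2020, 2021, 2021, 2021],
    'MONTH': [1, 2, 5, 6, 7, 10],
})

a = add_with_where(df)
b = add_with_loc(df)

# Compare the labels element by element
same = a['TIME_TYPE'].tolist() == b['TIME_TYPE'].tolist()
print(same)
```

Both helpers work on a copy, so the original df is left without a TIME_TYPE column and you can rerun the comparison safely.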
Note
If you have rows that don't fit either condition (like YEAR=2022 and MONTH=5, or YEAR=2016 and MONTH=10), both methods leave TIME_TYPE as None. You can replace None with a value like "Unknown" if you need to handle these edge cases explicitly.
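One way to get an explicit fallback label is numpy.select, which takes a list of conditions, a matching list of values, and a default for rows that match none of them. This is a minimal sketch rather than part of the two solutions above; the "Unknown" label and the extra sample rows are assumptions chosen to trigger the fallback.

```python
import numpy as np
import pandas as pd

# Sample rows: the last two match neither rule
df = pd.DataFrame({
    'YEAR': [2016, 2021, 2022, 2016],
    'MONTH': [1, 7, 5, 10],
})

# Conditions are checked in order; the first one that is True wins
conditions = [
    (df['YEAR'] <= 2021) & (df['MONTH'] < 7),   # "History" rule
    (df['YEAR'] >= 2021) & (df['MONTH'] >= 7),  # "Forecast" rule
]
labels = ['History', 'Forecast']

# Rows matching no condition get the default label
df['TIME_TYPE'] = np.select(conditions, labels, default='Unknown')
print(df)
```

Compared with nested np.where, this keeps each rule on its own line, which tends to be easier to extend if more categories are added later.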
This question comes from Stack Exchange; it was asked by spartanboy.




