How to Add a TIME_TYPE Column to a Pandas DataFrame Based on Year/Month Conditions

Hey there! Let's tackle this problem step by step. You've got a Pandas DataFrame with YEAR and MONTH columns, and need to add a TIME_TYPE column that tags rows as either "History" or "Forecast" based on your specific rules. Here are two clean, easy-to-implement solutions:

Solution 1: Using numpy.where (One-Liner Conditional Assignment)

This method is great if you prefer a concise, single-line approach for nested conditions. First, make sure you've imported both Pandas and NumPy:

import pandas as pd
import numpy as np

Let's start with a sample DataFrame that matches your structure:

# Sample data matching your format
df = pd.DataFrame({
    'YEAR': [2016, 2016, 2020, 2021, 2021, 2021],
    'MONTH': [1, 2, 5, 6, 7, 10]
})

Now add the TIME_TYPE column using nested np.where to check your two conditions:

df['TIME_TYPE'] = np.where(
    # First condition: YEAR ≤2021 AND MONTH <7 → "History"
    (df['YEAR'] <= 2021) & (df['MONTH'] < 7),
    'History',
    # Second condition: YEAR ≥2021 AND MONTH ≥7 → "Forecast"
    np.where(
        (df['YEAR'] >= 2021) & (df['MONTH'] >= 7),
        'Forecast',
        None  # Handle edge cases that don't fit either rule
    )
)
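Nested np.where calls can get hard to read as the number of rules grows. As a rough sketch of an alternative (not part of the original answer), np.select takes a list of conditions and a matching list of labels. The 'Unknown' default and the df_select name are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Same sample data as above, under a separate (hypothetical) name
df_select = pd.DataFrame({
    'YEAR': [2016, 2016, 2020, 2021, 2021, 2021],
    'MONTH': [1, 2, 5, 6, 7, 10]
})

# Conditions are evaluated in order; the first one that matches wins
conditions = [
    (df_select['YEAR'] <= 2021) & (df_select['MONTH'] < 7),
    (df_select['YEAR'] >= 2021) & (df_select['MONTH'] >= 7),
]
labels = ['History', 'Forecast']

# 'Unknown' is an assumed placeholder for rows that match neither rule
df_select['TIME_TYPE'] = np.select(conditions, labels, default='Unknown')
```

A string default keeps every possible value the same type, which sidesteps mixing strings with a numeric fallback.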

Solution 2: Using DataFrame.loc (Step-by-Step Assignment)

If you prefer more explicit, readable code, use df.loc with a boolean mask to assign each label to all matching rows, one rule at a time:

# Initialize the TIME_TYPE column so unmatched rows hold None instead of NaN
df['TIME_TYPE'] = None

# Assign "History" to rows meeting the first condition
df.loc[(df['YEAR'] <= 2021) & (df['MONTH'] < 7), 'TIME_TYPE'] = 'History'

# Assign "Forecast" to rows meeting the second condition
df.loc[(df['YEAR'] >= 2021) & (df['MONTH'] >= 7), 'TIME_TYPE'] = 'Forecast'

Output Result

Both methods produce the same result for the sample data. Running print(df) outputs:

   YEAR  MONTH TIME_TYPE
0  2016      1    History
1  2016      2    History
2  2020      5    History
3  2021      6    History
4  2021      7   Forecast
5  2021     10   Forecast
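If you want to confirm that the two solutions agree on your own data, a quick check along these lines may help. This is a sketch only: the names history, forecast, via_where, df_loc, and same are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'YEAR': [2016, 2016, 2020, 2021, 2021, 2021],
    'MONTH': [1, 2, 5, 6, 7, 10]
})

# Build each rule once so both approaches use identical masks
history = (df['YEAR'] <= 2021) & (df['MONTH'] < 7)
forecast = (df['YEAR'] >= 2021) & (df['MONTH'] >= 7)

# Solution 1: nested np.where
via_where = np.where(history, 'History', np.where(forecast, 'Forecast', None))

# Solution 2: df.loc on a copy, so the original frame stays untouched
df_loc = df.copy()
df_loc['TIME_TYPE'] = None
df_loc.loc[history, 'TIME_TYPE'] = 'History'
df_loc.loc[forecast, 'TIME_TYPE'] = 'Forecast'

# Compare the labels element by element
same = via_where.tolist() == df_loc['TIME_TYPE'].tolist()
print(same)
```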

Note

Rows that match neither condition, such as YEAR=2022 with MONTH=5 or YEAR=2016 with MONTH=8, get TIME_TYPE set to None by both methods. If you need to handle these cases explicitly, replace None with a value such as "Unknown".
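One possible way to apply such a replacement is fillna, shown here on a small hypothetical frame (df_gaps and its rows are made up for illustration) in which the last two rows match neither rule:

```python
import pandas as pd

# Hypothetical data: rows 2 and 3 fall outside both conditions
df_gaps = pd.DataFrame({
    'YEAR': [2020, 2021, 2022, 2016],
    'MONTH': [5, 7, 5, 8]
})

df_gaps['TIME_TYPE'] = None
df_gaps.loc[(df_gaps['YEAR'] <= 2021) & (df_gaps['MONTH'] < 7), 'TIME_TYPE'] = 'History'
df_gaps.loc[(df_gaps['YEAR'] >= 2021) & (df_gaps['MONTH'] >= 7), 'TIME_TYPE'] = 'Forecast'

# Replace the remaining missing values with an explicit label
df_gaps['TIME_TYPE'] = df_gaps['TIME_TYPE'].fillna('Unknown')
```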

The question comes from Stack Exchange; it was asked by spartanboy.