Pandas DataFrame Conditional Assignment Error: 'str' and 'int' Cannot Be Compared

阿华AIGC实验室

2026-5-21

Fixing the TypeError When Adding a 'Class' Column in Pandas

Hey there! Let's break down what's going wrong and how to fix this issue step by step.

The Root Cause of Your Error

The TypeError: '>' not supported between instances of 'str' and 'int' appears because the numeric-looking columns in your DataFrame (discount, tax, total) are stored as strings (object dtype), not as actual numbers. Python 3 defines no ordering between a string like "46.49" and an integer like 20, so the comparison raises the error.
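To see the cause in isolation, here is a minimal sketch that reproduces the same error without pandas, using the values mentioned above. The variable names are illustrative only and do not come from the original question.

```python
# Comparing a str to an int with an ordering operator raises TypeError in Python 3.
value = "46.49"   # what a cell in a string-typed column actually holds
threshold = 20

try:
    value > threshold
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Converting the string to a number first makes the comparison valid.
is_above = float(value) > threshold
print(is_above)
```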

Step-by-Step Solution

1. First, Confirm Your Column Types

Let's verify the data types of your columns to be sure:

import pandas as pd

# Recreate your sample DataFrame (matching the string-type scenario that causes the error)
data = {
    'discount': ['3', '10', '46.49'],
    'tax': ['0', '3', '6'],
    'total': ['20', '106', '21'],
    'subtotal': ['13', '94', '20'],
    'productid': ['002', '003', '004']
}
df = pd.DataFrame(data)

# Check column data types
print(df.dtypes)

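You'll see that discount, tax, total, and subtotal are listed as object, the dtype pandas has traditionally used for columns of Python strings (newer pandas versions may report a dedicated string dtype instead). productid shows the same dtype, which is fine because it is an identifier rather than a quantity.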
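The dtype alone does not tell you exactly what each cell holds, so it can help to look at the Python types of the values themselves. The following is a small sketch on a cut-down copy of the sample data; the value_types name is just an illustration.

```python
import pandas as pd

df = pd.DataFrame({'discount': ['3', '10', '46.49'], 'total': ['20', '106', '21']})

# Collect the set of Python types actually stored in each column.
value_types = {col: set(df[col].map(type)) for col in df.columns}
print(value_types)
```

If a column mixes str with int or float here, that is a sign the data was only partly parsed when it was loaded.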

2. Convert Columns to Numeric Types

We'll use pd.to_numeric() to turn these columns into proper numeric values. The errors='coerce' argument will convert any unconvertible values to NaN (you can handle those later if needed):

# Convert relevant columns to numeric types
df['discount'] = pd.to_numeric(df['discount'], errors='coerce')
df['tax'] = pd.to_numeric(df['tax'], errors='coerce')
df['total'] = pd.to_numeric(df['total'], errors='coerce')
df['subtotal'] = pd.to_numeric(df['subtotal'], errors='coerce')

# Verify the conversion worked
print(df.dtypes)

Now your columns will show as float64 or int64, ready for numeric comparisons.
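If you are unsure whether every value converted cleanly, a quick check like the one below can help. It is a sketch on a made-up Series: the 'unknown' entry is a hypothetical bad value, not something from your data.

```python
import pandas as pd

raw = pd.Series(['3', '10', 'unknown', '46.49'])  # 'unknown' is a made-up bad value
converted = pd.to_numeric(raw, errors='coerce')

# Values that could not be converted show up as NaN; list the originals.
bad_rows = raw[converted.isna()]
print(bad_rows)

# NaN compares as False, so such rows can never satisfy a condition like > 20.
above_20 = (converted > 20).tolist()
print(above_20)
```

Because NaN silently fails every comparison, it is worth inspecting bad_rows before relying on the Class column.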

3. Add the 'Class' Column

You have two reliable options here: keep your original custom function (it works now that the types are fixed), or use vectorized operations, which are faster and more idiomatic in Pandas.

Option 1: Custom Function with apply()

Now that your columns are numeric, your original function will run without errors:

def assign_class(row):
    if row['discount'] > 20 and row['total'] > 100 and row['tax'] == 0:
        return 1
    else:
        return 0

df['Class'] = df.apply(assign_class, axis=1)

Option 2: Vectorized Operations (Recommended)

Pandas is built for vectorized calculations, which are typically much faster than calling a Python function on every row with apply(axis=1), especially on large datasets. Here's how to implement it:

# Combine all conditions with & (bitwise AND), then convert boolean results to integers (True=1, False=0)
df['Class'] = ((df['discount'] > 20) & (df['total'] > 100) & (df['tax'] == 0)).astype(int)
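A common stumbling block when moving from the row-wise function to the vectorized form is reusing Python's and keyword. The sketch below, on an already-numeric copy of the sample values, shows why & and the parentheses are needed; and_error and mask are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'discount': [3, 10, 46.49], 'total': [20, 106, 21], 'tax': [0, 3, 6]})

# `and` asks each Series for a single truth value, which pandas refuses to give.
try:
    (df['discount'] > 20) and (df['total'] > 100)
    and_error = None
except ValueError as exc:
    and_error = type(exc).__name__

# `&` combines the boolean masks element by element. The parentheses are
# required because `&` binds more tightly than `>` and `==`.
mask = (df['discount'] > 20) & (df['total'] > 100) & (df['tax'] == 0)
print(and_error, mask.tolist())
```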

Final Result

For your sample data, none of the rows meet all three conditions (discount > 20, total > 100, tax == 0), so the Class column will be 0 for all rows, exactly as expected.
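Since the sample data never produces a 1, it is hard to tell from it alone whether the logic works. The sketch below adds one hypothetical row (the last one, invented for this check) that meets all three conditions and confirms that both options give the same answer. The column-wise conversion via DataFrame.apply is just a more compact way to write the per-column calls from step 2.

```python
import pandas as pd

# Sample rows plus one hypothetical row (the last) that meets all three conditions.
df = pd.DataFrame({
    'discount': ['3', '10', '46.49', '25'],
    'tax': ['0', '3', '6', '0'],
    'total': ['20', '106', '21', '150'],
})
cols = ['discount', 'tax', 'total']
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

def assign_class(row):
    # Same rule as Option 1, written as a single expression.
    return 1 if row['discount'] > 20 and row['total'] > 100 and row['tax'] == 0 else 0

by_apply = df.apply(assign_class, axis=1)
by_mask = ((df['discount'] > 20) & (df['total'] > 100) & (df['tax'] == 0)).astype(int)

print(by_mask.tolist())
print(by_apply.tolist() == by_mask.tolist())
```

Only the added row is labelled 1, and the two approaches agree row for row.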

The question discussed here comes from Stack Exchange; it was asked by Abdul Rehman.