
How to classify comment sentiment and identify emotions with Python's TextBlob and Pandas

Sentiment and emotion analysis of comments with TextBlob + Pandas

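Hi, let me help you with this! We'll do it in two steps: first label the sentiment polarity, then handle the finer-grained emotion detection. One thing up front: TextBlob has no built-in detection for fine-grained emotions such as anger or sadness, so we'll add that capability with keyword matching.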

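Preparation

First install the dependencies, then download the corpora TextBlob needs. Both commands are run in a shell, not inside Python:

pip install pandas textblob
python -m textblob.download_corpora  # download the corpora; only needs to be run once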

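Step 1: add a sentiment column (positive/negative labels)

TextBlob's sentiment.polarity returns a float between -1.0 and 1.0, so we can split the sentiment at zero:

  • polarity > 0: label as Positive
  • polarity < 0: label as Negative
  • polarity = 0: label as Neutral

Here is the code: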

import pandas as pd
from textblob import TextBlob

# Replace the sample data below with your own Comments_Final DataFrame
sample_comments = [
    "Fit good fast shipping",
    "Product as described and functioned perfectly.",
    "this product doesn't fit my Remington rm1415 it is way to long and much larger chain..... looks like it would be a pain to return to Canada to sender",
    "Would have given it 5 stars but it is not a sealed battery",
    "Was not told I needed to sign to receive item missed delivery, made contact with carrier , then received item next day!",
    "Quick delivery. Part as expected"
]
df = pd.DataFrame({"Comments": sample_comments})  # replace with however you load your data, e.g. df = pd.read_csv("your_file_path")

# Define the sentiment labeling function
def get_sentiment(text):
    # Compute the polarity once; it is a float in [-1.0, 1.0]
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0:
        return "Positive"
    elif polarity < 0:
        return "Negative"
    else:
        return "Neutral"

# Add the sentiment column
df["sentiment"] = df["Comments"].apply(get_sentiment)
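One practical wrinkle: when the comments come from a CSV file, empty cells are loaded as NaN, and TextBlob raises a TypeError for anything that is not a string. A minimal sketch of a cleaning helper follows; the name clean_comment is hypothetical and not part of TextBlob or Pandas.

```python
import math

def clean_comment(value):
    """Return a plain string for any cell value; missing values become ''."""
    if value is None:
        return ""
    # pandas represents missing cells as float NaN
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value).strip()

print(repr(clean_comment(float("nan"))))
print(repr(clean_comment("  Quick delivery. Part as expected ")))
```

Assuming the column is named Comments as above, you could run df["Comments"] = df["Comments"].map(clean_comment) before applying get_sentiment. An empty string should then end up labeled Neutral.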

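Step 2: add an emotion column (detect anger, sadness, and similar emotions)

Because TextBlob cannot detect fine-grained emotions directly, we build an emotion keyword library and decide the emotion by checking which keywords appear in each comment. Matching is a literal, case-insensitive substring test, so a keyword only fires if it appears in the comment exactly as written. You can extend the library at any time to suit your own comments:

# Custom emotion keyword library (extend it based on your actual comments)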
emotion_keywords = {
    "Joy": ["good", "perfect", "quick", "fast", "expected", "great", "5 stars"],
    "Anger": ["doesn't fit", "pain", "not told", "missed delivery", "terrible"],
    "Disappointment": ["not sealed", "way too long", "larger", "would have given 5 stars but"],
    "Neutral": ["as described", "functioned"]
}

def get_emotion(text):
    text_lower = text.lower()
    matched_emotions = []
    # Walk through the keywords of each emotion and record the emotion on a match
    for emotion, keywords in emotion_keywords.items():
        for keyword in keywords:
            if keyword in text_lower:
                matched_emotions.append(emotion)
                break  # record each emotion at most once to avoid duplicates
    # If no emotion matched at all, fall back to Neutral
    return ", ".join(matched_emotions) if matched_emotions else "Neutral"

# Add the emotion column
df["emotion"] = df["Comments"].apply(get_emotion)
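Substring matching can produce false positives: "fast" also matches "breakfast" and "pain" also matches "painting". One possible refinement, sketched below, is to match keywords on word boundaries with regular expressions. The names emotion_patterns and get_emotion_whole_words are hypothetical, and the keyword library is copied from above.

```python
import re

emotion_keywords = {
    "Joy": ["good", "perfect", "quick", "fast", "expected", "great", "5 stars"],
    "Anger": ["doesn't fit", "pain", "not told", "missed delivery", "terrible"],
    "Disappointment": ["not sealed", "way too long", "larger", "would have given 5 stars but"],
    "Neutral": ["as described", "functioned"],
}

# One compiled pattern per emotion; \b keeps "pain" from matching "painting"
emotion_patterns = {
    emotion: re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b")
    for emotion, keywords in emotion_keywords.items()
}

def get_emotion_whole_words(text):
    text_lower = text.lower()
    matched = [e for e, p in emotion_patterns.items() if p.search(text_lower)]
    return ", ".join(matched) if matched else "Neutral"

print(get_emotion_whole_words("I had breakfast while painting"))
print(get_emotion_whole_words("Fit good fast shipping"))
```

The trade-off is that inflected forms stop matching: "perfect" no longer matches "perfectly", so you would need to list such variants explicitly if you go this route.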

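Viewing the final result

After running the code, print(df) shows the DataFrame with the new columns. The emotion column below is what get_emotion returns for the sample data with the keyword library above. The sentiment labels are illustrative: they depend on TextBlob's lexicon and may differ in your environment.

                                            Comments sentiment                emotion
0                             Fit good fast shipping  Positive                    Joy
1     Product as described and functioned perfectly.  Positive           Joy, Neutral
2  this product doesn't fit my Remington rm1415 i...  Negative  Anger, Disappointment
3  Would have given it 5 stars but it is not a se...  Negative                    Joy
4  Was not told I needed to sign to receive item ...  Negative                  Anger
5                   Quick delivery. Part as expected  Positive                    Joy

Row 3 shows the limits of literal matching: it is tagged Joy because "5 stars" appears in the comment, while the Disappointment keywords "not sealed" and "would have given 5 stars but" do not occur verbatim ("not a sealed", "would have given it 5 stars but"). Likewise, "way too long" never matches the misspelled "way to long" in row 2; that row is tagged Disappointment only through "larger".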

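Tips

  • Keep extending the keywords in emotion_keywords to match the actual style of your comments, including common misspellings and variants, to improve accuracy;
  • If you need more robust polarity scores on short, informal text, consider VADER, which is tuned for social-media and review-style writing. Note that VADER also reports polarity rather than emotions such as anger or sadness, so it helps the sentiment column, not the emotion column. If you must use TextBlob only, the keyword approach is the most direct one;
  • TextBlob's default polarity score comes from a sentiment lexicon with simple rules (the PatternAnalyzer), not from a trained model, so it can be off on complex sentences. Adjust the polarity thresholds based on your own testing.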
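The threshold tip can be sketched as a small function with a configurable neutral band. The name label_polarity and the default of 0.1 are assumptions for illustration, not values recommended by TextBlob.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity in [-1.0, 1.0] to a label, treating small values as Neutral."""
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"

for value in (0.5, 0.05, -0.3):
    print(value, label_polarity(value))
```

With TextBlob installed, this could replace the body of get_sentiment, for example label_polarity(TextBlob(text).sentiment.polarity). Setting threshold=0.0 reproduces the original zero split.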

The question comes from Stack Exchange; asked by shantam malgaonkar.
