Can Statsmodels automatically compute clustered standard errors for a fixed-effects regression model?

阿华AIGC实验室

2026-4-14

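Unfortunately, Statsmodels' MixedLM does not currently support computing clustered standard errors through an argument such as cov_type="cluster". That is the root cause of the error you hit: AttributeError: 'MixedLM' object has no attribute 'wexog'.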

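The cluster-robust covariance code was written for OLS-type models and relies on model attributes such as wexog (the whitened design matrix), which MixedLM does not define. MixedLM (the linear mixed-effects model) differs from OLS in both its internal structure and its variance estimation: it handles within-group correlation through likelihood-based estimation, and Statsmodels has not integrated the sandwich estimator (the core of clustered standard errors) into the mixed-model fitting routine.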
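For intuition, the sandwich estimator mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not Statsmodels' own implementation: the function name cluster_robust_cov and every variable name are hypothetical, and no small-sample correction is applied.

```python
import numpy as np

def cluster_robust_cov(X, resid, groups):
    """Cluster-robust (sandwich) covariance of OLS coefficients.

    Hypothetical helper for illustration; no small-sample correction.
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    groups = np.asarray(groups)
    bread = np.linalg.inv(X.T @ X)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        # Sum of x_i * u_i over the observations in cluster g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    return bread @ meat @ bread

# Synthetic data: 30 clusters of 20 rows, with a shock shared within each cluster
rng = np.random.default_rng(0)
n_groups, n_per = 30, 20
groups = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
group_shock = rng.normal(size=n_groups)[groups]
y = 1.0 + 2.0 * x + group_shock + rng.normal(size=x.size)

# Plain OLS fit, then the clustered covariance of its coefficients
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
cov = cluster_robust_cov(X, resid, groups)
cluster_se = np.sqrt(np.diag(cov))
print(beta, cluster_se)
```

The "bread" comes from the OLS fit and the "meat" sums the outer products of per-cluster score vectors. If every observation is treated as its own cluster, the same function reduces to the heteroskedasticity-robust (HC0) covariance.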

You don't have to hard-code everything by hand, though. There are two more convenient alternatives:

Option 1: Demean manually in Statsmodels, then run OLS with clustered standard errors

This is the approach you mentioned. It can be written more concisely, without a hand-rolled demeaning loop:

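import pandas as pd
import statsmodels.api as sm

# Expand the categorical regressor into dummies (one level dropped as the
# reference) so it can be demeaned together with the numeric variables.
# Assumes there are no missing values in the columns used here.
housing = pd.get_dummies(data["name_housing_type"], prefix="housing",
                         drop_first=True, dtype=float)
numeric = data[["target", "amt_income_total", "amt_credit"]].astype(float)
df = pd.concat([numeric, housing], axis=1)

# Within-group demeaning; transform keeps the original row order and index,
# so the cluster labels below stay aligned with the rows
df_demeaned = df - df.groupby(data["name_income_type"]).transform("mean")

# Fit OLS without a constant (demeaning removes it) and request clustered
# standard errors, passing the clusters as integer codes
groups = pd.factorize(data["name_income_type"])[0]
model = sm.OLS(df_demeaned["target"], df_demeaned.drop(columns="target"))
results = model.fit(cov_type="cluster", cov_kwds={"groups": groups})
print(results.summary())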

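This is the fixed-effects "within estimator": once every regressor (including the housing-type dummies) has been demeaned within groups, the coefficients match those of a fixed-effects fit, and you also get the clustered standard errors you need. The standard errors can differ slightly from those of a dedicated fixed-effects routine, because the degrees-of-freedom correction here does not count the absorbed group means.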
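The equivalence between the within estimator and a regression with one dummy per group can be checked numerically. The sketch below uses synthetic data and plain NumPy; the helper demean_by_group and all variable names are hypothetical and only serve the illustration.

```python
import numpy as np

# Synthetic data: 5 groups of 40 rows, with one regressor correlated with the group
rng = np.random.default_rng(1)
n_groups, n_per = 5, 40
groups = np.repeat(np.arange(n_groups), n_per)
x1 = rng.normal(size=groups.size) + groups
x2 = rng.normal(size=groups.size)
alpha = rng.normal(size=n_groups)[groups]  # group fixed effects
y = alpha + 0.5 * x1 - 1.5 * x2 + rng.normal(size=groups.size)

def demean_by_group(v, groups):
    """Subtract each group's mean from its members (groups are integer codes)."""
    sums = np.bincount(groups, weights=v)
    counts = np.bincount(groups)
    return v - (sums / counts)[groups]

# Within estimator: demean y and every regressor, then run OLS without a constant
Xw = np.column_stack([demean_by_group(x1, groups), demean_by_group(x2, groups)])
yw = demean_by_group(y, groups)
beta_within, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Dummy-variable regression: raw variables plus one dummy per group
dummies = (groups[:, None] == np.arange(n_groups)).astype(float)
Xd = np.column_stack([x1, x2, dummies])
beta_lsdv, *_ = np.linalg.lstsq(Xd, y, rcond=None)

# The slope coefficients from the two fits agree up to floating-point error
print(beta_within, beta_lsdv[:2])
```

The two slope vectors coincide only because both y and all regressors are demeaned; leaving any regressor undemeaned breaks the match.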

Option 2: Use linearmodels, a dedicated econometrics package, to do it automatically

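If you would rather not demean by hand, I recommend linearmodels. The package is designed for econometric regression workflows and natively supports fixed effects together with clustered standard errors: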

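from linearmodels import PanelOLS

# PanelOLS expects a two-level (entity, time) MultiIndex. The data here have no
# time dimension, so a within-group row counter ("obs", a placeholder name)
# stands in for it
data = data.copy()
data["obs"] = data.groupby("name_income_type").cumcount()
data = data.set_index(["name_income_type", "obs"])

# Entity fixed effects; the explicit "1 +" adds an intercept so that one
# name_housing_type level is dropped as the reference category
model = PanelOLS.from_formula(
    "target ~ 1 + amt_income_total + amt_credit + name_housing_type + EntityEffects",
    data=data
)
# cluster_entity=True clusters by entity (here, name_income_type)
results = model.fit(cov_type="clustered", cluster_entity=True)
print(results.summary)  # in linearmodels, summary is a property, not a method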

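With this approach there is no manual demeaning: the package handles both the fixed-effects fit and the clustered standard errors for you, and the result follows standard practice in the economics literature.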

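To sum up: Statsmodels' MixedLM currently has no built-in support for fixed effects with clustered standard errors, but either demeaning manually and running OLS, or using a dedicated econometrics package such as linearmodels, gets you there without hard-coding everything.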

Note: content sourced from Stack Exchange; question asked by Filipe P.