
Help troubleshooting: Statsmodels OLS raises 'exog contains inf or nans' even though the data has no missing or infinite values

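Hi all, I recently ran into a problem with Statsmodels OLS regression that I can't figure out. I have 4 different dependent variables that I want to plug into the model one at a time. The first 3 run through the regression without issue, but the last one immediately raises MissingDataError: exog contains inf or nans.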

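I have gone over the data several times and confirmed that it contains no NaN or infinite values:

  • x.isna().sum() counts the missing values, and the result is 0 everywhere
  • This snippet checks specifically for infinite and missing values: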
import numpy as np

if (x.isin([np.inf, -np.inf, np.nan]).any()):
    print("Series contains infinite values")
else:
    print("Series does not contain infinite values")

Running it prints Series does not contain infinite values, which suggests the data really does not contain any of these values.
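Note that the checks above are run on x, while the traceback shows the model being built from x_adj, so it may be worth repeating the check on the object that is actually passed to OLS. The following is a minimal sketch, not part of the original question: the helper name count_non_finite and the sample data are hypothetical. It counts NaN and infinite entries per column with np.isfinite, after coercing every column to float so that object-dtype columns are checked numerically as well.

```python
import numpy as np
import pandas as pd


def count_non_finite(data):
    """Count NaN/inf entries per column of a Series or DataFrame.

    Hypothetical helper for illustration only.
    """
    frame = pd.DataFrame(data)
    # Coerce each column to numeric; entries that cannot be parsed become NaN,
    # so they are counted as non-finite instead of being silently skipped.
    numeric = frame.apply(pd.to_numeric, errors="coerce").astype(float)
    bad = ~np.isfinite(numeric.to_numpy())
    return pd.Series(bad.sum(axis=0), index=numeric.columns)


# Made-up sample data: one inf in column "a", one NaN in column "b".
sample = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [1.0, 2.0, np.nan]})
print(count_non_finite(sample))
```

Calling count_non_finite(x_adj) instead of checking x would show whether the adjusted data differs from the data that was inspected.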

Here is the full traceback:

----> 5     result = sm.OLS(y,sm.add_constant(x_adj)).fit()
6     return result

~\anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
870     def __init__(self, endog, exog=None, missing='none', hasconst=None,
871                  **kwargs):
--> 872         super(OLS, self).__init__(endog, exog, missing=missing,
873                                   hasconst=hasconst, **kwargs)
874         if "weights" in self._init_keys:

~\anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in __init__(self, endog, exog, weights, missing, hasconst, **kwargs)
701         else:
702             weights = weights.squeeze()
--> 703         super(WLS, self).__init__(endog, exog, missing=missing,
704                                   weights=weights, hasconst=hasconst, **kwargs)
705         nobs = self.exog.shape[0]

~\anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in __init__(self, endog, exog, **kwargs)
188     """
189     def __init__(self, endog, exog, **kwargs):
--> 190         super(RegressionModel, self).__init__(endog, exog, **kwargs)
191         self._data_attr.extend(['pinv_wexog', 'weights'])
192

~\anaconda3\lib\site-packages\statsmodels\base\model.py in __init__(self, endog, exog, **kwargs)
235
236     def __init__(self, endog, exog=None, **kwargs):
--> 237         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
238         self.initialize()
239

~\anaconda3\lib\site-packages\statsmodels\base\model.py in __init__(self, endog, exog, **kwargs)
75         missing = kwargs.pop('missing', 'none')
76         hasconst = kwargs.pop('hasconst', None)
---> 77         self.data = self._handle_data(endog, exog, missing, hasconst,
78                                       **kwargs)
79         self.k_constant = self.data.k_constant

~\anaconda3\lib\site-packages\statsmodels\base\model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
99
100     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
--> 101         data = handle_data(endog, exog, missing, hasconst, **kwargs)
102         # kwargs arrays could have changed, easier to just attach here
103         for key in kwargs:

~\anaconda3\lib\site-packages\statsmodels\base\data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
670
671     klass = handle_data_class_factory(endog, exog)
--> 672     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
673                  **kwargs)

~\anaconda3\lib\site-packages\statsmodels\base\data.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
85         self.const_idx = None
86         self.k_constant = 0
---> 87         self._handle_constant(hasconst)
88         self._check_integrity()
89         self._cache = {}

~\anaconda3\lib\site-packages\statsmodels\base\data.py in _handle_constant(self, hasconst)
131             exog_max = np.max(self.exog, axis=0)
132             if not np.isfinite(exog_max).all():
--> 133                 raise MissingDataError('exog contains inf or nans')
134             exog_min = np.min(self.exog, axis=0)
135             const_idx = np.where(exog_max == exog_min)[0].squeeze()

MissingDataError: exog contains inf or nans
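The last frame of the traceback shows the condition that fails: the column-wise maximum of exog is not finite. As a rough sketch, that same condition can be reproduced outside the model to see which columns trip it. The function name columns_failing_finite_check and the sample frame are hypothetical, and the float conversion is an assumption about how the data should be interpreted, not a copy of what statsmodels does internally.

```python
import numpy as np
import pandas as pd


def columns_failing_finite_check(exog):
    """Return the columns whose maximum is not finite.

    Hypothetical helper that mirrors the np.max / np.isfinite test
    visible in the traceback's _handle_constant frame.
    """
    frame = pd.DataFrame(exog)
    values = frame.to_numpy(dtype=float)
    # np.max propagates NaN, and a +inf entry makes the maximum +inf.
    col_max = np.max(values, axis=0)
    finite = np.isfinite(col_max)
    return [col for col, ok in zip(frame.columns, finite) if not ok]


# Made-up design matrix with a constant column and one NaN in "x1".
design = pd.DataFrame(
    {"const": [1.0, 1.0, 1.0], "x1": [0.5, np.nan, 2.0], "x2": [1.0, 2.0, 3.0]}
)
print(columns_failing_finite_check(design))
```

In the setting of the question, the argument would be the same matrix that is handed to the model, that is, the result of sm.add_constant(x_adj).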

Does anyone know what else, besides NaN and infinite values in the data, could trigger this error? Any pointers would be much appreciated, thanks!

Note: content sourced from Stack Exchange; question asked by Kyle_Stockton
