
How to Calculate the True Rolling Average of a DataFrame with Pandas Rolling?

Calculating True Rolling Average for All Values in a Pandas DataFrame

Alright, let's break down how to compute the true rolling average across all values in your DataFrame using Pandas' rolling functionality. I'll use a concrete example to make this clear.

Step 1: Example DataFrame

First, let's define the sample DataFrame we're working with:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'a001': [1, np.nan, np.nan, np.nan, np.nan, 2, np.nan],
    'a002': [1, 7, np.nan, 3, np.nan, 2, 6]
})

Which looks like this when printed:

   a001  a002
0   1.0   1.0
1   NaN   7.0
2   NaN   NaN
3   NaN   3.0
4   NaN   NaN
5   2.0   2.0
6   NaN   6.0

Step 2: Core Calculation Logic

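To get the rolling average with a window size of 2 rows, this approach takes a mean of row means, which weights every row equally no matter how many valid values it holds. It therefore matches the average of all individual values in the window only when the rows in that window contain the same number of valid values. The two steps are: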

  1. First calculate the average of valid values (ignoring NaN) for each individual row
  2. Then apply a rolling window average to those row-wise means

Here's the code to do this:

df['rolling_mean'] = df.mean(axis=1).rolling(window=2, min_periods=1).mean()
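To make the two stages easier to inspect, the one-liner can be unpacked as in the sketch below. The variable names row_means and rolled are introduced here for illustration only and are not part of the original answer; the DataFrame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a001': [1, np.nan, np.nan, np.nan, np.nan, 2, np.nan],
    'a002': [1, 7, np.nan, 3, np.nan, 2, 6],
})

# Stage 1: mean of the valid (non-NaN) values in each row.
# Rows where every value is NaN produce NaN.
row_means = df.mean(axis=1)

# Stage 2: rolling mean over those row means, using a window of 2 rows.
rolled = row_means.rolling(window=2, min_periods=1).mean()

# Show both stages side by side without modifying df.
print(pd.DataFrame({'row_mean': row_means, 'rolling_mean': rolled}))
```

Keeping the intermediate Series around makes it straightforward to check which rows contributed NaN before the rolling step.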

Step 3: Result

After running the code, your DataFrame will look like this:

   a001  a002  rolling_mean
0   1.0   1.0           1.0
1   NaN   7.0           4.0
2   NaN   NaN           7.0
3   NaN   3.0           3.0
4   NaN   NaN           3.0
5   2.0   2.0           2.0
6   NaN   6.0           4.0
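Note that the rolling_mean column above is a mean of row means. For row 1, the window holds the values 1, 1 and 7, whose pooled average is 3.0 rather than 4.0. If the goal is the average of every individual value in the window, one possible alternative (not the method used in this answer) is to divide a rolling sum of values by a rolling count of valid values. The names row_sums, row_counts and pooled below are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a001': [1, np.nan, np.nan, np.nan, np.nan, 2, np.nan],
    'a002': [1, 7, np.nan, 3, np.nan, 2, 6],
})

# Restrict to the data columns in case derived columns were added to df.
values = df[['a001', 'a002']]
window = 2

# Per-row total and per-row number of valid (non-NaN) values.
row_sums = values.sum(axis=1)      # an all-NaN row sums to 0
row_counts = values.count(axis=1)  # an all-NaN row counts as 0

# Pooled mean: total of all values in the window / number of valid values.
pooled = (
    row_sums.rolling(window, min_periods=1).sum()
    / row_counts.rolling(window, min_periods=1).sum()
)
print(pooled)
```

On this sample data the two approaches agree everywhere except rows 1 and 6, where the rows in the window hold different numbers of valid values.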

Let's Break Down the Code

  • df.mean(axis=1): Computes the average of each row, automatically skipping NaN values. For rows with all NaN, this returns NaN.
  • .rolling(window=2, min_periods=1): Sets up a rolling window of 2 rows. The min_periods=1 parameter ensures we still get a result even if only one valid value exists in the window (critical for handling rows with all NaN).
  • .mean(): Calculates the average of values within each rolling window.
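To see why min_periods=1 matters, the sketch below compares it with the default, which for an integer window requires as many valid observations as the window size. The names lenient and strict are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a001': [1, np.nan, np.nan, np.nan, np.nan, 2, np.nan],
    'a002': [1, 7, np.nan, 3, np.nan, 2, 6],
})

row_means = df.mean(axis=1)

# min_periods=1: one valid row mean in the window is enough for a result.
lenient = row_means.rolling(window=2, min_periods=1).mean()

# Default min_periods (equal to the window size, 2): any window touching
# an all-NaN row, or the incomplete first window, yields NaN.
strict = row_means.rolling(window=2).mean()

print(pd.DataFrame({'min_periods=1': lenient, 'default': strict}))
```

With the default setting only rows 1 and 6 receive a value on this data, since those are the only windows containing two valid row means.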

This question originally comes from Stack Exchange; it was asked by Joe.
