How to Fill Consecutive NaN Values with Linear Interpolation? (Technical Q&A)
Can Linear Interpolation Be Used to Fill Consecutive NaN Values?
Yes. Filling consecutive NaN values by linear interpolation is a standard technique in data cleaning and preprocessing, and it does exactly what you describe: each run of missing values is replaced with evenly spaced values between the nearest non-NaN endpoints.
How It Works
Linear interpolation calculates values between two known points by assuming a straight-line relationship. For a sequence like [1, nan, nan, 4]:
- The difference between the endpoints is 4 - 1 = 3
- There are 2 consecutive NaNs, creating 3 equal intervals between the four positions
- Each interval step is 3 / 3 = 1, so the missing values become 1 + 1 = 2 and 2 + 1 = 3
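The arithmetic above can be sketched in plain Python. The helper name fill_gap is hypothetical and only illustrates the step calculation; it is not part of pandas.

```python
def fill_gap(left, right, n_missing):
    """Return n_missing evenly spaced values strictly between left and right."""
    # n_missing NaNs between two known values create n_missing + 1 intervals
    step = (right - left) / (n_missing + 1)
    return [left + step * i for i in range(1, n_missing + 1)]


print(fill_gap(1, 4, 2))  # [2.0, 3.0]
print(fill_gap(8, 2, 2))  # [6.0, 4.0]
```

A negative step falls out naturally when the right endpoint is smaller, so decreasing sequences need no special handling.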
Practical Implementation (Python Pandas)
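pandas has a built-in interpolate() method that handles this directly. With method='linear' (the default), it treats the values as equally spaced and ignores the index, which matches the behavior described above. Here’s how to apply it to your examples: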
    import pandas as pd
    import numpy as np

    # Example 1: Increasing sequence with consecutive NaNs
    series_1 = pd.Series([1, np.nan, np.nan, 4])
    filled_series_1 = series_1.interpolate(method='linear')
    print(filled_series_1.tolist())  # Output: [1.0, 2.0, 3.0, 4.0]

    # Example 2: Decreasing sequence with consecutive NaNs
    series_2 = pd.Series([8, np.nan, np.nan, 2])
    filled_series_2 = series_2.interpolate(method='linear')
    print(filled_series_2.tolist())  # Output: [8.0, 6.0, 4.0, 2.0]
Key Notes
- This method works for any number of consecutive NaNs between two non-missing values, not just two. For example, [5, nan, nan, nan, 13] would get filled to [5, 7, 9, 11, 13].
- If your data is in a DataFrame instead of a Series, you can use the same method and set the axis parameter: axis=0 (the default) interpolates down each column, and axis=1 interpolates across each row.
- NaNs at the very start of the sequence aren’t filled, since there’s no left endpoint to reference. With the default settings, pandas fills NaNs at the very end by repeating the last valid value rather than extending the line; pass limit_area='inside' if you want only the interior gaps filled. To fill the edges as well, combine this with forward/backward filling (ffill()/bfill()) as needed.
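As a rough sketch of the notes above, the snippet below fills an interior gap, handles the edges separately, and shows the effect of axis on a DataFrame. The sample data is made up for illustration, and limit_area="inside" is used here to make the "interior gaps only" behavior explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a leading NaN, a three-NaN interior gap, a trailing NaN
s = pd.Series([np.nan, 5, np.nan, np.nan, np.nan, 13, np.nan])

# limit_area="inside" restricts filling to NaNs between two valid values
inside_only = s.interpolate(method="linear", limit_area="inside")
print(inside_only.tolist())  # [nan, 5.0, 7.0, 9.0, 11.0, 13.0, nan]

# Chain ffill()/bfill() to cover the edges with the nearest valid value
fully_filled = inside_only.ffill().bfill()
print(fully_filled.tolist())  # [5.0, 5.0, 7.0, 9.0, 11.0, 13.0, 13.0]

# Hypothetical DataFrame with one missing cell in the middle
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, np.nan, 30.0],
    "c": [5.0, 8.0, 9.0],
})

# axis=0 (default): the gap is filled from its neighbors within column "b"
print(df.interpolate(axis=0)["b"].tolist())  # [10.0, 20.0, 30.0]

# axis=1: the gap is filled from its neighbors within the same row (2.0 and 8.0)
print(df.interpolate(axis=1)["b"].tolist())  # [10.0, 5.0, 30.0]
```

Whether edge values should be copied from the nearest valid point, as ffill()/bfill() do, or left missing depends on your data; the chaining shown here is just one option.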
The question in this post comes from Stack Exchange, asked by vuvu.




