You need to enable JavaScript to run this app.
最新活动
大模型
产品
解决方案
定价
生态与合作
支持与服务
开发者
了解我们

TimescaleDB数据降采样配置咨询:替代RRD工具的分层数据保留方案

How to Implement RRD-Style Downsampling & Retention in TimescaleDB

Hey there! Great question—TimescaleDB has purpose-built tools to handle exactly this kind of time-series data lifecycle management, so you don’t need to build custom cleanup or aggregation tasks from scratch. Let’s walk through how to set up your desired retention and downsampling workflow:

Key Concepts to Use

TimescaleDB relies on two core features for this: Continuous Aggregates (for downsampling old data into coarser averages) and Retention Policies (for automatically removing raw data once it’s no longer needed). You can layer these to create your multi-resolution storage strategy.

Step 1: Ensure Your Table is a Hypertable

First, make sure your time-series table is a hypertable (TimescaleDB’s partitioned table optimized for time-series data). If it’s not already, convert it with:

SELECT create_hypertable('your_metric_table', 'timestamp_column');

Replace your_metric_table with your actual table name, and timestamp_column with the column storing your timestamps.

Step 2: Create Layered Continuous Aggregates

You’ll create continuous aggregates to roll up your data into coarser resolutions over time. Let’s build your desired tiers:

30/60-Minute Average Aggregate (for data older than 90 days)

This view will store 30 or 60-minute averages of your raw data, which you’ll keep after the raw data is deleted. Swap '30 minutes' for '60 minutes' if that’s your preference:

CREATE MATERIALIZED VIEW metric_30min_avg
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('30 minutes', timestamp_column) AS bucket_time,
  -- Include any grouping columns (e.g., device ID, sensor ID) here
  device_id,
  AVG(metric_value) AS avg_metric
FROM your_metric_table
GROUP BY bucket_time, device_id;

Daily Average Aggregate (for long-term storage)

Next, create a higher-level aggregate that rolls up the 30/60-minute data into daily averages for years-long retention:

CREATE MATERIALIZED VIEW metric_daily_avg
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 day', bucket_time) AS day_bucket,
  device_id,
  AVG(avg_metric) AS daily_avg_metric
FROM metric_30min_avg
GROUP BY day_bucket, device_id;

Step 3: Set Up Retention Policies

Now define policies to automatically remove data once it’s been rolled up to the next tier:

Retain Raw Data for 90 Days

Delete raw data after 90 days, since we’ll have the 30/60-minute aggregates instead:

SELECT add_retention_policy('your_metric_table', INTERVAL '90 days');

Retain 30/60-Minute Aggregates Until They’re Rolled to Daily Averages

Once data is converted to daily averages, you can remove the 30/60-minute aggregates. Adjust the interval to match how long you want to keep this tier (e.g., 1 year):

SELECT add_retention_policy('metric_30min_avg', INTERVAL '1 year');

Step 4: Automate Aggregate Refreshes

Add policies to automatically update your continuous aggregates so they stay in sync with new data:

Refresh 30/60-Minute Aggregate

This policy will refresh the 30/60-minute aggregate for data that’s older than 90 days (so we only aggregate data that’s no longer being modified) on an hourly schedule:

SELECT add_continuous_aggregate_policy('metric_30min_avg',
  start_offset => INTERVAL '91 days',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour');

Refresh Daily Aggregate

Similarly, refresh the daily aggregate from the 30/60-minute data on a daily schedule, targeting data older than your 30/60-minute retention window:

SELECT add_continuous_aggregate_policy('metric_daily_avg',
  start_offset => INTERVAL '1 year 1 day',
  end_offset => INTERVAL '1 day',
  schedule_interval => INTERVAL '1 day');

Why This Works Better Than Custom Scripts

Unlike building your own cron jobs or cleanup tasks, these TimescaleDB features are:

  • Built-in: Managed by the database’s background job scheduler, so you don’t have to maintain external tools.
  • Flexible: Adjust intervals, retention periods, or aggregation functions (e.g., switch from AVG to MAX/MIN) anytime by altering the views or policies.
  • PostgreSQL-Compatible: All aggregates are standard PostgreSQL materialized views, so you can query them just like any other table.

内容的提问来源于stack exchange,提问作者Troy

火山引擎 最新活动