Troubleshooting and Optimizing 429 Rate-Limit Errors When Fetching 2019 1-Minute K-Line Data from the KuCoin API with Python

Hey there, let's tackle your two questions one by one, starting with the issues in your code since that's likely the root of the unexpected 429 errors.

1. Issues in Your Code

Let's break down the problems causing the rate limit hits and inefficiencies:

a. Flawed Loop Logic & Request Range

Your get_ku_hp function repeatedly requests data from the fixed str_start (2019-01-05) to the temp_start (the earliest timestamp from the previous request). While KuCoin's API truncates responses to 1500 entries, this approach is inefficient and risky:

  • Each call asks for a massive time range, forcing the API to do extra work to truncate to the latest 1500 entries.
  • Worse, your loop condition while str_start < temp_start will run hundreds of times to reach 2019. When a request fails (like hitting 429), the time.sleep(10) is skipped (since it’s inside the try block), leading to back-to-back requests that trigger stricter rate limits.
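To illustrate the skipped-sleep problem, here is a minimal sketch in which the pause sits in a `finally` clause, so it runs whether or not the request raises. The helper name `call_with_pause` and its parameters are hypothetical and not part of your code or of any KuCoin library.

```python
import time


def call_with_pause(fetch, pause_s, sleep=time.sleep):
    """Call fetch() and always pause afterwards, even when it raises.

    `fetch` is any zero-argument callable that performs one API request.
    `sleep` is injectable so the behaviour can be checked without waiting.
    """
    try:
        return fetch()
    except Exception as exc:
        # The request failed; report it and fall through to the pause below.
        print(f"request failed: {exc!r}")
        return None
    finally:
        # Runs on both the success path and the failure path.
        sleep(pause_s)
```

Had the sleep been the last statement inside the `try` block instead, an exception raised by `fetch()` would jump straight past it.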

b. Missing Retry & Backoff for 429 Errors

When you hit a KucoinAPIException(<Response [429]>), your code only prints the error and immediately loops again. This means you’re retrying without waiting, which will only extend the rate limit restriction. KuCoin uses a sliding 10-second window for limits, so spamming requests after a 429 makes the problem worse.

c. Inefficient Timestamp Direction

Walking backwards by setting end_ts to the previous batch’s earliest timestamp is fine in itself; the problem is that the start stays pinned at str_start, so every request spans the whole remaining range. Instead of asking for a huge range and letting the API truncate it, calculate exact start/end times for each batch so that the batches are contiguous and each request covers only what it will return.


2. How to Fetch Large Historical Data Friendly to the API

Here’s a revised approach to stay within rate limits and efficiently pull all your 2019 1-minute data:

a. Batch Requests Precisely

Since 1-minute K-lines have a 1500-entry limit per request, each batch covers exactly 1500 minutes (25 hours). Calculate the start and end timestamps for each batch so you fetch contiguous, non-overlapping data every time. This avoids wasting API resources on large, truncated requests.
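As a rough sketch of the batching arithmetic, the generator below splits a time range into contiguous windows of at most 1500 one-minute candles. The function name `batch_windows` and its parameters are made up for illustration. Whether the API treats the end timestamp as inclusive is an assumption left open here, so a candle on a shared boundary may appear twice and should be deduplicated by timestamp afterwards.

```python
from datetime import datetime, timedelta, timezone


def batch_windows(start_dt, end_dt, candles_per_batch=1500, candle_seconds=60):
    """Yield (start_ts, end_ts) pairs in Unix seconds, oldest window first.

    Consecutive windows share a boundary, and the last window is shortened
    so it never extends past end_dt.
    """
    step = timedelta(seconds=candles_per_batch * candle_seconds)
    cursor = start_dt
    while cursor < end_dt:
        batch_end = min(cursor + step, end_dt)
        yield int(cursor.timestamp()), int(batch_end.timestamp())
        cursor = batch_end


windows = list(batch_windows(
    datetime(2019, 1, 1, tzinfo=timezone.utc),
    datetime(2020, 1, 1, tzinfo=timezone.utc),
))
print(len(windows))  # 351 windows: 525600 minutes / 1500 per batch, rounded up
```

So a full year of 1-minute data needs roughly 351 requests rather than an open-ended loop.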

b. Respect Sliding Window Rate Limits

KuCoin allows 30 requests per 10-second sliding window. Instead of sleeping 10 seconds after every request (overly conservative), you can send up to 25 requests per window (leaving a buffer of 5) and then wait until the 10-second window has elapsed. A small 0.4-second sleep between requests also spreads the calls out, so you are less likely to hit the edge of the sliding window.
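One way to track a sliding window on the client side is to remember the timestamps of recent calls. The sketch below is illustrative only: the class name `SlidingWindowLimiter` is hypothetical, and the 25-per-10-seconds defaults simply mirror the numbers above rather than anything read back from the API.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window_s-second span."""

    def __init__(self, max_calls=25, window_s=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.calls = deque()  # timestamps of calls still inside the window

    def wait(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have already left the window.
            while self.calls and now - self.calls[0] >= self.window_s:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self.sleep(self.window_s - (now - self.calls[0]))
```

Calling `limiter.wait()` immediately before each request keeps the pacing logic out of the download loop.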

c. Add Exponential Backoff for Retries

When hitting a 429 error, wait for an increasing amount of time (e.g., 2s → 4s → 8s, capped at 60s) before retrying. This gives the API time to reset your rate limit counter.
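The backoff schedule can be sketched as a small helper plus a retry wrapper. Both function names are hypothetical, and `is_rate_limited` is a caller-supplied predicate, because how a 429 is detected depends on the client library you use.

```python
import time


def backoff_delays(base_s=2.0, cap_s=60.0, max_retries=6):
    """Return the wait before each retry: base_s doubled each time, capped at cap_s."""
    return [min(base_s * 2 ** attempt, cap_s) for attempt in range(max_retries)]


def retry_on_rate_limit(fetch, is_rate_limited, sleep=time.sleep, **schedule):
    """Call fetch(), retrying with growing pauses while it is rate limited.

    Errors that are not rate-limit errors are re-raised immediately.
    """
    for delay in backoff_delays(**schedule):
        try:
            return fetch()
        except Exception as exc:
            if not is_rate_limited(exc):
                raise
            sleep(delay)
    # Final attempt after the schedule is exhausted; any error propagates.
    return fetch()


print(backoff_delays())  # [2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```

Keeping a retry counter separate from the request counter matters: the delay should grow with consecutive failures, not with the number of requests already sent.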

d. Revised Code Implementation

Here’s a fixed version incorporating these best practices:

import time
import datetime
from datetime import timezone
import pandas as pd
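from kucoin.client import Client  # Assumes the python-kucoin package, whose Client provides get_kline_data

# Placeholder credentials (assumption): replace with your own key, secret, and passphrase
kuclient = Client("your-api-key", "your-api-secret", "your-api-passphrase")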

def get_ku_his(sym, tf, start_ts, end_ts):
    # Fetch one precise batch of up to 1500 entries
    data = kuclient.get_kline_data(sym, tf, start_ts, end_ts)
    if not data:
        return pd.DataFrame()
    
    df = pd.DataFrame(data, columns=['timestamp', 'open', 'close', 'high', 'low', 'transaction amount', 'volume'])
    df['timestamp'] = pd.to_datetime(df['timestamp'].astype(float)*1000, unit='ms')
    df.set_index('timestamp', inplace=True)
    df = df[["close","volume"]].dropna().astype(float).sort_index()
    return df

def get_ku_hp(sym, tf, str_start, str_end=None):
    # Convert start/end times to UTC datetimes
    start_dt = datetime.datetime.strptime(str_start, '%Y-%m-%d %H:%M').replace(tzinfo=timezone.utc)
    current_end_dt = datetime.datetime.now(timezone.utc) if str_end is None else datetime.datetime.strptime(str_end, '%Y-%m-%d %H:%M').replace(tzinfo=timezone.utc)
    
    # Calculate batch interval: 1500 minutes for 1min timeframe
    batch_minutes = 1500 if tf == '1min' else 1  # Adjust for other TFs
    batch_timedelta = datetime.timedelta(minutes=batch_minutes)
    
    all_data = pd.DataFrame()
    request_count = 0
    retry_count = 0  # Consecutive 429 responses; drives the backoff delay
    max_requests_per_window = 25  # Stay under KuCoin's 30/10s limit
    window_reset_time = time.time() + 10

    # Walk backwards from the end; the oldest batch is clamped to start_dt
    # so a final partial batch is not skipped
    while current_end_dt > start_dt:
        current_start_dt = max(current_end_dt - batch_timedelta, start_dt)
        try:
            # Convert datetimes to Unix timestamps (seconds)
            start_ts = int(current_start_dt.timestamp())
            end_ts = int(current_end_dt.timestamp())

            batch_df = get_ku_his(sym, tf, start_ts, end_ts)
        except Exception as e:
            # Assumes the client's API exception exposes the HTTP status as status_code
            if getattr(e, 'status_code', None) == 429:
                # Exponential backoff (2s, 4s, 8s, ... capped at 60s), then retry the same batch
                retry_count += 1
                wait_time = min(2 ** retry_count, 60)
                print(f"Hit rate limit, retrying after {wait_time} seconds...")
                time.sleep(wait_time)
                continue
            print(f"Unexpected error: {repr(e)}")
            break

        retry_count = 0  # The request succeeded, so reset the backoff
        if not batch_df.empty:
            all_data = pd.concat([all_data, batch_df])
            print(f"Downloaded batch: {current_start_dt.strftime('%Y-%m-%d %H:%M')} to {current_end_dt.strftime('%Y-%m-%d %H:%M')}")

        # Move to the next older batch
        current_end_dt = current_start_dt

        # Manage rate limits
        request_count += 1
        if request_count >= max_requests_per_window:
            time_left = window_reset_time - time.time()
            if time_left > 0:
                time.sleep(time_left)
            # Reset window counter
            request_count = 0
            window_reset_time = time.time() + 10
        else:
            # Small buffer between requests
            time.sleep(0.4)

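    # Final cleanup: drop rows whose timestamp repeats at batch boundaries.
    # Plain drop_duplicates() compares row values and would also discard
    # distinct minutes that happen to share the same close and volume.
    all_data = all_data[~all_data.index.duplicated(keep='first')].sort_index()
    print(f"{datetime.datetime.now().strftime('%H:%M:%S')} Downloaded {tf} data for {sym} from {all_data.index.min()} to {all_data.index.max()}")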
    return all_data

# Example: Fetch full 2019 1min data for AIOZ-USDT
test = get_ku_hp("AIOZ-USDT","1min","2019-01-01 00:00", "2019-12-31 23:59")

Key Improvements:

  • Precise Batches: Each request fetches exactly 25 hours of data, no wasted API calls.
  • Sliding Window Control: Tracks request counts to stay well under the 30-request limit.
  • Exponential Backoff: Handles 429 errors gracefully without spamming the API.
  • Clean Data: Concatenates the batches and removes entries duplicated at batch boundaries.

The question comes from Stack Exchange; asked by John.
