Automating looped calls to the RePEc API with Python's requests module

Ahua AIGC Lab

2026-5-19

Got it, let's walk through how to build this RePEc API data retrieval workflow. I'll use Python here, since it keeps both the API calls and the file handling straightforward.

Step 1: Install Required Tools

First, make sure you have the requests library installed (it handles HTTP requests seamlessly). Run this in your terminal:

pip install requests

Step 2: Full Implementation Code

Here's a complete script that follows your requirements to a tee:

import json
import time

import requests

# Replace with your actual RePEc API key
REPEC_API_KEY = "YOUR_REPEC_ACCESS_KEY"
BASE_URL = "https://api.repec.org/call.cgi"

# Generate the JEL codes we need to loop through (E1 to E7)
jel_codes = [f"E{i}" for i in range(1, 8)]

for jel in jel_codes:
    offset = 0
    # Create a dedicated file for each JEL code to keep data organized
    with open(f"repec_results_{jel}.json", "w", encoding="utf-8") as output_file:
        print(f"Starting fetch for JEL code: {jel}")
        while True:
            # Build the API request parameters
            params = {
                "code": REPEC_API_KEY,
                "getrecentjel": jel,
                "offset": offset,
                "format": "json"  # Use JSON for easy parsing and storage
            }

            try:
                # Send the request; the timeout keeps a stalled connection from hanging the loop
                response = requests.get(BASE_URL, params=params, timeout=30)
                response.raise_for_status()
                data = response.json()
            except (requests.exceptions.RequestException, ValueError) as e:
                # ValueError covers a response body that is not valid JSON
                print(f"Error fetching data for {jel} at offset {offset}: {e}")
                # You can choose to continue or break here; adjust based on your needs
                break

            # Stop the loop if no more data is returned
            if not data:
                print(f"No more data for {jel} at offset {offset}. Moving on.")
                break

            # Write each entry as one JSON object per line
            # (this assumes the API returns a list of records)
            for entry in data:
                output_file.write(json.dumps(entry, ensure_ascii=False) + "\n")

            # Increment offset for the next batch
            offset += 25
            # Add a small delay to avoid hitting API rate limits
            time.sleep(1)

print("All data retrieval tasks completed!")
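To check the paging logic without touching the network, the offset loop can be pulled out into a small generator and driven by a stand-in fetch function. The sketch below is only an illustration: iter_pages, to_json_lines, fake_fetch and the fake records are hypothetical names, and the page size of 25 is simply the step the script above uses, not something confirmed against the RePEc documentation.

```python
import json
from typing import Callable, Iterator, List

PAGE_SIZE = 25  # same step as the script above; an assumption about the API


def iter_pages(fetch_page: Callable[[int], List[dict]],
               page_size: int = PAGE_SIZE) -> Iterator[List[dict]]:
    """Yield batches until fetch_page returns an empty batch."""
    offset = 0
    while True:
        batch = fetch_page(offset)
        if not batch:
            return
        yield batch
        offset += page_size


def to_json_lines(batch: List[dict]) -> str:
    """Serialize a batch as one valid JSON object per line."""
    return "".join(json.dumps(entry, ensure_ascii=False) + "\n" for entry in batch)


# Stand-in for the real API call: 60 fake records served 25 at a time
fake_records = [{"handle": f"paper-{i}"} for i in range(60)]


def fake_fetch(offset: int) -> List[dict]:
    return fake_records[offset:offset + PAGE_SIZE]


pages = list(iter_pages(fake_fetch))
print([len(page) for page in pages])  # [25, 25, 10]
```

Because every line written this way is valid JSON, the saved file can be read back later with json.loads, one line at a time.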

Step 3: Key Details to Keep in Mind

API Key Swap: Don't forget to replace YOUR_REPEC_ACCESS_KEY with your actual valid access key.
Rate Limiting: The time.sleep(1) call adds a 1-second pause between requests to stay within RePEc's rate limits. Check their documentation before shortening it.
File Structure: The script creates a separate JSON file for each JEL code (e.g., repec_results_E1.json), with one entry per line, to keep your data sorted. If you want a single combined file, open one master file in append mode instead.
Error Resilience: The script catches request failures (such as network drops or HTTP error responses), prints a clear message, and moves on to the next JEL code. You can expand this to log errors to a dedicated file if needed.
Data Format: I used JSON since it's universally supported. If you need a different output (like XML), change the format parameter in the API call and replace response.json() with a parser for that format.
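For the single combined file mentioned under File Structure, one possible approach is to append every batch to one master file and tag each record with its JEL code so the origin is not lost. This is a minimal sketch, not part of the original script: append_entries, load_entries, the "jel"/"entry" field names and the repec_results_all.jsonl file name are all made up for illustration.

```python
import json
import os
import tempfile
from typing import List


def append_entries(path: str, jel: str, entries: List[dict]) -> None:
    """Append entries to one master file, one JSON object per line, tagged with the JEL code."""
    with open(path, "a", encoding="utf-8") as master:
        for entry in entries:
            record = {"jel": jel, "entry": entry}
            master.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_entries(path: str) -> List[dict]:
    """Read the master file back into a list of records."""
    with open(path, encoding="utf-8") as master:
        return [json.loads(line) for line in master if line.strip()]


# Demo with made-up records written to a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "repec_results_all.jsonl")
append_entries(demo_path, "E1", [{"id": 1}])
append_entries(demo_path, "E2", [{"id": 2}, {"id": 3}])
print(len(load_entries(demo_path)))  # 3
```

Append mode means a rerun adds to the existing file rather than replacing it, so delete or rename the master file before starting a fresh run.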

Step 4: Quick Test Before Full Run

Before launching the full loop through all 7 JEL codes, test with just one (e.g., E1) to confirm the API responds correctly and data saves as expected. This helps catch issues with your key or parameters early.
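A single-code check along these lines could look like the sketch below. It fetches only the first page for one JEL code and reports what came back, so you can see the response shape before committing to the full loop. smoke_test and the stub classes are hypothetical helpers; the request parameters are the same ones the script above uses and have not been verified against the RePEc documentation. The HTTP function is passed in as an argument so the sketch runs offline with a stub.

```python
from typing import Any, Callable, Dict

BASE_URL = "https://api.repec.org/call.cgi"


def smoke_test(jel: str, api_key: str, http_get: Callable[..., Any]) -> Dict[str, Any]:
    """Fetch only the first page for one JEL code and summarize the response."""
    params = {"code": api_key, "getrecentjel": jel, "offset": 0, "format": "json"}
    response = http_get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    data = response.json()
    return {
        "jel": jel,
        "type": type(data).__name__,
        "count": len(data) if hasattr(data, "__len__") else None,
        "first": data[0] if isinstance(data, list) and data else None,
    }


class StubResponse:
    """Offline stand-in for a requests.Response, returning one made-up record."""

    def raise_for_status(self) -> None:
        pass

    def json(self) -> Any:
        return [{"handle": "example"}]


def stub_get(url: str, params: Any = None, timeout: Any = None) -> StubResponse:
    return StubResponse()


summary = smoke_test("E1", "YOUR_REPEC_ACCESS_KEY", stub_get)
print(summary)
```

For a real check, import requests and pass requests.get in place of stub_get along with your actual key. If the reported type is not a list, the write loop in the main script needs adjusting before the full run.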

The question in this post comes from Stack Exchange; it was asked by Moses.