Help: urlretrieve 503 and 403 errors when downloading images in Python 3.6

阿华AIGC实验室

2026-5-19

Fixing 403/503 Errors When Downloading Images with urllib/requests in Python 3.6

Hey fellow developer! I see you're hitting 403 Forbidden and 503 Service Unavailable errors while downloading the image at http://pic.minitoon.net/albums/2819/01-01/01_000.jpg with urllib, and that switching to requests didn't help. Let's break down why this happens and how to fix it. Both errors are common on sites with anti-scraping measures.

Why You're Seeing These Errors

403 Forbidden: Most likely, the server is identifying your request as coming from a script rather than a browser and blocking it. By default urllib announces itself with a User-Agent of Python-urllib/3.6, and servers often reject that value, a missing Referer, or other non-browser request patterns.
503 Service Unavailable: This can be temporary server overload, but on sites with anti-scraping measures it is often the server throttling repeated suspicious requests. Even after you fix the 403, your request pattern might still trigger rate limits.
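
Before changing anything, it can help to look at what the server actually sends back with the error. The sketch below is one way to do that with urllib; describe_http_error and probe are hypothetical helper names (not part of urllib), and treating only 503 as worth retrying is a simplifying assumption that mirrors the explanation above.

```python
import urllib.request
from urllib.error import HTTPError


def describe_http_error(error):
    """Summarize an HTTPError so a 403 and a 503 can be told apart."""
    headers = error.headers or {}
    return {
        'status': error.code,
        'reason': error.reason,
        'server': headers.get('Server'),
        # Some servers say how long to wait before the next attempt
        'retry_after': headers.get('Retry-After'),
        # Assumption: a 503 may clear up on its own, a 403 needs a different request
        'worth_retrying': error.code == 503,
    }


def probe(url, headers=None):
    """Request the URL once and report the status or the error details."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return {'status': response.status,
                    'content_type': response.headers.get('Content-Type')}
    except HTTPError as error:
        return describe_http_error(error)

# Usage (needs network access):
#   print(probe('http://pic.minitoon.net/albums/2819/01-01/01_000.jpg'))
```

Running probe once with no headers and once with browser-like headers shows whether the headers alone change the response.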

Fix 1: Improve urllib Requests with Proper Headers & Retries

Let's update your urllib code to mimic a browser and add retry logic for 503 errors. Here's a working example for Python 3.6:

import urllib.request
from urllib.error import HTTPError
import time

def download_image_with_urllib(url, save_path, retries=3):
    # Mimic a real browser's request headers
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
        'Accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
        'Referer': 'http://pic.minitoon.net/'  # Add the site's base URL as referer
    }
    
    req = urllib.request.Request(url, headers=headers)
    
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(req, timeout=10) as response:  # timeout so a stalled connection cannot hang forever
                with open(save_path, 'wb') as f:
                    f.write(response.read())
                print(f"Image downloaded successfully to {save_path}")
                return
        except HTTPError as e:
            if e.code == 503 and attempt < retries - 1:
                print(f"503 Error encountered. Retrying in 2 seconds... (Attempt {attempt+1}/{retries})")
                time.sleep(2)
            else:
                print(f"Failed to download image: {e}")
                raise

# Usage
download_image_with_urllib(
    'http://pic.minitoon.net/albums/2819/01-01/01_000.jpg',
    'downloaded_image.jpg'
)
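
The fixed two-second pause in the retry loop can be swapped for a delay that grows with each attempt and follows the server's Retry-After header when one is sent. This is a minimal sketch, not a drop-in standard: retry_delay, base and cap are hypothetical names, and only the plain-seconds form of Retry-After is handled (the HTTP-date form falls back to the computed delay).

```python
def retry_delay(attempt, retry_after=None, base=2.0, cap=30.0):
    """Return the number of seconds to wait before retry `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Prefer the server's hint, clamped to the range 0..cap
            return max(0.0, min(float(retry_after), cap))
        except ValueError:
            pass  # HTTP-date form: not parsed in this sketch
    # Exponential backoff: base, 2*base, 4*base, ... never above cap
    return min(base * (2 ** attempt), cap)

# Inside the `except HTTPError as e:` branch, the sleep could become:
#   time.sleep(retry_delay(attempt, e.headers.get('Retry-After')))
```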

Fix 2: Use Requests with Session & Retry Adapter (More Robust)

Requests is easier to work with for these scenarios. Let's set up a session with persistent headers and an adapter that automatically retries 503 errors. Both work on Python 3.6:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def download_image_with_requests(url, save_path):
    # Create a session to persist headers and connections
    session = requests.Session()
    
    # Configure retry logic for 503 errors
    retry_strategy = Retry(
        total=3,
        backoff_factor=1,  # Exponential backoff: the wait roughly doubles with each retry
        status_forcelist=[503],
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    
    # Browser-like headers, stored on the session so every request sends them
    session.headers.update({
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
        'Accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
        'Referer': 'http://pic.minitoon.net/'
    })
    
    try:
        # requests has no default timeout, so set one explicitly
        response = session.get(url, stream=True, timeout=10)
        response.raise_for_status()  # Raise exception for HTTP errors
        
        with open(save_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Image downloaded successfully to {save_path}")
    except requests.exceptions.RequestException as e:
        print(f"Failed to download image: {e}")
        raise

# Usage
download_image_with_requests(
    'http://pic.minitoon.net/albums/2819/01-01/01_000.jpg',
    'downloaded_image.jpg'
)

Additional Tips

Check for Cookies: Some sites require cookies to be set (e.g., after visiting the homepage). Use the session object in requests to first visit http://pic.minitoon.net/ to capture cookies before downloading the image.
Avoid Rate Limiting: If you're downloading multiple images, add small delays between requests (time.sleep(1) or similar) to avoid triggering 503s.
Proxy Servers: If you're still blocked, the site might be blocking your IP address. Using a proxy could help, but that's more advanced, so try the steps above first.
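
The cookie and rate-limit tips can be combined into one loop. The sketch below is an assumption-laden illustration: download_with_session, HOMEPAGE, path_template and the sleep parameter are hypothetical names, and session is expected to behave like a requests.Session (the function itself only calls get, raise_for_status and content). Whether this site sets cookies on its homepage at all is not confirmed.

```python
import time

HOMEPAGE = 'http://pic.minitoon.net/'


def download_with_session(session, urls, path_template, delay=1.0, sleep=time.sleep):
    """Visit the homepage once, then download each URL with a pause in between.

    `session` should behave like requests.Session; `path_template` is a
    format string such as 'page_{:03d}.jpg' that receives the image index.
    """
    # Any cookies set here are stored on the session and resent automatically
    session.get(HOMEPAGE, timeout=10).raise_for_status()

    saved = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay)  # space requests out to stay under rate limits
        response = session.get(url, headers={'Referer': HOMEPAGE}, timeout=10)
        response.raise_for_status()
        path = path_template.format(index)
        with open(path, 'wb') as f:
            f.write(response.content)
        saved.append(path)
    return saved

# Usage (needs the requests package and network access):
#   import requests
#   with requests.Session() as session:
#       session.headers.update({'User-Agent': 'Mozilla/5.0 ...'})
#       download_with_session(
#           session,
#           ['http://pic.minitoon.net/albums/2819/01-01/01_000.jpg'],
#           'page_{:03d}.jpg',
#       )
```

Passing sleep as a parameter keeps the pause easy to shorten or replace when trying the loop out.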

The question in this post comes from Stack Exchange; asked by Arun.