Python 3.8 + Brotli 1.0.9: "BrotliDecompress failed" when decoding br-compressed content, help needed
Hey there, let's dig into why your Brotli decompression is throwing that error. I've run into similar issues before, so here's what's going on and how to fix it:
Most likely cause: Requests already handles the decompression for you
The big gotcha here is that the Requests library decompresses responses automatically. When you send Accept-Encoding: gzip, deflate, br and the server returns Brotli-compressed content, Requests quietly unpacks it for you (through urllib3, which in recent versions decodes br whenever the brotli package is installed, as it is in your setup). That means response.content already holds the uncompressed bytes, not the Brotli-compressed bytes you're expecting. Running brotli.decompress() on data that has already been unpacked fails with the BrotliDecompress failed error.
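To see why decompressing twice fails, here is a minimal offline sketch. It uses the standard library's zlib as a stand-in for Brotli (an assumption, made so that it runs without third-party packages); brotli.decompress() typically reacts to plain HTML bytes the same way, by raising an error. The helper name try_decompress is hypothetical.

```python
import zlib


def try_decompress(data: bytes) -> str:
    """Return 'ok' if data is a valid zlib stream, otherwise 'failed'."""
    try:
        zlib.decompress(data)
        return "ok"
    except zlib.error:
        return "failed"


html = b"<html><body>hello</body></html>"

# What travels over the wire when the server compresses the body.
compressed = zlib.compress(html)

# What the HTTP client hands you after it has decoded the body itself.
already_decoded = zlib.decompress(compressed)

print(try_decompress(compressed))       # compressed bytes decode fine
print(try_decompress(already_decoded))  # plain bytes are not a valid stream
```

The second call is the situation in the question: the bytes were already plain, so the decompressor rejects them.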
Here's a simplified, fixed version of your code that leverages Requests' built-in handling:
    import requests

    headers = {
        'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        'Accept-Encoding': "gzip, deflate, br",
        'Host': "book.douban.com",
        'Referer': "https://book.douban.com/",  # Fixed: use the full URL instead of the bare domain
        'Sec-Fetch-Dest': "document",
        'Sec-Fetch-Mode': "navigate",
        'Upgrade-Insecure-Requests': "1"
    }


    # The snippet uses return, so it is wrapped in a function here.
    def fetch_page(url):
        s = requests.Session()
        try:
            response = s.get(url, headers=headers)
            # Raise on HTTP errors such as 403/500 instead of silently ignoring them
            response.raise_for_status()
        except requests.RequestException as e:
            print(f"Request failed: {e}")
            return ""
        if response.status_code == 200:
            print(response.headers)
            # Requests has already handled decompression, so use response.text directly
            return response.text
        return ""


    html = fetch_page("https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4")
If you really need to handle Brotli manually
If you have a specific reason to unpack the content yourself (like debugging), you need to disable Requests' automatic decompression. Do this by enabling stream=True and turning off decode_content on the raw response:
    import brotli
    import requests

    headers = {
        'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        'Accept-Encoding': "gzip, deflate, br",
        'Host': "book.douban.com",
        'Referer': "https://book.douban.com/",
        'Sec-Fetch-Dest': "document",
        'Sec-Fetch-Mode': "navigate",
        'Upgrade-Insecure-Requests': "1"
    }


    # The snippet uses return, so it is wrapped in a function here.
    def fetch_page_manual(url):
        s = requests.Session()
        try:
            # Stream mode leaves the body unread so the raw bytes stay accessible
            response = s.get(url, headers=headers, stream=True)
            response.raise_for_status()
        except requests.RequestException as e:
            print(f"Request failed: {e}")
            return ""
        if response.status_code != 200:
            return ""
        print(response.headers)
        if response.headers.get('Content-Encoding', '').lower() == 'br':
            # Read the still-compressed bytes (no automatic decoding) and decompress manually
            raw_compressed_data = response.raw.read(decode_content=False)
            uncompressed_data = brotli.decompress(raw_compressed_data)
            return uncompressed_data.decode('utf-8')
        return response.text


    html = fetch_page_manual("https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4")
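If you go the manual route, it helps to keep the "which decoder do I need" decision in one place. The following is a rough sketch, not part of Requests: decode_body is a hypothetical helper that picks a decoder from the Content-Encoding value. It imports brotli lazily, so the gzip and identity paths work even without the package installed.

```python
import gzip
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode a raw HTTP body according to its Content-Encoding header value."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        # No compression was applied, so the bytes are returned unchanged.
        return raw
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "br":
        import brotli  # third-party package, imported only when needed
        return brotli.decompress(raw)
    raise ValueError(f"unsupported Content-Encoding: {encoding!r}")


payload = "小说".encode("utf-8")
print(decode_body(gzip.compress(payload), "gzip").decode("utf-8"))
print(decode_body(payload, None).decode("utf-8"))
```

With a streamed response you would pass response.raw.read(decode_content=False) and response.headers.get('Content-Encoding') to it.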
Extra tips to avoid future issues
- Upgrade your Brotli version: Brotli 1.0.9 dates from 2020, and newer releases include bug fixes. The Brotli format itself is stable, so this is unlikely to be the root cause, but upgrading is cheap. Run this command to upgrade:

      pip install --upgrade brotli

- Fix your Referer header: Your original code used book.douban.com instead of the full URL https://book.douban.com/, which might trigger anti-scraping measures from Douban.
- Avoid early returns in try blocks: Your original code had a return "" inside the try block that skipped all the status code checking logic. Use raise or print to handle errors instead of bailing early.
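Before and after upgrading, it can be useful to confirm which versions are actually installed in the interpreter you are running. A small sketch using importlib.metadata (available in Python 3.8+); installed_version is a hypothetical helper name, and the package list is only a guess at what matters for this setup.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Either Brotli or brotlicffi can provide br support to urllib3.
for name in ("Brotli", "brotlicffi", "requests", "urllib3"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```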
The question originally comes from Stack Exchange; asked by xin.chen.




