How do I parse a Python requests.Response object and extract data from attributes such as CookieJar?
Hey there! Let's clear up your confusion around the requests.Response object, especially when it comes to accessing cookies and other properties.
Accessing Cookie Data (Fixing the CookieJar Issue)
First off, the reason print(raw_object.CookieJar) gave you <requests.response.CookieJar> is that CookieJar is the type of the cookies attribute, not the attribute name itself. The actual attribute holding the cookies is cookies, and it's an instance of requests.cookies.RequestsCookieJar (a subclass of http.cookiejar.CookieJar from Python's standard library that also behaves like a dictionary).
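If you want to confirm the type relationship yourself, here is a minimal sketch. It assumes requests is installed and builds an empty Response by hand (so no URL or network access is needed); the variable names are just for illustration.

```python
import collections.abc
import http.cookiejar

import requests
from requests.cookies import RequestsCookieJar

# An empty Response created by hand; a real one would come from requests.get(...).
response = requests.Response()

jar = response.cookies
is_requests_jar = isinstance(jar, RequestsCookieJar)
is_stdlib_jar = isinstance(jar, http.cookiejar.CookieJar)
is_mapping = isinstance(jar, collections.abc.MutableMapping)

print(type(jar).__name__, is_requests_jar, is_stdlib_jar, is_mapping)
```

The jar on a hand-built Response is empty, but its type is the same one you get back from a real request.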
To extract the cookie data, you have a few straightforward options:
- Convert it to a dictionary: This gives you a simple key-value map of cookie names to values.

      import requests

      url = "your-target-url-here"
      raw_object = requests.get(url)

      # Convert cookies to a dictionary
      cookie_dict = dict(raw_object.cookies)
      print(cookie_dict)

- Iterate over individual cookies: If you need more details (like domain, expiration date, etc.), loop through the CookieJar items:

      for cookie in raw_object.cookies:
          print(f"Name: {cookie.name}, Value: {cookie.value}, Domain: {cookie.domain}")
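To try both options without hitting a real server, you can fill a jar by hand. This is only a sketch: the cookie names, values, and the example.com domain are made up, and in your code the jar would be raw_object.cookies instead.

```python
from requests.cookies import RequestsCookieJar

# Hypothetical cookies, standing in for what a server would send back.
jar = RequestsCookieJar()
jar.set("sessionid", "abc123", domain="example.com", path="/")
jar.set("theme", "dark", domain="example.com", path="/")

# Option 1: a plain name -> value dictionary.
cookie_dict = jar.get_dict()

# A single value can also be looked up by name.
session_value = jar.get("sessionid")

# Option 2: iterate to reach per-cookie details such as domain and path.
details = [(c.name, c.value, c.domain, c.path) for c in jar]

print(cookie_dict)
print(session_value)
print(details)
```

get_dict() is the jar's own conversion method and also accepts optional domain and path filters, which helps when the same cookie name exists for more than one domain.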
Extracting Other Response Properties
Most other properties of the Response object are directly accessible, and the approach varies slightly based on what type of data they hold:
- Text content: You already know raw_object.text for the decoded string representation of the response body.
- Binary content: Use raw_object.content if you need the raw bytes (useful for images, files, etc.).
- JSON data: If the response returns JSON, call raw_object.json() to parse it into a Python dictionary/list. Note that json() does not check the Content-Type header; it raises an error (a subclass of ValueError) whenever the body is not valid JSON, so wrap the call in a try/except:

      try:
          json_data = raw_object.json()
          print(json_data)
      except ValueError:
          print("Response is not in JSON format")

- Response headers: raw_object.headers is a case-insensitive, dictionary-like object. You can access individual headers by key or convert it to a full dictionary:

      content_type = raw_object.headers.get("Content-Type")
      headers_dict = dict(raw_object.headers)

- Status code: raw_object.status_code gives you the HTTP status code (like 200 for OK, 404 for Not Found).
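Here is a rough, offline sketch that exercises the properties listed above. The two helper functions are hypothetical names of my own, and build_fake_response sets the private _content attribute purely so the example runs without a network call; real code would get the Response from requests.get(...).

```python
import requests


def build_fake_response(body: bytes, content_type: str, status: int = 200) -> requests.Response:
    """Hypothetical helper: hand-build a Response for experimenting offline."""
    response = requests.Response()
    response.status_code = status
    response.headers["Content-Type"] = content_type
    response.encoding = "utf-8"
    response._content = body  # private attribute, set only to avoid a real request
    return response


def json_or_none(response: requests.Response):
    """Return the parsed JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        return None


ok = build_fake_response(b'{"user": "alice", "id": 7}', "application/json")
not_json = build_fake_response(b"<html>hello</html>", "text/html")

print(ok.status_code)                  # HTTP status code
print(ok.headers.get("content-type"))  # header lookup ignores case
print(ok.content)                      # raw bytes
print(ok.text)                         # decoded string
print(json_or_none(ok))                # parsed JSON
print(json_or_none(not_json))          # None, because the body is HTML
```

Catching ValueError keeps the check independent of the installed requests version, since the JSON decoding errors that json() raises are ValueError subclasses.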
Tips for Exploring Response Properties
Since you already used dir(raw_object) to list attributes, focus on the names without leading and trailing double underscores; the double-underscore names are "magic" methods and attributes that Python uses internally. For any attribute you're unsure about, you can use help() to get more details:
help(raw_object.cookies)
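To narrow the dir() output down to the names worth exploring, something like the following sketch can help. It uses an empty, hand-built Response as a stand-in for raw_object, and it filters on a single leading underscore, which hides private names as well as the double-underscore ones.

```python
import requests

# Stand-in for raw_object; a real Response exposes the same attribute names.
response = requests.Response()

# Keep only public names (no leading underscore).
public_names = [name for name in dir(response) if not name.startswith("_")]

print(public_names)
```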
This question comes from Stack Exchange; it was asked by MickeyLater.




