How do I parse a Python requests.Response object and extract data from attributes such as CookieJar?

How to Parse a Python requests.Response Object and Extract Data

Hey there! Let's clear up your confusion around the requests.Response object, especially when it comes to accessing cookies and other properties.

First off, print(raw_object.CookieJar) gave you <requests.response.CookieJar> because CookieJar is the type of the cookie attribute, not the attribute's name. The attribute that actually holds the cookies is cookies, and it's an instance of requests.cookies.RequestsCookieJar (a subclass of the standard library's http.cookiejar.CookieJar).
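To see the attribute name and its type side by side, here is a minimal sketch. It assumes an empty Response built by hand as a stand-in for raw_object, so it runs without any network call; a real response from requests.get() should report the same types.

```python
import http.cookiejar

import requests
from requests.cookies import RequestsCookieJar

# Hand-built, empty Response used only as an offline stand-in for raw_object.
response = requests.Response()

# The attribute is named "cookies"; its type is RequestsCookieJar.
cookie_type_name = type(response.cookies).__name__
print(cookie_type_name)

# RequestsCookieJar extends the standard library's CookieJar.
print(isinstance(response.cookies, RequestsCookieJar))
print(isinstance(response.cookies, http.cookiejar.CookieJar))
```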

To extract the data from it, you have a few straightforward options:

  1. Convert it to a dictionary: This gives you a simple map of cookie names to values, without the domain, path, or expiry metadata.
    import requests
    
    url = "your-target-url-here"
    raw_object = requests.get(url)
    
    # Convert cookies to a dictionary
    cookie_dict = dict(raw_object.cookies)
    print(cookie_dict)
    
  2. Iterate over individual cookies: If you need more details (like domain, expiration date, etc.), loop through the CookieJar items:
    for cookie in raw_object.cookies:
        print(f"Name: {cookie.name}, Value: {cookie.value}, Domain: {cookie.domain}")
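One case the two options above handle differently is a jar that holds the same cookie name for more than one domain. The sketch below assumes a hand-built jar in place of raw_object.cookies; the cookie names, values, and domains are made up for illustration. It uses get_dict(), which can be narrowed by domain, and a dictionary keyed by (domain, path, name) so that no entry is overwritten.

```python
from requests.cookies import CookieConflictError, RequestsCookieJar

# Hand-built jar standing in for raw_object.cookies (all values are made up).
jar = RequestsCookieJar()
jar.set("session", "abc123", domain="example.com", path="/")
jar.set("theme", "dark", domain="example.com", path="/")
jar.set("session", "xyz789", domain="api.example.com", path="/")

# dict(jar) looks each name up individually, so a duplicated name is ambiguous.
try:
    flat = dict(jar)
except CookieConflictError:
    flat = None

# get_dict() accepts optional domain/path filters to avoid the ambiguity.
main_site = jar.get_dict(domain="example.com")
api_site = jar.get_dict(domain="api.example.com")

# Keep every cookie by keying on (domain, path, name).
detailed = {(c.domain, c.path, c.name): c.value for c in jar}

print(main_site)
print(api_site)
print(len(detailed))
```

If every cookie name in your response is unique, dict(raw_object.cookies) and raw_object.cookies.get_dict() should give the same result.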
    

Extracting Other Response Properties

Most other properties of the Response object are directly accessible, and the approach varies slightly based on what type of data they hold:

  • Text content: raw_object.text, which you already know, gives the response body decoded to a string.
  • Binary content: Use raw_object.content if you need the raw bytes (useful for images, files, etc.).
  • JSON data: If the response body is JSON, call raw_object.json() to parse it into a Python dictionary or list. The method does not check the Content-Type header; it raises an error (a ValueError subclass) whenever the body is not valid JSON, so wrap the call in try/except:
    try:
        json_data = raw_object.json()
        print(json_data)
    except ValueError:
        print("Response is not in JSON format")
    
  • Response headers: raw_object.headers is a case-insensitive, dictionary-like object. You can access individual headers by key or convert it to a plain dictionary:
    content_type = raw_object.headers.get("Content-Type")
    headers_dict = dict(raw_object.headers)
    
  • Status code: raw_object.status_code gives you the HTTP status code (like 200 for OK, 404 for Not Found).
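The properties listed above can be gathered in one place. The following is a rough sketch, not part of the requests API: summarize_response is a hypothetical helper, and the Response objects are filled in by hand (including the private _content attribute, a shortcut for offline testing only) so the example runs without a network call.

```python
import requests


def summarize_response(response):
    """Collect commonly used Response fields into a plain dict (hypothetical helper)."""
    try:
        body_json = response.json()
    except ValueError:
        # Raised when the body is not valid JSON.
        body_json = None
    return {
        "status_code": response.status_code,
        "content_type": response.headers.get("content-type"),  # lookup is case-insensitive
        "text": response.text,
        "json": body_json,
        "cookies": response.cookies.get_dict(),
    }


def make_fake_response(body, content_type):
    """Build an offline stand-in for raw_object; sets private state for testing only."""
    fake = requests.Response()
    fake.status_code = 200
    fake.headers["Content-Type"] = content_type
    fake.encoding = "utf-8"
    fake._content = body
    return fake


json_summary = summarize_response(
    make_fake_response(b'{"user": "demo", "id": 7}', "application/json")
)
html_summary = summarize_response(make_fake_response(b"<html></html>", "text/html"))

print(json_summary)
print(html_summary)
```

With a real request you would pass the object returned by requests.get(url) straight to summarize_response.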

Tips for Exploring Response Properties

Since you already used dir(raw_object) to list attributes, focus on the names without leading and trailing double underscores; the double-underscore ones are "magic" methods and attributes that Python uses internally. For any attribute you're unsure about, you can use help() to get more details:

help(raw_object.cookies)
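Filtering the dir() output can be done in a couple of lines. This sketch again assumes an empty, hand-built Response as a stand-in for raw_object. It looks names up on the class rather than the instance when checking for methods, so that properties such as text or content are not evaluated as a side effect.

```python
import requests

# Offline stand-in for raw_object.
response = requests.Response()

# Drop private and "magic" names.
public_names = [name for name in dir(response) if not name.startswith("_")]

# Methods are callable on the class; properties and instance attributes are not.
methods = [n for n in public_names if callable(getattr(type(response), n, None))]
attributes = [n for n in public_names if n not in methods]

print("Methods:", methods)
print("Attributes:", attributes)
```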

This question originally comes from Stack Exchange; asked by MickeyLater.
