Logging in to a website with Python returns Response [404]: beginner looking for troubleshooting help
Hey there! Let's dig into why your login script is returning a 404 error. I've gone through your code and checked the target website, and here are the key issues and fixes to get you up and running:
1. You're POSTing to the Wrong URL
This is almost certainly the root cause of your 404. When you submit a login form in the browser, it doesn't always send data to the same URL you're viewing. To find the correct endpoint:
- Open your browser's DevTools (F12)
- Switch to the Network tab
- Enter your credentials and click "Anmelden"
- Look for the POST request in the list—its URL is where you need to send your form data, not the login page URL itself.
For this specific site, the login form's action attribute points to a different path (likely something like /fruchtgenuss/login_check), not /fruchtgenuss/login. Your script is sending data to the login page itself, which doesn't handle POST requests, hence the 404.
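If you'd rather confirm the endpoint in code than in DevTools, the lookup can be sketched as below. This is a minimal sketch, not the site's actual markup: it uses only the standard library (`html.parser` and `urllib.parse`) instead of lxml so it runs without extra dependencies, and the sample HTML, the `authenticity_token` field, and the `/fruchtgenuss/login_check` path are assumptions for illustration (the path is just the guess mentioned above).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Collect the action attribute of every <form> on a page."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # A form without an action submits to the page's own URL,
            # which urljoin reproduces when given an empty string.
            self.actions.append(dict(attrs).get("action") or "")


def resolve_form_actions(page_url, page_html):
    """Return the absolute submission URL of each form in page_html."""
    parser = FormActionParser()
    parser.feed(page_html)
    return [urljoin(page_url, action) for action in parser.actions]


# Hypothetical markup; the real login page may look different.
SAMPLE_HTML = """
<form id="login-form" action="/fruchtgenuss/login_check" method="post">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="nick">
  <input type="password" name="password">
</form>
"""

LOGIN_PAGE = "https://app.foodcoops.at/fruchtgenuss/login"
print(resolve_form_actions(LOGIN_PAGE, SAMPLE_HTML))
```

If the URL printed here differs from the one your script POSTs to, that mismatch is the first thing to fix.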
2. Fix Your Credential Formatting
Your script still uses HTML-entity placeholders (&lt;username&gt;, which renders as <username>) for your credentials. Replace these with your actual username and password as plain text. For example:
```python
form["nick"] = "my_actual_username"
form["password"] = "my_actual_password"
```
3. Adjust Headers for Better Browser Simulation
While your User-Agent is good, adding a Referer header tells the server you came from the login page, which helps mimic real browser behavior. You can also let requests handle the Content-Type header automatically (it will set it to application/x-www-form-urlencoded for form data).
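To make "form data" concrete, here is a small sketch of the application/x-www-form-urlencoded body that a dict of fields turns into. It uses the standard library's `urlencode`, which should produce the same encoding requests applies when you pass `data=` a dict. The credential values are placeholders.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder credentials; the field names match the ones used in this answer.
form_data = {
    "nick": "alice",
    "password": "p&ss word",
    "commit": "Anmelden",
}

# Special characters are percent-encoded and spaces become "+".
body = urlencode(form_data)
print(body)  # nick=alice&password=p%26ss+word&commit=Anmelden

# Decoding the body recovers the original values, as the server would.
decoded = parse_qs(body)
print(decoded["password"])  # ['p&ss word']
```

This is also the format DevTools shows under the request payload, so you can compare the two directly.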
Updated Working Script
Here's a revised version of your script that addresses all these issues:
```python
from urllib.parse import urljoin

import requests
from lxml import html

# Initialize a session to persist cookies across requests
session_requests = requests.session()
login_url = "https://app.foodcoops.at/fruchtgenuss/login"

# Mimic real browser headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
    'Referer': login_url
}

# Fetch the login page to extract form details and CSRF tokens
result = session_requests.get(login_url, headers=headers)
# Use content instead of text for proper encoding handling
tree = html.fromstring(result.content)

# Extract hidden form fields (critical for validating the request);
# skip unnamed inputs and treat a missing value as an empty string
hidden_inputs = tree.xpath('//form//input[@type="hidden"][@name]')
form_data = {field.attrib["name"]: field.attrib.get("value", "") for field in hidden_inputs}

# Add your actual login credentials
form_data["nick"] = "your_actual_username"
form_data["password"] = "your_actual_password"
form_data["commit"] = "Anmelden"

# Get the form submission URL from the form's action attribute
form_action = tree.xpath('//form[@id="login-form"]/@action')[0]
# Resolve relative URLs against the login page URL
form_action = urljoin(login_url, form_action)

# Submit the form to the correct endpoint
login_response = session_requests.post(form_action, data=form_data, headers=headers)

# Verify the result
print(f"Response Status Code: {login_response.status_code}")
print("\nFirst 500 characters of response:")
print(login_response.text[:500])
```
Key Notes:
- Session Persistence: Using requests.session() ensures cookies (like the session cookie after successful login) are retained for subsequent requests to the site.
- Dynamic Form Action: By extracting the form's action attribute, you avoid hardcoding the wrong URL and adapt to any future changes the site might make.
- Encoding: Using result.content instead of result.text ensures you handle any character encoding correctly when parsing the HTML.
If you still run into issues, check the DevTools Network tab again to compare your script's request headers and form data against what the browser sends—sometimes sites require additional headers or have hidden fields you might have missed.
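The comparison can be done by eye, but a few lines of code make mismatches obvious. A minimal sketch, assuming you copy the browser's form fields from DevTools into a dict by hand; `diff_fields`, the `authenticity_token` name, and all values below are hypothetical.

```python
def diff_fields(browser, script):
    """Compare two name -> value mappings (form fields or headers).

    Returns the names the script is missing, the names only the script
    sends, and the names present in both but with different values.
    """
    missing = sorted(set(browser) - set(script))
    extra = sorted(set(script) - set(browser))
    changed = sorted(
        name for name in set(browser) & set(script)
        if browser[name] != script[name]
    )
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical values copied from DevTools versus what the script sends.
browser_form = {
    "authenticity_token": "abc123",
    "nick": "alice",
    "password": "secret",
    "commit": "Anmelden",
}
script_form = {
    "nick": "alice",
    "password": "secret",
    "commit": "Login",
}

print(diff_fields(browser_form, script_form))
# {'missing': ['authenticity_token'], 'extra': [], 'changed': ['commit']}
```

When comparing headers, lowercase the names on both sides first, since HTTP header names are case-insensitive.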
The question originally comes from Stack Exchange; asked by Gozy4.




