Technical help request: scraping a Plancke.io table with BeautifulSoup returns unexpected results
Fixing Your BedWars Stats Scraper
Let's break down what went wrong and fix your code step by step.
The Core Issue
Your code has two key mistakes that led to the unexpected output:
- Variable Misnaming & Logic Mix-Up: You used `row` to iterate over `soup.find_all(class_="table")`, but that call returns table elements, not table rows (`<tr>`). This meant you were trying to extract cells directly from tables instead of from rows within the correct table.
- Unfiltered Extraction: You tried to pull cells from all tables at once without targeting the specific rows that contain the Kills, Deaths, and K/D data.
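To make the table-versus-row distinction concrete, here is a minimal sketch on a made-up HTML snippet. The markup and values are invented for illustration and are not Plancke.io's actual page structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not the real Plancke.io page).
SAMPLE_HTML = """
<table class="table"><tr><td>Level</td><td>100</td></tr></table>
<table class="table">
  <tr><td>Kills</td><td>10</td></tr>
  <tr><td>Deaths</td><td>5</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Each match here is a whole <table> element, not a row.
tables = soup.find_all(class_="table")
table_names = [t.name for t in tables]

# Rows have to be requested from one specific table.
rows = tables[1].find_all("tr")
row_cells = [[td.get_text(strip=True) for td in tr.find_all("td")] for tr in rows]

print(table_names)
print(row_cells)
```

Looping over `tables` and calling it `row` hides the fact that each item still contains many rows, which is why the cells came out mixed together.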
Corrected Code
Here's a revised version that accurately grabs the stats you need:
```python
import requests
from bs4 import BeautifulSoup

username = "gamerboy80"
mode = "BedWars"

# Fetch and parse the page
r = requests.get(f'https://plancke.io/hypixel/player/stats/{username}#{mode}')
soup = BeautifulSoup(r.content, 'html.parser')

# Get the second table (this holds the BedWars core stats)
bedwars_stats_table = soup.find_all(class_="table")[1]

# Extract the specific stats we want
target_stats = {}
for row in bedwars_stats_table.find_all('tr'):
    cells = row.find_all('td')
    if len(cells) >= 2:
        stat_label = cells[0].get_text(strip=True)
        stat_value = cells[1].get_text(strip=True)
        # Only keep the stats we care about
        if stat_label in ["Kills", "Deaths", "K/D"]:
            target_stats[stat_label] = stat_value

# Format and print the result
result = [target_stats["Kills"], target_stats["Deaths"], target_stats["K/D"]]
print(result)
```
How This Works
- Target the Correct Table: We explicitly grab the second table (index `[1]`), which contains the BedWars core performance stats.
- Iterate Over Rows: We loop through each row (`<tr>`) in this table to find the relevant stats.
- Filter for Specific Stats: We check each row's first cell for the labels "Kills", "Deaths", and "K/D", then store their corresponding values from the second cell.
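The three steps above can also be wrapped in a small helper so that a missing label does not raise a `KeyError`. This is only a sketch: `extract_stats` is a hypothetical helper name, and the sample markup (including the header row and the "Final Kills" row) is invented to exercise the logic rather than copied from Plancke.io.

```python
from bs4 import BeautifulSoup


def extract_stats(table, wanted):
    """Return {label: value} for rows whose first cell is exactly in `wanted`."""
    stats = {}
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) < 2:
            continue  # skip header rows (<th> only) and malformed rows
        label = cells[0].get_text(strip=True)
        if label in wanted:
            stats[label] = cells[1].get_text(strip=True)
    return stats


# Hypothetical markup for illustration only.
SAMPLE = """
<table class="table">
  <tr><th>Stat</th><th>Value</th></tr>
  <tr><td>Kills</td><td>10</td></tr>
  <tr><td>Final Kills</td><td>3</td></tr>
  <tr><td>Deaths</td><td>5</td></tr>
</table>
"""

table = BeautifulSoup(SAMPLE, "html.parser").find(class_="table")
labels = ("Kills", "Deaths", "K/D")
stats = extract_stats(table, labels)

# Fall back to a placeholder when a label is absent from the table.
result = [stats.get(label, "N/A") for label in labels]
print(result)
```

Because the comparison is an exact match on the label, a row such as "Final Kills" is not mistaken for "Kills".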
Alternative: Direct Row Targeting
If you prefer a more direct approach, you can look up each stat by its label cell and read the value from the cell next to it. Note that searching for a `<tr>` with `string=...` does not work here: a row with several `<td>` children has no single string, so `find` returns `None`.

```python
# Get the second table
bedwars_stats_table = soup.find_all(class_="table")[1]

def stat_value(table, label):
    # Match the label cell exactly (so "Final Kills" is not confused with
    # "Kills"), then read the text of the next <td> in the same row.
    label_cell = table.find('td', string=lambda t: t and t.strip() == label)
    return label_cell.find_next_sibling('td').get_text(strip=True)

kills = stat_value(bedwars_stats_table, 'Kills')
deaths = stat_value(bedwars_stats_table, 'Deaths')
kd = stat_value(bedwars_stats_table, 'K/D')
print([kills, deaths, kd])
```
Either approach will output your desired result: ['5,437', '4,671', '1.16']
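The scraped values are strings with thousands separators. If you want to do arithmetic with them, a small conversion step along these lines may help; `to_number` is a hypothetical helper name, and the input is just the result list quoted above.

```python
def to_number(text):
    """Convert a string like '5,437' or '1.16' to int or float."""
    cleaned = text.replace(",", "")
    return float(cleaned) if "." in cleaned else int(cleaned)


raw = ['5,437', '4,671', '1.16']
kills, deaths, kd = (to_number(value) for value in raw)

# Recompute the ratio from the raw counts as a sanity check.
computed_kd = round(kills / deaths, 2)
print(kills, deaths, kd, computed_kd)
```

Recomputing K/D from Kills and Deaths is a quick way to confirm that the three values came from the same table.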
The question comes from Stack Exchange; asked by user15966297.




