Help with extracting td elements and table data using BeautifulSoup
Hey there! Let's work through your two questions and get that TAIR value extracted properly.
1. Why does soup.find() only return the first <td class="td11"> element?
The find() method in BeautifulSoup only returns the first matching element it finds in the HTML structure. That's why you're only seeing the first td.td11 (with psbA inside) instead of all matching elements. To get every td.td11 on the page, you need to use find_all() instead, which returns a list of all matches.
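To see the difference in isolation, here is a minimal sketch run against a made-up HTML snippet. The markup and the EXAMPLE_ID value are assumptions for illustration only; the real page is more complex.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature page (an assumption, not the real markup).
html = """
<table>
  <tr><td class="td11">psbA</td></tr>
  <tr><td class="td11">some description</td></tr>
  <tr><td class="td11"><table><tr><td>TAIR:</td><td>EXAMPLE_ID</td></tr></table></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns a single Tag (or None when nothing matches).
first = soup.find("td", {"class": "td11"})

# find_all() returns a list-like ResultSet with every match.
all_cells = soup.find_all("td", {"class": "td11"})

first_text = first.get_text(strip=True)
cell_count = len(all_cells)
print(first_text)
print(cell_count)
```

The inner cells of the nested table carry no td11 class in this sample, so they are not counted.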
2. Why does AGI = AGI.table return None?
When you run AGI = soup.find("td", {"class":"td11"}), you're grabbing the first td.td11 element. If you inspect that element's HTML, it doesn't contain a <table> tag, which is why calling .table on it returns None.
Your target TAIR value lives inside the third td.td11 element, and that specific td does contain a table. So you need to first target that third element, then access its table content.
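A small sketch of that behaviour, again on a made-up snippet (the markup and EXAMPLE_ID are assumptions, not the real page): .table is None on a td with no nested table, and a Tag on the one that has it.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature page (an assumption, not the real markup).
html = """
<table>
  <tr><td class="td11">psbA</td></tr>
  <tr><td class="td11">some description</td></tr>
  <tr><td class="td11"><table><tr><td>TAIR:</td><td>EXAMPLE_ID</td></tr></table></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td", class_="td11")

# Attribute access like .table is shorthand for .find("table"):
# it yields None when the element has no <table> descendant.
first_has_table = cells[0].table is not None
third_has_table = cells[2].table is not None

# Only the third cell holds a nested table, so read the values from there.
inner_cells = [td.get_text(strip=True) for td in cells[2].table.find_all("td")]
print(first_has_table, third_has_table, inner_cells)
```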
Modified Working Code
Here's the adjusted code to extract the TAIR value correctly:
    import requests
    from bs4 import BeautifulSoup as BS

    my_protein_list = ["ArthCp002"]

    for protein in my_protein_list:
        response = requests.get('https://www.genome.jp/dbget-bin/www_bget?ath:' + protein)
        response.raise_for_status()  # Raise error if request fails
        soup = BS(response.text, 'html.parser')

        # Get all td elements with class td11
        all_td11 = soup.find_all("td", {"class": "td11"})

        if len(all_td11) >= 3:
            # Target the third td (index 2, since Python uses 0-based indexing)
            target_td = all_td11[2]
            # Find the table inside this td
            tair_table = target_td.find("table")

            if tair_table:
                # Extract the TAIR value (it's in the second row's second column)
                tair_value = tair_table.find_all("tr")[1].find_all("td")[1].text.strip()
                print(f"TAIR Value: {tair_value}")
            else:
                print("Table containing TAIR value not found in the third td.")
        else:
            print("Not enough td.td11 elements found on the page.")
Key Changes Explained:
- Used find_all() to get all td.td11 elements instead of just the first.
- Targeted the third element in the list (index 2) where the TAIR data resides.
- Accessed the table inside that specific td, then navigated to the row/column containing the TAIR value.
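- Added checks for a missing third td.td11 or a missing table, so those cases print a message instead of raising an error. Note that the row and column lookups (the two [1] indexes) are still unguarded and will raise an IndexError if the table is shorter than expected.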
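If you also want the row and column lookups to fail softly, one option is a small helper like the sketch below. The name cell_text is hypothetical, not part of BeautifulSoup, and the sample table and its values are made up for illustration.

```python
from bs4 import BeautifulSoup


def cell_text(table, row_index, col_index):
    """Return the stripped text of one cell, or None if the row or column is missing.

    Hypothetical helper for illustration; not a BeautifulSoup API.
    """
    rows = table.find_all("tr")
    if row_index >= len(rows):
        return None
    cols = rows[row_index].find_all("td")
    if col_index >= len(cols):
        return None
    return cols[col_index].get_text(strip=True)


# Made-up table standing in for the one inside the third td.td11.
html = """
<table>
  <tr><td>NCBI-GeneID:</td><td>000000</td></tr>
  <tr><td>TAIR:</td><td>EXAMPLE_ID</td></tr>
</table>
"""
table = BeautifulSoup(html, "html.parser").table

found = cell_text(table, 1, 1)    # second row, second column
missing = cell_text(table, 5, 1)  # row does not exist, so None
print(found, missing)
```

In the working code above, tair_value = cell_text(tair_table, 1, 1) followed by a None check would replace the chained [1] lookups.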
This question originates from Stack Exchange; asked by FlexingWater.




