请求协助提取Steam社区市场页面中[[]]包裹的数据
[[]] from Steam Community Market Pages Hey there! Let's tackle this problem of pulling out the data wrapped in [[]] from the Steam Market page source you've fetched. Your current code does a great job of grabbing the page content—now we just need to zero in on those specific [[]] blocks instead of printing every line.
Basic Solution: Extract All [[]] Matches
The simplest way to get every instance of data wrapped in [[]] is using Python's re (regular expressions) module. Here's how to modify your code:
import requests import re sites = [ "https://steamcommunity.com/market/listings/730/AK-47%20%7C%20Redline%20%28Field-Tested%29" ] for url in sites: r = requests.get(url) page_source = r.text # Regex pattern to match anything inside [[ ]] (non-greedily) # We escape [ and ] with \ since they're special regex characters all_matches = re.findall(r'\[\[(.*?)\]\]', page_source) print(f"URL: {url}") print("\nExtracted data wrapped in [[]]:") for match in all_matches: print("-", match)
How This Works:
re.findall()scans the entire page source and returns every string that fits the pattern.- The pattern
r'\[\[(.*?)\]\]'breaks down like this:\[\[: Matches the literal[[characters(.*?): A non-greedy capture group that grabs all characters until the first]](this avoids merging multiple[[]]blocks into one big string)\]\]: Matches the literal]]characters
Targeted Solution: Extract Historical Transaction Data
If you're specifically after the historical price/transaction data (which is stored in a specific [[]] block on Steam Market pages), you can narrow your search to the relevant script tag. Steam stores this data in a variable like line1=[[...]], so we can target that directly:
import requests import re sites = [ "https://steamcommunity.com/market/listings/730/AK-47%20%7C%20Redline%20%28Field-Tested%29" ] for url in sites: r = requests.get(url) page_source = r.text # Search for the specific line1 variable containing historical data # re.DOTALL allows the . to match newlines, since the data spans multiple lines historical_match = re.search(r'var line1=\[\[(.*?)\]\];', page_source, re.DOTALL) print(f"URL: {url}") if historical_match: print("\nExtracted historical transaction data:") print(historical_match.group(1)) else: print("\nNo historical transaction data found in [[]] format.")
This will give you exactly the historical price data you're likely looking for, without pulling in any unrelated [[]] blocks from other parts of the page.
内容的提问来源于stack exchange,提问作者Emre Özel




