Screening Stocks by Technical Indicators in Python: Help with a Code Error and a Logic Problem
Problem analysis and solution
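First, the error ValueError: invalid literal for int() with base 10: 'Revenue' is unambiguous: you wrote int('Revenue'), which tries to convert the string 'Revenue' itself to an integer, and that cannot work. What you actually want is to fetch the value stored under 'Revenue' in the technicals dict and test that value.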
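As a minimal sketch of the difference, the snippet below reproduces the error and then performs the dictionary lookup instead. The technicals dict here is a hand-written stand-in for the scraped data, not real scraper output.

```python
# Hand-written stand-in for the scraped dict (assumption, not real scraper output)
technicals = {'Revenue': '383.31B'}

# What the failing code did: convert the key name itself to an integer
try:
    int('Revenue')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# What was intended: look up the value stored under that key
revenue_text = technicals.get('Revenue', '0')

print(error_message)  # invalid literal for int() with base 10: 'Revenue'
print(revenue_text)   # 383.31B
```

The lookup still yields a string, so it needs one more conversion step before it can be compared with a number.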
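In addition, the revenue value returned by Yahoo Finance is a string with a magnitude suffix (for example, "383.31B" means 383.31 billion USD), so it cannot be converted to an integer directly. Parse the suffix first (B = billion, M = million, K = thousand), convert the string to the corresponding number, and only then compare it with the 100 billion USD threshold (100 * 10^9).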
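The conversion on its own can be sketched with a small lookup table. This is an illustrative variant rather than the function used in the full listing: the names to_number, SUFFIX_MULTIPLIERS and THRESHOLD are hypothetical, and the 'T' (trillion) entry is an assumption about what else the page may display.

```python
# Hypothetical helper for illustration; these names are not from the original code.
SUFFIX_MULTIPLIERS = {
    'K': 10**3,
    'M': 10**6,
    'B': 10**9,
    'T': 10**12,  # assumption: trillions may also appear, e.g. for market cap
}

THRESHOLD = 100 * 10**9  # 100 billion USD


def to_number(text):
    """Convert a string such as '383.31B' or '1,234' to a float; 0.0 if unparseable."""
    cleaned = (text or '').replace(',', '').strip()
    if not cleaned:
        return 0.0
    # Unknown last characters (digits, '%', letters not in the table) map to 1
    multiplier = SUFFIX_MULTIPLIERS.get(cleaned[-1].upper(), 1)
    number_part = cleaned[:-1] if multiplier != 1 else cleaned
    try:
        return float(number_part) * multiplier
    except ValueError:
        return 0.0


print(to_number('383.31B') > THRESHOLD)  # True: 383.31 billion exceeds 100 billion
print(to_number('12.5M') > THRESHOLD)    # False
```

Returning 0.0 for unparseable input keeps a screening loop running, at the cost of silently treating bad data as "below threshold".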
Complete fixed code
    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2, as in the original question
    import time

    from bs4 import BeautifulSoup


    def parse_financial_value(value_str):
        """Parse a financial string with a suffix (e.g. 383.31B, 12.5M) into a number."""
        if not value_str:
            return 0
        # Strip thousands separators
        cleaned_str = value_str.replace(',', '')
        try:
            if 'B' in cleaned_str:
                return float(cleaned_str.replace('B', '')) * 10**9
            elif 'M' in cleaned_str:
                return float(cleaned_str.replace('M', '')) * 10**6
            elif 'K' in cleaned_str:
                return float(cleaned_str.replace('K', '')) * 10**3
            else:
                return float(cleaned_str)
        except ValueError:
            # Return 0 on a parse failure so the program does not crash
            return 0


    def scrape_yahoo(stock):
        technicals = {}
        try:
            url = 'http://finance.yahoo.com/q/ks?s=' + stock
            page = urlopen(url)
            soup = BeautifulSoup(page, 'html.parser')
            tables = soup.find_all('table', {'class': 'table-qsp-stats'})
            for table in tables:
                table_body = table.find('tbody')
                rows = table_body.find_all('tr')
                for row in rows:
                    col_name = [cell.text.strip() for cell in row.find_all('span')]
                    col_val = [cell.text.strip() for cell in row.find_all('td')]
                    # Skip rows without a label or without a value cell
                    if col_name and len(col_val) >= 2:
                        technicals[col_name[0]] = col_val[1]
            return technicals
        except Exception as e:
            print('Failed to scrape {}: {}'.format(stock, str(e)))
            return technicals


    def scrape(stock_list, interested):
        # Keep the data for every qualifying stock, not just the last one
        all_technicals = {}
        for each_stock in stock_list:
            technicals = scrape_yahoo(each_stock)
            # Look up the revenue string and parse it
            revenue = parse_financial_value(technicals.get('Revenue', '0'))
            # Check whether revenue exceeds 100 billion USD (100 * 10^9)
            if revenue > 100 * 10**9:
                print(each_stock)
                for ind in interested:
                    print("{}: {}".format(ind, technicals.get(ind, 'N/A')))
                print("------")
                all_technicals[each_stock] = technicals
            time.sleep(1)
        return all_technicals


    def main():
        stock_list = ['aapl', 'tsla', 'ge']
        interested = ['Market Cap (intraday)', 'Return on Equity', 'Revenue',
                      'Quarterly Revenue Growth']
        tech = scrape(stock_list, interested)
        print("Stocks meeting the criteria: {}".format(tech))


    if __name__ == "__main__":
        main()
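Key changes explained
Fixed the type conversion error:
The erroneous int('Revenue') is replaced by technicals.get('Revenue', '0'), which fetches the actual revenue string; parse_financial_value then converts that string to a number.
Added a financial value parser:
parse_financial_value handles financial figures that carry a magnitude suffix, for example converting "383.31B" to 383310000000.0 (383.31 billion USD), so the value can be compared accurately against the 100 billion USD threshold.
Added exception handling:
- Number parsing is wrapped in a try/except so that unexpectedly formatted data does not crash the program;
- During scraping, each row is checked to make sure col_name is non-empty and col_val has at least two entries, which prevents index-out-of-range errors.
Improved data storage:
The scrape function now stores every qualifying stock in the all_technicals dict and returns it, rather than returning only the last stock's data, which better matches the actual need.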
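The filter-and-collect pattern can be tried without network access by swapping the scraper for canned data. In the sketch below, the tickers, the figures, and the names FAKE_PAGES, fake_scrape, billions_only and screen are all made up for illustration; only the shape of the loop mirrors scrape.

```python
# Made-up tickers and figures, for illustration only
FAKE_PAGES = {
    'AAA': {'Revenue': '250B'},
    'BBB': {'Revenue': '40B'},
    'CCC': {'Revenue': '120B'},
}

THRESHOLD = 100 * 10**9  # 100 billion USD


def fake_scrape(ticker):
    """Stand-in for scrape_yahoo: returns canned data instead of fetching a page."""
    return FAKE_PAGES.get(ticker, {})


def billions_only(text):
    """Simplified parser for this demo: handles only a trailing 'B'."""
    return float(text.rstrip('B')) * 10**9


def screen(tickers, fetch, parse):
    """Return {ticker: data} for every ticker whose revenue exceeds THRESHOLD."""
    matches = {}
    for ticker in tickers:
        data = fetch(ticker)
        if parse(data.get('Revenue', '0')) > THRESHOLD:
            matches[ticker] = data  # keep each match, not just the last one
    return matches


result = screen(['AAA', 'BBB', 'CCC'], fake_scrape, billions_only)
print(sorted(result))  # ['AAA', 'CCC']
```

Passing the fetch and parse steps in as arguments is a design choice made here so the loop can be exercised offline; the listing above calls scrape_yahoo and parse_financial_value directly.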
The question originally comes from Stack Exchange; it was asked by alskdjf.




