Help needed: CSV file is not created in the specified folder, although the terminal prints all the expected output

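Hi everyone. I wrote some Python code that scrapes GDP data by country from Wikipedia and generates a CSV file of the top 10 economies. The problem is that the terminal prints every log message as expected (page fetched successfully, table contents, the save confirmation), yet the generated CSV file is nowhere to be found at the specified path.

Here is my complete code: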

import requests
import pandas as pd
import numpy as np
import os
print("Saving file to:", os.getcwd())


# Define the URL
URL = "https://web.archive.org/web/20230902185326/https://en.wikipedia.org/wiki/List_of_countries_by_GDP_%28nominal%29"

# Fetch the HTML content of the webpage
response = requests.get(URL)
if response.status_code == 200:
    print("Successfully fetched the page!")
    html_content = response.text
else:
    print(f"Failed to fetch the page. Status code: {response.status_code}")
    exit()

# Extract tables from the webpage
try:
    tables = pd.read_html(html_content)
    if not tables:
        print("No tables found in the HTML content.")
        exit()
except ValueError as e:
    print(f"Error reading HTML tables: {e}")
    exit()

# Inspect all extracted tables
for i, table in enumerate(tables):
    print(f"Table {i}:")
    print(table.head())
    print("\n")

# Select the required table (adjust index if necessary)
df = tables[3]  # Replace 3 with the correct index if needed
print("Selected table:")
print(df.head())

# Dynamically rename and inspect columns
df.columns = range(df.shape[1])  # Replace headers with numerical indices
print("Columns after renaming:", df.columns)

# Handle missing columns dynamically
if 2 in df.columns:
    df = df[[0, 2]]  # Select columns 0 and 2
else:
    print("Column 2 not found. Available columns:", df.columns)
    exit()

# Retain rows for the top 10 economies
df = df.iloc[1:11, :]

# Rename columns
df.columns = ['Country', 'GDP (Million USD)']

# Convert GDP from Million USD to Billion USD and round to 2 decimal places
df['GDP (Million USD)'] = df['GDP (Million USD)'].astype(float)
df['GDP (Million USD)'] = np.round(df['GDP (Million USD)'] / 1000, 2)

# Rename the column header to 'GDP (Billion USD)'
df.rename(columns={'GDP (Million USD)': 'GDP (Billion USD)'}, inplace=True)

# Save the DataFrame to a CSV file
df.to_csv(r"C:\Users\Path\Largest_economies.csv", index=False)

print("The top 10 economies by GDP have been saved to 'Largest_economies.csv'.")

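I have already tried the following troubleshooting steps, but none of them solved the problem:

  • Verified the current working directory with os.getcwd()
  • Explicitly passed the full file path to to_csv()
  • Tried saving to a simpler location, such as the Desktop, to test write permissions
  • Wrapped to_csv() in a try-except block to catch any errors
  • Made sure the target folder exists and the encoding is set correctly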
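
The checks listed above can be combined into one small helper. The following is only a minimal sketch, not the code from the question: `save_and_verify` is a hypothetical name, the sample frame holds made-up placeholder values, and a temporary folder stands in for the real target path. It resolves the path to an absolute one, confirms the folder exists, writes the CSV, then checks that the file is really on disk and prints where it landed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_and_verify(df: pd.DataFrame, target: Path) -> Path:
    """Write df to target as CSV and confirm the file exists afterwards.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Resolve to an absolute path so the printed location is unambiguous.
    target = Path(target).expanduser().resolve()

    # to_csv() does not create missing folders, so check this up front.
    if not target.parent.is_dir():
        raise FileNotFoundError(f"Target folder does not exist: {target.parent}")

    # Let any OSError (permissions, locked file, ...) propagate to the caller.
    df.to_csv(target, index=False, encoding="utf-8")

    # Confirm the write on disk instead of trusting a print statement.
    if not target.is_file():
        raise RuntimeError(f"to_csv() returned but no file was found at {target}")

    print(f"Wrote {target.stat().st_size} bytes to {target}")
    return target


# Placeholder data (assumption): made-up values, not real GDP figures.
sample = pd.DataFrame(
    {"Country": ["Country A", "Country B"], "GDP (Billion USD)": [1.0, 2.0]}
)

# A temporary folder stands in for the real destination (assumption).
out_dir = Path(tempfile.mkdtemp())
saved_path = save_and_verify(sample, out_dir / "Largest_economies.csv")
```

If the printed absolute path differs from the folder being inspected, the file was written somewhere other than expected. If the helper raises instead, the exception message points at the check that failed.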

Can anyone help me figure out where the problem is?

Note: this content comes from Stack Exchange; the original question was asked by Dion H.
