You need to enable JavaScript to run this app.
最新活动
大模型
产品
解决方案
定价
生态与合作
支持与服务
开发者
了解我们

在Jupyter Notebook中导入多CSV文件,提取Market Cap列并重命名为文件名

Solution for Merging Cryptocurrency Market Cap Data in Jupyter Notebook

Let's break down exactly how to import, clean, and merge those 100 CSV files into a single, usable DataFrame. Here's a step-by-step approach with code:

Step 1: Prepare Your Environment & File Setup

First, make sure you've downloaded all the CSV files from the dataset and saved them in a dedicated folder (e.g., crypto_market_data) on your local machine. Then, install/import the required libraries:

# Install pandas if you haven't already (run once)
!pip install pandas

# Import necessary libraries
import pandas as pd
import os

Step 2: Define Your Data Folder Path

Set the path to the folder where you stored the CSV files. For example:

# Replace this with your actual folder path
data_folder = "./crypto_market_data"

Step 3: Loop Through Files & Build Merged DataFrame

We'll iterate over each CSV, extract only the Market Cap column (with timestamp as index), rename the column to the crypto's filename, and merge everything together:

# Initialize an empty DataFrame to hold all merged data
merged_market_cap = pd.DataFrame()

# Loop through each file in the data folder
for filename in os.listdir(data_folder):
    # Only process CSV files
    if filename.endswith(".csv"):
        # Full path to the current file
        file_path = os.path.join(data_folder, filename)
        
        # Read the CSV, set the timestamp column as index (adjust 'Date' if your timestamp column has a different name)
        df = pd.read_csv(file_path, index_col="Date", parse_dates=True)
        
        # Keep only the 'Market Cap' column
        df = df[["Market Cap"]]
        
        # Rename the column to the crypto name (remove .csv from filename)
        crypto_name = filename.replace(".csv", "")
        df.rename(columns={"Market Cap": crypto_name}, inplace=True)
        
        # Merge with the main DataFrame (use outer join to preserve all timestamps across cryptos)
        if merged_market_cap.empty:
            merged_market_cap = df
        else:
            merged_market_cap = merged_market_cap.join(df, how="outer")

Step 4: (Optional) Clean Up Missing Values

Depending on the data, you might have missing entries where a crypto didn't exist on a particular date. You can handle this with methods like forward fill or drop missing values:

# Example: Forward fill missing values (carry over the last known market cap)
merged_market_cap.fillna(method="ffill", inplace=True)

# Or drop rows with any missing values (use cautiously, as this removes data)
# merged_market_cap.dropna(inplace=True)

Key Notes & Troubleshooting

  • Timestamp Column Name: If your CSV files use a different name for the timestamp column (e.g., "Timestamp" instead of "Date"), update the index_col parameter in pd.read_csv() accordingly.
  • Filename Cleanup: If filenames have extra characters (like hyphens or spaces), you can adjust the crypto_name line to clean them up (e.g., crypto_name = filename.replace(".csv", "").replace("-", "_")).
  • Memory Considerations: 100 files might take up some memory—if you run into issues, consider processing batches or filtering dates before merging.

内容的提问来源于stack exchange,提问作者Juan Rodriguez

火山引擎 最新活动