在Jupyter Notebook中导入多CSV文件，提取Market Cap列并重命名为文件名

阿华AIGC实验室

2026-5-19

Solution for Merging Cryptocurrency Market Cap Data in Jupyter Notebook

Let's break down exactly how to import, clean, and merge those 100 CSV files into a single, usable DataFrame. Here's a step-by-step approach with code:

Step 1: Prepare Your Environment & File Setup

First, make sure you've downloaded all the CSV files from the dataset and saved them in a dedicated folder (e.g., crypto_market_data) on your local machine. Then, install/import the required libraries:

# Install pandas if you haven't already (run once)
!pip install pandas

# Import necessary libraries
import pandas as pd
import os

Step 2: Define Your Data Folder Path

Set the path to the folder where you stored the CSV files. For example:

# Replace this with your actual folder path
data_folder = "./crypto_market_data"

Step 3: Loop Through Files & Build Merged DataFrame

We'll iterate over each CSV, extract only the Market Cap column (with timestamp as index), rename the column to the crypto's filename, and merge everything together:

# Initialize an empty DataFrame to hold all merged data
merged_market_cap = pd.DataFrame()

# Loop through each file in the data folder
for filename in os.listdir(data_folder):
    # Only process CSV files
    if filename.endswith(".csv"):
        # Full path to the current file
        file_path = os.path.join(data_folder, filename)
        
        # Read the CSV, set the timestamp column as index (adjust 'Date' if your timestamp column has a different name)
        df = pd.read_csv(file_path, index_col="Date", parse_dates=True)
        
        # Keep only the 'Market Cap' column
        df = df[["Market Cap"]]
        
        # Rename the column to the crypto name (remove .csv from filename)
        crypto_name = filename.replace(".csv", "")
        df.rename(columns={"Market Cap": crypto_name}, inplace=True)
        
        # Merge with the main DataFrame (use outer join to preserve all timestamps across cryptos)
        if merged_market_cap.empty:
            merged_market_cap = df
        else:
            merged_market_cap = merged_market_cap.join(df, how="outer")

Step 4: (Optional) Clean Up Missing Values

Depending on the data, you might have missing entries where a crypto didn't exist on a particular date. You can handle this with methods like forward fill or drop missing values:

# Example: Forward fill missing values (carry over the last known market cap)
merged_market_cap.fillna(method="ffill", inplace=True)

# Or drop rows with any missing values (use cautiously, as this removes data)
# merged_market_cap.dropna(inplace=True)

Key Notes & Troubleshooting

Timestamp Column Name: If your CSV files use a different name for the timestamp column (e.g., "Timestamp" instead of "Date"), update the index_col parameter in pd.read_csv() accordingly.
Filename Cleanup: If filenames have extra characters (like hyphens or spaces), you can adjust the crypto_name line to clean them up (e.g., crypto_name = filename.replace(".csv", "").replace("-", "_")).
Memory Considerations: 100 files might take up some memory—if you run into issues, consider processing batches or filtering dates before merging.

内容的提问来源于stack exchange，提问作者Juan Rodriguez