
How to Batch-Convert Fixed-Format TXT Files into a CSV File with a Specified Structure Using a Python Script

Python Script for Batch TXT to CSV Conversion

Alright, let's tackle this head-on. Below is a complete Python script that batch-processes your 300 TXT files, extracts the required fields from the first 7 lines of each file, converts Used Time to seconds, and writes one row per file to a CSV file with the columns you specified.

import re
import csv
from glob import glob

def convert_used_time(time_str):
    """Convert time string like '1:09:308' to total seconds (e.g., 69.308s)"""
    parts = time_str.strip().split(':')
    total_seconds = 0.0
    if len(parts) == 3:
        # Format: minutes:seconds:milliseconds
        minutes = int(parts[0])
        seconds = int(parts[1])
        milliseconds = int(parts[2])
        total_seconds = minutes * 60 + seconds + milliseconds / 1000
    elif len(parts) == 2:
        # Fallback: treat a two-part value as seconds:milliseconds.
        # A non-numeric part raises ValueError, which the caller reports.
        seconds = int(parts[0])
        milliseconds = int(parts[1])
        total_seconds = seconds + milliseconds / 1000
    # Return formatted string with 3 decimal places
    return f"{total_seconds:.3f}s"

def extract_fields_from_txt(file_path):
    """Extract required fields from the first 7 lines of a TXT file"""
    # Map raw TXT field names to your desired CSV column names
    field_mapping = {
        'Name': 'Name',
        'Score': 'Score',
        'Used Time': 'Time',
        'Software Version': 'Software Ver',
        'Core Version': 'Core Ver',
        'AppID': 'AppID',
        'Key': 'Key',
        'REG Date': 'REG Date',
        'Expiry': 'Expiry',
        'MapName': 'MapName'
    }
    # Initialize empty values for all CSV columns
    extracted = {v: '' for v in field_mapping.values()}
    
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            # Read only the first 7 lines
            for _ in range(7):
                line = f.readline()
                if not line:
                    break  # Stop early if file has fewer than 7 lines
                # Match all key-value pairs in the current line (handles spaces around colon)
                matches = re.findall(r'(\w+(?:\s+\w+)*)\s*:\s*([^;]+)', line)
                for key, value in matches:
                    key = key.strip()
                    if key in field_mapping:
                        # Apply special formatting for Used Time
                        if key == 'Used Time':
                            extracted[field_mapping[key]] = convert_used_time(value)
                        else:
                            extracted[field_mapping[key]] = value.strip()
        return extracted
    except Exception as e:
        print(f"Error processing {file_path}: {str(e)}")
        return None

def main():
    # Configuration - update this to your TXT files directory
    txt_dir = './'  # Current directory; a custom path must end with a slash, e.g. 'C:/your_txt_files/'
    output_csv = 'output.csv'
    
    # Get all TXT files in the target directory, sorted so the row order is reproducible
    txt_files = sorted(glob(f"{txt_dir}*.txt"))
    if not txt_files:
        print("No TXT files found in the specified directory.")
        return
    
    # Define your desired CSV header order
    csv_headers = ['Name', 'Score', 'Time', 'Software Ver', 'Core Ver', 'AppID', 'Key', 'REG Date', 'Expiry', 'MapName']
    
    # Write extracted data to CSV
    with open(output_csv, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=csv_headers)
        writer.writeheader()
        
        for file in txt_files:
            print(f"Processing {file}...")
            fields = extract_fields_from_txt(file)
            if fields:
                writer.writerow(fields)
    
    print(f"Processing complete! Output saved to {output_csv}")

if __name__ == "__main__":
    main()
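
The time arithmetic is the easiest part of the script to check in isolation. The following is a minimal sketch, not part of the script above: `used_time_to_seconds` is a hypothetical helper that assumes Used Time always has the three-part minutes:seconds:milliseconds form and raises an error for anything else instead of silently returning 0.

```python
def used_time_to_seconds(time_str):
    """Convert 'minutes:seconds:milliseconds' (e.g. '1:09:308') to seconds as a float.

    Hypothetical helper for illustration; it assumes exactly three
    colon-separated integer parts and rejects every other shape.
    """
    parts = time_str.strip().split(':')
    if len(parts) != 3:
        raise ValueError(f"unexpected Used Time format: {time_str!r}")
    minutes, seconds, milliseconds = (int(p) for p in parts)
    return minutes * 60 + seconds + milliseconds / 1000


# 1 minute + 9 seconds + 308 milliseconds = 60 + 9 + 0.308
print(f"{used_time_to_seconds('1:09:308'):.3f}s")  # prints 69.308s
```

Failing loudly on an unexpected value is a design choice: with 300 files, a visible error is usually easier to track down than a row that quietly reads 0.000s.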

Key Details Explained

  • Flexible Field Extraction: Uses a regular expression to match key-value pairs (like Name: 321 or Core Version : 21.0.0.0) across the first 7 lines, handling optional spaces around colons for compatibility with minor formatting variations.
  • Time Conversion Logic: The convert_used_time function turns strings like 1:09:308 (minutes:seconds:milliseconds) into total seconds (e.g., 69.308s) and also accepts a two-part seconds:milliseconds value.
  • Batch Processing: Uses glob to automatically find all TXT files in your target directory, so no manual file listing is required.
  • Error Resilience: Catches any error raised while reading or parsing a file, prints it with the file name, and skips that file, so one problematic file does not stop the entire batch job.
  • Strict CSV Structure: Uses csv.DictWriter to ensure the output CSV follows your exact header order, even if fields appear in a different sequence in the source TXT files.
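
To see how the field-extraction pattern behaves, here is a small sketch that applies the same regular expression to a few lines. The sample lines and the `parse_pairs` helper are hypothetical (the real TXT layout may differ); only the pattern itself is taken from the script.

```python
import re

# Same pattern as in the script: a key of one or more words, optional spaces
# around the colon, then a value that runs to the next semicolon or line end.
PAIR_RE = re.compile(r'(\w+(?:\s+\w+)*)\s*:\s*([^;]+)')


def parse_pairs(line):
    """Return {key: value} for every 'key: value' pair found in one line."""
    return {key.strip(): value.strip() for key, value in PAIR_RE.findall(line)}


# Hypothetical sample lines, for illustration only.
print(parse_pairs("Name: 321"))                # {'Name': '321'}
print(parse_pairs("Core Version : 21.0.0.0"))  # {'Core Version': '21.0.0.0'}
print(parse_pairs("Used Time: 1:09:308"))      # {'Used Time': '1:09:308'}
print(parse_pairs("AppID: 42; Key: ABC"))      # {'AppID': '42', 'Key': 'ABC'}
```

Because the value part excludes only semicolons, colons inside a value (as in Used Time) are kept intact. The flip side is that several pairs on one line are split correctly only when they are separated by semicolons; with another separator such as a comma, everything after the first colon ends up in the first value.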

How to Use

  1. Save the script as txt_to_csv.py in the same folder as your TXT files (or update the txt_dir variable to point to your TXT directory).
  2. Run the script with Python:
    python txt_to_csv.py
    
  3. Once processing finishes, you'll find output.csv in the same directory with all your extracted and formatted data.
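
After a run, it can be worth checking the result for rows where a field was not found, since the script writes an empty string for any missing field. The sketch below is one possible check, not part of the original script; `find_incomplete_rows` is a hypothetical helper, and the file name output.csv matches the default used above.

```python
import csv
import os


def find_incomplete_rows(csv_path):
    """Return (row_number, [empty column names]) for each row with an empty field."""
    problems = []
    with open(csv_path, newline='', encoding='utf-8') as f:
        # Row numbers count data rows, starting at 1 after the header.
        for row_number, row in enumerate(csv.DictReader(f), start=1):
            empty = [name for name, value in row.items() if not value]
            if empty:
                problems.append((row_number, empty))
    return problems


# Only run the report if the conversion script has already produced its output.
if os.path.exists('output.csv'):
    for row_number, empty in find_incomplete_rows('output.csv'):
        print(f"Row {row_number}: empty fields {empty}")
```

A row with many empty fields usually means the corresponding TXT file uses different field names or a different separator than the pattern expects.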

The question originates from Stack Exchange; asked by aiorbits Hans.
