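How to write multiple dictionaries to CSV, and write grouped aggregation results to CSV, using the Python standard library

I'll answer your two questions about handling CSV with the Python standard library in two parts:

Problem 1: Writing multiple dictionaries to a CSV file

csv.DictWriter from the Python standard library is the most convenient tool for this; it is designed specifically for turning dictionaries into CSV rows. Two common scenarios are covered below:

Scenario 1: All dictionaries share the same structure (identical keys)

If every dictionary in the list has the same keys (which become the CSV header), take the keys of the first dictionary as the header and write all rows in one call: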

import csv

# Sample list of dictionaries
dict_list = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"}
]

# The with statement opens the file and closes it automatically
with open("output.csv", "w", newline="") as f:
    # Use the first dictionary's keys as the header
    writer = csv.DictWriter(f, fieldnames=list(dict_list[0]))
    writer.writeheader()  # write the header row
    writer.writerows(dict_list)  # write all dictionaries in one call
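
To see what DictWriter produces without touching the disk, the same kind of rows can be written to an in-memory buffer. This is only a minimal sketch: io.StringIO stands in for the output file, and the names buffer and csv_text are illustrative rather than part of the original answer.

```python
import csv
import io

dict_list = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "London"},
]

# io.StringIO is an in-memory text stream used here in place of a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(dict_list[0]))
writer.writeheader()
writer.writerows(dict_list)

csv_text = buffer.getvalue()
# The csv module terminates each row with "\r\n" by default,
# which is why files should be opened with newline="".
print(csv_text.splitlines())
# ['name,age,city', 'Alice,30,New York', 'Bob,25,London']
```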

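Scenario 2: Dictionaries with differing structure (keys vary)

If the keys are not uniform, first collect every unique key to use as the header so that no data is lost. Keys that a given dictionary lacks are written as empty strings, which is DictWriter's default restval: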

import csv

dict_list = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"}
]

# Collect the keys of every dictionary, de-duplicate them, then sort
# so the header order is stable (alphabetical: age, city, name)
all_fields = set()
for d in dict_list:
    all_fields.update(d.keys())
all_fields = sorted(all_fields)

with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=all_fields)
    writer.writeheader()
    writer.writerows(dict_list)
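
Sorting yields an alphabetical header, which may not be the column order you want. As a hedged alternative, the sketch below keeps keys in first-seen order using dict.fromkeys and marks missing values with restval. The "N/A" placeholder is an arbitrary choice for illustration, and io.StringIO again stands in for a real file.

```python
import csv
import io

dict_list = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"},
]

# dict.fromkeys de-duplicates while preserving first-seen order.
fieldnames = list(dict.fromkeys(key for d in dict_list for key in d))

buffer = io.StringIO()
# restval is written for any field a row does not contain
# (the default is an empty string).
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="N/A")
writer.writeheader()
writer.writerows(dict_list)

rows = buffer.getvalue().splitlines()
print(rows)
# ['name,age,city', 'Alice,30,N/A', 'Bob,N/A,London', 'Charlie,35,Paris']
```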
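
Problem 2: Writing grouped aggregation results to CSV

You have already written the core group-and-sum logic; what remains is to combine the data from the four aggregation dictionaries and write it to CSV in the required format. Two implementations follow: the first extends your existing code, and the second is a more compact, optimized version:

Approach 1: Extending your existing code

Iterate over the keys of the existing agg1 to agg4 dictionaries, unpack each tuple key, and write one row per group: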

import csv

# Your existing code (a with statement is the safer way to manage the file)
agg1 = {}
agg2 = {}
agg3 = {}
agg4 = {}
with open("my_file.csv", "r", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header; more reliable than inspecting row[0]
    for row in reader:
        key = (row[0], row[2])
        agg1[key] = agg1.get(key, 0.0) + float(row[4])
        agg2[key] = agg2.get(key, 0.0) + float(row[5])
        agg3[key] = agg3.get(key, 0.0) + float(row[6])
        agg4[key] = agg4.get(key, 0.0) + float(row[7])

# Write the aggregated results
output_header = ["item1", "item3", "agg1", "agg2", "agg3", "agg4"]
with open("aggregated_output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(output_header)  # write the header row

    # Iterate over the group keys (agg1's keys suffice; all four dictionaries share the same keys)
    for key in agg1:
        item1_val, item3_val = key
        # Look up each aggregate; wrap in int(), e.g. int(agg1[key]), if you need integers
        row_data = [item1_val, item3_val, agg1[key], agg2[key], agg3[key], agg4[key]]
        writer.writerow(row_data)
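
The grouping step can be checked on a small in-memory sample before running it against a real file. In the sketch below, the column names (item1 to item4, v1 to v4) and all values are made up for illustration, since the original input file is not shown. It also swaps in collections.defaultdict(float) as an alternative to the explicit default lookup used above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample input: columns 0 and 2 form the group key,
# columns 4-7 hold the numbers to sum.
sample = io.StringIO(
    "item1,item2,item3,item4,v1,v2,v3,v4\n"
    "a,x,p,k,1,2,3,4\n"
    "a,y,p,k,10,20,30,40\n"
    "b,x,q,k,5,6,7,8\n"
)

# One defaultdict per summed column; a missing key starts at 0.0.
aggs = [defaultdict(float) for _ in range(4)]

reader = csv.reader(sample)
next(reader)  # skip the header row
for row in reader:
    key = (row[0], row[2])
    for agg, value in zip(aggs, row[4:8]):
        agg[key] += float(value)

for key in aggs[0]:
    print(key, [agg[key] for agg in aggs])
# ('a', 'p') [11.0, 22.0, 33.0, 44.0]
# ('b', 'q') [5.0, 6.0, 7.0, 8.0]
```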

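Approach 2: Optimized version (merged aggregation logic)

The four aggregation dictionaries can be merged into one, where each key maps to a list of the four running sums, which makes the code more compact: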

import csv

# A single dictionary holds every aggregate: the key is (item1, item3),
# the value is [v1_sum, v2_sum, v3_sum, v4_sum]
aggregations = {}
with open("my_file.csv", "r", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        key = (row[0], row[2])
        # Convert v1-v4 to a list of floats
        current_values = list(map(float, row[4:8]))
        if key not in aggregations:
            aggregations[key] = current_values
        else:
            # Add the values position by position
            aggregations[key] = [a + b for a, b in zip(aggregations[key], current_values)]

# Write the CSV
output_header = ["item1", "item3", "agg1", "agg2", "agg3", "agg4"]
with open("aggregated_output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(output_header)
    for (item1_val, item3_val), agg_vals in aggregations.items():
        writer.writerow([item1_val, item3_val] + agg_vals)
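
For reuse, the read, aggregate, and write steps could be wrapped in a function and exercised end to end in a temporary directory. This is a sketch under assumptions: the function name aggregate_csv, its parameters, and the sample data are hypothetical, and the column positions simply mirror the ones used above. Unlike the code above, it reuses the input's own column names for the output header rather than agg1 to agg4.

```python
import csv
import os
import tempfile


def aggregate_csv(in_path, out_path, key_cols=(0, 2), value_cols=(4, 5, 6, 7)):
    """Sum value_cols per key_cols group and write one row per group."""
    totals = {}
    with open(in_path, "r", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        for row in reader:
            key = tuple(row[i] for i in key_cols)
            values = [float(row[i]) for i in value_cols]
            if key in totals:
                totals[key] = [a + b for a, b in zip(totals[key], values)]
            else:
                totals[key] = values

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        # Reuse the input's column names for the output header.
        out_cols = tuple(key_cols) + tuple(value_cols)
        writer.writerow([header[i] for i in out_cols])
        for key, sums in totals.items():
            writer.writerow(list(key) + sums)
    return totals


# Round trip on made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.csv")
    dst = os.path.join(tmp, "out.csv")
    with open(src, "w", newline="") as f:
        csv.writer(f).writerows([
            ["item1", "item2", "item3", "item4", "v1", "v2", "v3", "v4"],
            ["a", "x", "p", "k", 1, 2, 3, 4],
            ["a", "y", "p", "k", 10, 20, 30, 40],
        ])
    totals = aggregate_csv(src, dst)
    with open(dst, newline="") as f:
        written = list(csv.reader(f))

print(written)
# [['item1', 'item3', 'v1', 'v2', 'v3', 'v4'], ['a', 'p', '11.0', '22.0', '33.0', '44.0']]
```

Values read back from a CSV file are always strings, so the sums appear as '11.0' rather than 11.0.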

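Both approaches produce the output you expect; the optimized version removes the repetition across four dictionaries and is easier to maintain.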

The question originates from Stack Exchange; it was asked by Omega.
