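How to write multiple dictionaries to CSV, and write grouped aggregation results to CSV, using only the Python standard library
I'll address these two standard-library CSV questions in two parts:
Question 1: Writing multiple dictionaries to a CSV file
csv.DictWriter from the Python standard library is the most convenient tool for this, since it is designed specifically for turning dictionaries into CSV rows. Two common scenarios are covered below:
Scenario 1: All dictionaries share the same structure (identical keys)
If every dictionary in your list has the same keys (which become the CSV header), take the keys of the first dictionary as the header and write all rows in one call: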
    import csv

    # Sample list of dictionaries
    dict_list = [
        {"name": "Alice", "age": 30, "city": "New York"},
        {"name": "Bob", "age": 25, "city": "London"},
        {"name": "Charlie", "age": 35, "city": "Paris"},
    ]

    # The with statement opens the file and closes it automatically
    with open("output.csv", "w", newline="") as f:
        # Use the keys of the first dictionary as the header
        writer = csv.DictWriter(f, fieldnames=list(dict_list[0].keys()))
        writer.writeheader()         # write the header row
        writer.writerows(dict_list)  # write all dictionaries in one call
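One detail worth checking after writing is what comes back when the file is read again. The following is a minimal sketch, not part of the original answer: it assumes an in-memory io.StringIO buffer in place of a real file (so it runs without touching disk) and a shortened two-row version of the sample data.

```python
import csv
import io

rows = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "London"},
]

# Assumption: an in-memory buffer stands in for output.csv.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same text back as dictionaries.
buffer.seek(0)
read_back = list(csv.DictReader(buffer))

# The csv module does not preserve types: every value comes back as a string.
print(read_back[0]["age"], type(read_back[0]["age"]))
```

If you need numbers after reading, convert them explicitly, for example with int(row["age"]).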
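Scenario 2: Dictionaries with differing structures (keys vary)
If the keys are not uniform, first collect every unique key and use the full set as the header. With the default settings, DictWriter raises ValueError for a row containing a key that is not in the header, and writes an empty string for any header key a row lacks: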
    import csv

    dict_list = [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "city": "London"},
        {"name": "Charlie", "age": 35, "city": "Paris"},
    ]

    # Collect the keys of every dictionary, de-duplicate them,
    # and sort them so the header order is stable
    all_fields = set()
    for d in dict_list:
        all_fields.update(d.keys())
    all_fields = sorted(all_fields)

    with open("output.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=all_fields)
        writer.writeheader()
        writer.writerows(dict_list)
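Sorting gives an alphabetical header (age, city, name), which may not be the order you want. As a hedged alternative, the sketch below keeps keys in the order they are first seen and fills gaps with a visible placeholder. The "N/A" placeholder and the in-memory buffer are assumptions made for illustration; restval is a documented DictWriter parameter.

```python
import csv
import io

rows = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"},
]

# dict.fromkeys keeps insertion order (Python 3.7+), so this de-duplicates
# the keys while preserving the order in which they first appear.
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

# Assumption: an in-memory buffer stands in for output.csv.
buffer = io.StringIO(newline="")
# restval is written for any header key a row lacks; "N/A" is an arbitrary choice.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="N/A")
writer.writeheader()
writer.writerows(rows)

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```

Here the header comes out as name,age,city, and Bob's missing age is written as N/A rather than an empty field.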
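Question 2: Writing grouped aggregation results to CSV
You have already written the core group-and-sum logic, so all that remains is to combine the data from the four aggregation dictionaries and write it to a CSV file in the required format. Here are two ways to do it: the first extends your existing code, and the second is a more compact version.
Approach 1: Extending your existing code
Reuse the existing agg1 to agg4 dictionaries directly: iterate over the keys, unpack each tuple, and write one row per group: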
    import csv

    # Your existing code (with statements are recommended for safer file handling)
    agg1 = {}
    agg2 = {}
    agg3 = {}
    agg4 = {}

    with open("my_file.csv", "r", newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row; more reliable than testing row[0]
        for row in reader:
            key = (row[0], row[2])
            agg1[key] = agg1.get(key, 0) + float(row[4])
            agg2[key] = agg2.get(key, 0) + float(row[5])
            agg3[key] = agg3.get(key, 0) + float(row[6])
            agg4[key] = agg4.get(key, 0) + float(row[7])

    # Write the aggregated results
    output_header = ["item1", "item3", "agg1", "agg2", "agg3", "agg4"]
    with open("aggregated_output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(output_header)  # write the header row
        # Iterating over agg1 is enough: all four dictionaries share the same keys
        for key in agg1:
            item1_val, item3_val = key
            # Values are floats; wrap them in int(...) if you need integers
            row_data = [item1_val, item3_val, agg1[key], agg2[key], agg3[key], agg4[key]]
            writer.writerow(row_data)
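Approach 2: Compact version (merged aggregation logic)
The four aggregation dictionaries can be merged into one, where each key maps to a list of the four running sums. This makes the code more compact: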
    import csv

    # A single dictionary holds all results:
    # key is (item1, item3), value is [v1_sum, v2_sum, v3_sum, v4_sum]
    aggregations = {}

    with open("my_file.csv", "r", newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            key = (row[0], row[2])
            # Convert v1 to v4 (columns 4 to 7) into a list of floats
            current_values = list(map(float, row[4:8]))
            if key not in aggregations:
                aggregations[key] = current_values
            else:
                # Add the values position by position
                aggregations[key] = [a + b for a, b in zip(aggregations[key], current_values)]

    # Write the CSV
    output_header = ["item1", "item3", "agg1", "agg2", "agg3", "agg4"]
    with open("aggregated_output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(output_header)
        for (item1_val, item3_val), agg_vals in aggregations.items():
            writer.writerow([item1_val, item3_val] + agg_vals)
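As a variation on Approach 2, collections.defaultdict can remove the "is this key new?" branch. This is only a sketch under stated assumptions: the aggregate helper is a hypothetical name, the sample rows and the item2/item4 column names are invented to match the column indexes used above, and in-memory buffers stand in for my_file.csv and aggregated_output.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample input: group keys in columns 0 and 2, values in columns 4 to 7.
sample = io.StringIO(
    "item1,item2,item3,item4,v1,v2,v3,v4\n"
    "a,x,b,y,1,2,3,4\n"
    "a,x,b,z,10,20,30,40\n"
    "c,x,d,y,5,6,7,8\n"
)

def aggregate(rows):
    """Sum columns 4 to 7 for each (column 0, column 2) pair."""
    # Each new key starts with four zeroed running sums.
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for row in rows:
        sums = totals[(row[0], row[2])]
        for i, value in enumerate(row[4:8]):
            sums[i] += float(value)
    return dict(totals)

reader = csv.reader(sample)
next(reader)  # skip the header row
result = aggregate(reader)

# Assumption: an in-memory buffer stands in for aggregated_output.csv.
out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerow(["item1", "item3", "agg1", "agg2", "agg3", "agg4"])
for (item1, item3), sums in result.items():
    writer.writerow([item1, item3, *sums])

print(out.getvalue())
```

Because aggregate accepts any iterable of rows, it can be checked with small hand-written lists before being pointed at a real file.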
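Both approaches produce the output you expect. The compact version removes the repeated code and is easier to maintain.
The question comes from Stack Exchange and was asked by Omega.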




