如何将垂直存储的源表数据横向插入至目标表?
如何将垂直存储的记录转换为横向结构插入到目标表?
这是个典型的行转列(Pivot)需求,根据你的数据来源不同,我整理了两种实用的实现方案:
先明确你的输入与期望输出
原始垂直存储数据
FILE ID: 001 RECORD 1 1111 RECORD 2 123456789012345 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111 FILE ID: 002 RECORD 1 2222 RECORD 2 1234567 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111 FILE ID: 003 RECORD 1 3333 RECORD 2 1234567 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111
期望的横向目标表结构
FileID|Record1|Record2|Record3|Record4|Record5|Record6|Record7|Record8|Record9|Record10|Record11 -------------------------------------------------------------------------------------------------- 001 |1111 |123456789012345|A01 11|A02 11|A03 11|0103050|777 |A01 1|A02 1|A03 1111|A04 11111 002 |2222 |1234567|A01 11|A02 11|A03 11|0103050|777 |A01 1|A02 1|A03 1111|A04 11111 003 |3333 |1234567|A01 11|A02 11|A03 11|0103050|777 |A01 1|A02 1|A03 1111|A04 11111
方案一:用Python脚本预处理文本数据并生成SQL插入语句
如果你的原始数据是文本格式(比如txt文件),可以用Python快速解析并生成可直接执行的INSERT语句:
# 读取原始数据(如果是文件可以用open()读取,这里直接用示例文本) raw_data = """FILE ID: 001 RECORD 1 1111 RECORD 2 123456789012345 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111 FILE ID: 002 RECORD 1 2222 RECORD 2 1234567 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111 FILE ID: 003 RECORD 1 3333 RECORD 2 1234567 RECORD 3 A01 11 RECORD 4 A02 11 RECORD 5 A03 11 RECORD 6 0103050 RECORD 7 777 RECORD 8 A01 1 RECORD 9 A02 1 RECORD 10 A03 1111 RECORD 11 A04 11111""" # 拆分每条文件记录 file_lines = raw_data.split('\n') for line in file_lines: parts = line.split() # 提取FileID file_id = parts[2] record_map = {} current_idx = 3 # 遍历解析每个RECORD项 while current_idx < len(parts): record_num = parts[current_idx + 1] # 找到下一个RECORD的位置,确定当前RECORD的取值范围 next_record_pos = len(parts) for i in range(current_idx + 2, len(parts)): if parts[i] == 'RECORD': next_record_pos = i break # 拼接当前RECORD的值(处理像"A01 11"这种多部分的情况) record_value = ' '.join(parts[current_idx + 2 : next_record_pos]) record_map[f'Record{record_num}'] = record_value current_idx = next_record_pos # 生成INSERT语句,替换成你的目标表名 insert_stmt = f"""INSERT INTO your_target_table (FileID, Record1, Record2, Record3, Record4, Record5, Record6, Record7, Record8, Record9, Record10, Record11) VALUES ('{file_id}', '{record_map['Record1']}', '{record_map['Record2']}', '{record_map['Record3']}', '{record_map['Record4']}', '{record_map['Record5']}', '{record_map['Record6']}', '{record_map['Record7']}', '{record_map['Record8']}', '{record_map['Record9']}', '{record_map['Record10']}', '{record_map['Record11']}');""" print(insert_stmt) print('---')
运行这段代码后,会输出每条数据对应的INSERT语句,直接在数据库中执行这些语句就能完成插入。
方案二:用SQL行转列(Pivot)直接转换(如果原始数据已在数据库表中)
如果你的垂直数据已经存储在数据库的源表中(比如表名为source_records,结构为file_id VARCHAR, record_num INT, record_value VARCHAR),可以用SQL的行转列语法直接转换并插入。
针对SQL Server的PIVOT语法
INSERT INTO your_target_table (FileID, Record1, Record2, Record3, Record4, Record5, Record6, Record7, Record8, Record9, Record10, Record11) SELECT file_id AS FileID, [1] AS Record1, [2] AS Record2, [3] AS Record3, [4] AS Record4, [5] AS Record5, [6] AS Record6, [7] AS Record7, [8] AS Record8, [9] AS Record9, [10] AS Record10, [11] AS Record11 FROM ( SELECT file_id, record_num, record_value FROM source_records ) AS source_data PIVOT ( MAX(record_value) FOR record_num IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]) ) AS pivot_data;
针对MySQL的条件聚合实现(MySQL无原生PIVOT)
INSERT INTO your_target_table (FileID, Record1, Record2, Record3, Record4, Record5, Record6, Record7, Record8, Record9, Record10, Record11) SELECT file_id AS FileID, MAX(CASE WHEN record_num = 1 THEN record_value END) AS Record1, MAX(CASE WHEN record_num = 2 THEN record_value END) AS Record2, MAX(CASE WHEN record_num = 3 THEN record_value END) AS Record3, MAX(CASE WHEN record_num = 4 THEN record_value END) AS Record4, MAX(CASE WHEN record_num = 5 THEN record_value END) AS Record5, MAX(CASE WHEN record_num = 6 THEN record_value END) AS Record6, MAX(CASE WHEN record_num = 7 THEN record_value END) AS Record7, MAX(CASE WHEN record_num = 8 THEN record_value END) AS Record8, MAX(CASE WHEN record_num = 9 THEN record_value END) AS Record9, MAX(CASE WHEN record_num = 10 THEN record_value END) AS Record10, MAX(CASE WHEN record_num = 11 THEN record_value END) AS Record11 FROM source_records GROUP BY file_id;
内容的提问来源于stack exchange,提问作者gene




