在 Presto 上可以实现对 Icerberg 表的操作。LAS 默认配置了Presto和 Icerberg 的连接器。本文介绍在 Presto 中操作 Iceberg 表。
presto-cli --user hive --password cptatLnbyBQ5nPuP
详情参考 Presto官网。
CREATE TABLE iceberg.default.customer_orders ( order_id BIGINT, order_date DATE, account_number BIGINT, customer VARCHAR, country VARCHAR )WITH ( format = 'ORC', partitioning = ARRAY['bucket(account_number, 10)','country'] )
INSERT INTO iceberg.default.customer_orders VALUES (2, CAST('2021-01-12' AS DATE), 670011, 'bcd', 'ca'); INSERT INTO iceberg.default.customer_orders VALUES (2, CAST('2021-01-12' AS DATE), 670012, 'bcde', 'cn');
查询语句示例:
SELECT * FROM iceberg.default.customer_orders;
输出示例:
order_id | order_date | account_number | customer | country ----------+------------+----------------+----------+--------- 1 | 2021-10-11 | 1002123 | abc | us 2 | 2021-01-12 | 670011 | bcd | ca 2 | 2022-02-11 | 9070011 | efg | cn 2 | 2022-01-01 | 560012 | zxc | us (4 rows)
增加筛选条件:
SELECT * FROM iceberg.default.customer_orders WHERE account_number = 670012;
EXPLAIN ANALYZE SELECT * SELECT iceberg.default.customer_orders WHERE account_number = 670012;
DESC customer_orders;
查看快照:
SELECT * FROM "customer_orders$snapshots"
输出示例:
committed_at | snapshot_id | parent_id | operation |---------------------------------------+---------------------+---------------------+-----------+------------------------------------------------------------------2022-01-13 13:34:00.657 Asia/Shanghai | 1942301704603642054 | NULL | append | hdfs://las-master-1-1:8020/warehouse/tablespace/managed/hive/custom2022-01-13 13:39:18.665 Asia/Shanghai | 2127546335909550653 | 1942301704603642054 | append | hdfs://las-master-1-1:8020/warehouse/tablespace/managed/hive/custom2022-01-17 19:56:28.807 Asia/Shanghai | 6215604771056148566 | 2127546335909550653 | append | hdfs://las-master-1-1:8020/warehouse/tablespace/managed/hive/custom2022-01-17 19:57:12.160 Asia/Shanghai | 7079051850206849569 | 6215604771056148566 | append | hdfs://las-master-1-1:8020/warehouse/tablespace/managed/hive/custom2022-01-17 19:57:55.779 Asia/Shanghai | 3727192433427494890 | 7079051850206849569 | append | hdfs://las-master-1-1:8020/warehouse/tablespace/managed/hive/custom (5 rows)
时间旅行查询:
SELECT count(*) FROM "customer_orders@6215604771056148566"; _col0 -------2 (1 row) SELECT count(*) FROM "customer_orders@3727192433427494890"; _col0 -------4 (1 row)
CALL iceberg.system.rollback_to_snapshot('default', 'customer_orders', 3727192433427494890);
SELECT * FROM "customer_orders$partitions";
partition | record_count | file_count | total_size |---------------------------------------+--------------+------------+------------+--------------------------------------------------------------------------------- {account_number_bucket=9, country=us} | 1 | 1 | 714 | {order_id={min=2, max=2, null_count=0}, order_date={min=2022-01-01, max=2022-01- {account_number_bucket=2, country=ca} | 1 | 1 | 711 | {order_id={min=2, max=2, null_count=0}, order_date={min=2021-01-12, max=2021-01- {account_number_bucket=1, country=us} | 1 | 1 | 711 | {order_id={min=1, max=1, null_count=0}, order_date={min=2021-10-11, max=2021-10- {account_number_bucket=5, country=cn} | 1 | 1 | 716 | {order_id={min=2, max=2, null_count=0}, order_date={min=2022-02-11, max=2022-02-