JOIN queries against StarRocks via ADBC Flight SQL return an Arrow table with empty column names
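When I query a StarRocks database through the ADBC Flight SQL driver, a single-table SELECT returns an Arrow table with the correct column names. As soon as the query contains a JOIN, however, every column name in the returned Arrow table is Null. Am I doing something wrong, or is this a bug? If it is a bug, is it in the Flight SQL driver or in StarRocks?
Working case
A single-table query returns a pyarrow table with the column name id, as expected: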
    import adbc_driver_manager
    from adbc_driver_flightsql import dbapi as flightsql

    # uri, user, and pw are defined elsewhere.
    with flightsql.connect(
        uri=uri,
        db_kwargs={
            adbc_driver_manager.DatabaseOptions.USERNAME.value: user,
            adbc_driver_manager.DatabaseOptions.PASSWORD.value: pw,
        },
    ) as conn:
        with conn.cursor() as cursor:
            cursor.execute("USE database;")
            cursor.execute("SELECT table.id FROM table;")
            arrow_table = cursor.fetchallarrow()
    >>> arrow_table
    pyarrow.Table
    id: int32 not null
    ----
    id: [[1,2,3],...,[97,98,99]]
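Failing case
As soon as the query contains a JOIN, every column name in the returned Arrow table is Null, even when the SELECT list contains no columns from the joined table. cursor.fetchallarrow() and cursor.fetch_arrow_table() give the same result: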
    with flightsql.connect(
        uri=uri,
        db_kwargs={
            adbc_driver_manager.DatabaseOptions.USERNAME.value: user,
            adbc_driver_manager.DatabaseOptions.PASSWORD.value: pw,
        },
    ) as conn:
        with conn.cursor() as cursor:
            cursor.execute("USE database;")
            cursor.execute("SELECT table1.id FROM table1 JOIN table2 ON table1.id = table2.id;")
            arrow_table = cursor.fetchallarrow()
            # arrow_table = cursor.fetch_arrow_table()
    >>> arrow_table
    pyarrow.Table
    : int32 not null
    ----
    : [[1,2,3],...,[97,98,99]]
When I then call pl.from_arrow(arrow_table), Polars replaces the empty column names with generic ones (column_01, column_02, and so on).
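Until the root cause is known, one possible client-side stopgap is to put the names back before handing the table to Polars. The following is only a minimal sketch: restore_column_names is a hypothetical helper (not part of ADBC, pyarrow, or Polars), and it assumes the caller already knows the expected names from the SELECT list, in order. It relies only on the column_names attribute and the rename_columns() method of pyarrow.Table.

```python
from typing import Any, Sequence


def restore_column_names(table: Any, expected: Sequence[str]) -> Any:
    """Return `table` with empty or missing column names filled in.

    `table` should behave like a pyarrow.Table, i.e. expose
    `column_names` and `rename_columns()`. `expected` holds the names
    from the SELECT list, in order (assumption: the caller knows them).
    """
    current = list(table.column_names)
    if len(current) != len(expected):
        raise ValueError(
            f"expected {len(expected)} names, table has {len(current)} columns"
        )
    # Keep any name the server did return; fill in only the blank ones.
    fixed = [name if name else fallback for name, fallback in zip(current, expected)]
    return table.rename_columns(fixed)
```

For the failing query above, this would be called as restore_column_names(cursor.fetchallarrow(), ["id"]) before pl.from_arrow(...). It only hides the symptom and does not explain why the names go missing.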
Versions
- Python 3.11.2
- adbc-driver-flightsql 1.8.0
- adbc-driver-manager 1.8.0
- StarRocks 3.5.5
This question originally appeared on Stack Exchange; asked by usdn.




