Symptom:
A Hive2ByteHouse CE offline sync task fails with the error: DB::Exception: Column day specified more than once SQLSTATE: 42701. The detailed error log is as follows:
at java.lang.Thread.run(Thread.java:748) - java.sql.BatchUpdateException: Code: 15, e.displayText() = DB::Exception: Column day specified more than once SQLSTATE: 42701 (version 21.9.1) , server ClickHouseNode [uri=http://7442209169657188635.bytehouse-ce.ivolces.com:8123/acme_db, options={custom_settings=custom_gw_force_ck_node=9.0.0.8}]@1089714795 at com.clickhouse.jdbc.SqlExceptionUtils.batchUpdateError(SqlExceptionUtils.java:107) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:154) at com.clickhouse.jdbc.internal.AbstractPreparedStatement.executeLargeBatch(AbstractPreparedStatement.java:85) at com.clickhouse.jdbc.internal.ClickHouseStatementImpl.executeBatch(ClickHouseStatementImpl.java:754) at com.bytedance.bitsail.bytehouse.sink.record.AppendRecordsWriter.flush(AppendRecordsWriter.java:69) at com.bytedance.bitsail.bytehouse.sink.record.DelegatePartitionRecordsWriter.flush(DelegatePartiti
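Resolution:
Cause: The offline sync task writes to a ByteHouse CE table with day specified as a static partition field, while the table field mapping also contains a day field. The duplicated field causes the sync task to fail.
Solution: Remove the day field mapping from the sync task's field mapping.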
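The duplicate can be caught before the task is submitted. The following is a minimal sketch, assuming the static partition fields and the mapped target columns are available as plain lists; the function name and the sample column names other than day are hypothetical and not part of the product.

```python
def find_duplicate_columns(partition_columns, mapped_columns):
    """Return the partition columns that also appear in the field mapping."""
    mapped = set(mapped_columns)
    return [column for column in partition_columns if column in mapped]


# Hypothetical task configuration: day is both a static partition field
# and a mapped column, which mirrors the failing case described above.
duplicates = find_duplicate_columns(["day"], ["id", "name", "day"])
print(duplicates)
```

Any column the check returns should be removed from the field mapping.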
Symptom:
A LAS2Kafka offline channel task fails, and the TaskManager log reports the keyword NOT_ENOUGH_REPLICAS. The detailed error log is as follows:
TaskManager error log: Got error produce response with correlation id 6 on topic-partition dedao_log_backtrace-22, retrying (9 attempts left). Error: NOT_ENOUGH_REPLICAS
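Resolution:
Cause: The replica count in the Kafka cluster instance is abnormal: the partition has fewer in-sync replicas than min.insync.replicas requires.
Solution: Adjust the Kafka cluster's replica parameters and set min.insync.replicas to 1. Taking a Volcano Engine Kafka cluster as an example, you can change the minimum number of in-sync replicas to 1 in the configuration of the corresponding topic. For details, see the Kafka documentation on modifying parameter configurations.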
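The rule behind this error can be stated in a few lines. The sketch below is only a model of the broker-side check, assuming the task's producer uses acks=all (the source does not state the producer settings); the function name is hypothetical.

```python
def produce_accepted(in_sync_replicas: int, min_insync_replicas: int, acks: str = "all") -> bool:
    """Model of the check that yields NOT_ENOUGH_REPLICAS.

    min.insync.replicas is only enforced for acks=all (also written as -1).
    """
    if acks not in ("all", "-1"):
        return True
    return in_sync_replicas >= min_insync_replicas


# One in-sync replica left while the topic requires two: the write is rejected.
rejected_case = produce_accepted(in_sync_replicas=1, min_insync_replicas=2)
# After setting min.insync.replicas to 1, the same write is accepted.
accepted_case = produce_accepted(in_sync_replicas=1, min_insync_replicas=1)
print(rejected_case, accepted_case)
```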
Symptom:
A Kafka2MySQL streaming integration task reports the following in its error log: Duplicate entry 'oXXXXX-V45Y' for key 'xxxx_union_id_IDX'.
Keyword: Duplicate entry. The detailed error log is as follows:
Caused by: org.apache.flink.util.SerializedThrowable: Failed to insert record: 7,oXkOs6CBY8l2b8t6RidihNQ-V45Y,,,5,,,7881299552022540ErrMsg: State = [23000], ErrorCode = [1062], Message = [Duplicate entry 'oXkOs6CBY8l2b8t6RidihNQ-V45Y' for key 'wework_contact_intention_union_id_IDX'] at com.bytedance.dts.batch.jdbc.JDBCOutputFormat.doSingleInsert(JDBCOutputFormat.java:728) ~[dts-jdbc-sink.jar:?] at com.bytedance.dts.batch.jdbc.JDBCOutputFormat.flush(JDBCOutputFormat.java:639) ~[dts-jdbc-sink.jar:?] at com.bytedance.dts.batch.jdbc.JDBCOutputFormat.flushRecordsBufferInternal(JDBCOutputFormat.java:894) ~[dts-jdbc-sink.jar:?] at com.bytedance.dts.core.legacy.connector.OutputFormatPlugin.snapshotState(OutputFormatPlugin.java:477) ~[dts-streaming-core.jar:?] at com.bytedance.dts.core.legacy.connector.OutputFormatSinkFunction.snapshotState(OutputFormatSinkFunction.java:48) ~[dts-streaming-core.jar:?] at org.apache.flink.streaming.util.functions.StreamingFunctionU
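Resolution:
Cause: When a MySQL table has multiple unique indexes, a single row can carry two unique keys. Between the sharded tables and MySQL there may be two duplicate rows, in which case modifying one of them conflicts with the primary key of the other; if there is only one row, a direct update succeeds. The root cause is that the task's field mapping uses conversion mode, and the order of the converted fields does not match the column order of the target table. As a result, when data is written to MySQL, the primary key of one record is empty, which then produces a key conflict with other records.
Solution: Adjust the order of the fields produced by conversion mode so that it matches the column order of the target table, which eliminates the data conflict.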
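The effect of a field-order mismatch can be illustrated with a small sketch. It assumes that values are written to the target table by position, which is an interpretation of the cause described above rather than a documented behavior; the function name, field names, and sample values are hypothetical.

```python
def align_record(values, converted_fields, target_columns):
    """Reorder a record so its values follow the target table's column order."""
    by_name = dict(zip(converted_fields, values))
    # A KeyError here means a target column has no converted field at all.
    return [by_name[column] for column in target_columns]


# Converted fields arrive as (union_id, id, name), but the table expects (id, union_id, name).
converted_fields = ["union_id", "id", "name"]
target_columns = ["id", "union_id", "name"]
aligned = align_record(["oXXXXX-V45Y", 7, "demo"], converted_fields, target_columns)
print(aligned)
```

In the task itself the equivalent fix is made in the field mapping configuration, not in code.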
Symptom:
A MySQL2ByteHouse CE offline channel task fails with the error: Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Public Key Retrieval is not allowed
Keyword: Public Key Retrieval is not allowed. The detailed error log is as follows:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Public Key Retrieval is not allowed at sun.reflect.GeneratedConstructorAccessor34.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:403) at com.mysql.jdbc.Util.getInstance(Util.java:386) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:878) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:874) at com.mysql.jdbc.MysqlIO.proceedHandshakeWithPluggableAuthentication(MysqlIO.java:1770) at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1217) at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2189) at com.mysql.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:2036)
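Resolution:
The user name entered in the data source configuration has only read-only permission in the database and cannot retrieve the public key, so the read fails. Grant this user read and write permissions on the relevant databases and tables, after which the task runs normally.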
Symptom:
A TIDB2ByteHouse CE offline channel task fails with the error: Your query has been cancelled due to exceeding the allowed memory limit for the tidb-server instance and this query is currently using the most memory. The detailed error log is as follows:
java.sql.SQLException: IndexMergeTableScanWorker: Your query has been cancelled due to exceeding the allowed memory limit for the tidb-server instance and this query is currently using the most memory. Please try narrowing your query scope or increase the tidb_server_memory_limit and try again.[conn=5629299423096865299] at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3933) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3869) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2524) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2675) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2439) ~[dts-mysql5-core.jar:?] at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1365) ~[dts-mysql5-core.jar:?] at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) ~[dts-jdbc-source.jar:?] at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) ~[dts-jdbc-source.jar:?] at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) ~[dts-jdbc-source.jar:?] at com.bytedance.dts.batch.jdbc.split.SplitOneShardCallable.getMaxOrMinPrimaryKey(SplitOneShardCallable.java:315) ~[dts-jdbc-source.jar:?] at com.bytedance.dts.batch.jdbc.split.SplitOneShardCallable.call(SplitOneShardCallable.java:174) [dts-jdbc-source.jar:?] at com.bytedance.dts.batch.jdbc.split.SplitOneShardCallable.call(SplitOneShardCallable.java:61) [dts-jdbc-source.jar:?] 
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] com.bytedance.dts.batch.jdbc.split.AccurateSplitOneShardCallable JDBC fetch range info pool [] - [Fetch range query. Shard: (shard num: 0 10.53.4.207:6033 jdbc:mysql://address=(protocol=tcp)(host=xx.xx.x.xxx)(port=6033)/platform_finance?useUnicode=true&connectTimeout=10000&allowMultiQueries=true&zeroDateTimeBehavior=convertToNull&characterEncoding=utf-8&autoReconnect=true&useSSL=false) Table: `t_xxx_detail`] Execute SELECT max(`id`) FROM (SELECT `id` FROM %s WHERE ((create_time >= '2024-12-14 00:00:00' and create_time <= '2024-12-14 23:59:59') or (update_time >= '2024-12-14 00:00:00' and update_time <= '2024-12-14 23:59:59')) AND `id`> ? ORDER BY `id` LIMIT 50000) t , till now, sleep times: 5, sleep interval: 1000ms. Fetch query times 18, current primary key is 2038033 and max primary key is 262474296, MySQL range split progress 0.8%
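Resolution:
Cause: When the offline task computes splits based on the configured split key, the data volume is so large that memory runs out (the log shows the query exceeding the memory limit of the tidb-server instance), and the task fails.
Solution: Remove the split key from the task configuration and wait for the task to finish, or allocate more memory to the machine (the log suggests increasing tidb_server_memory_limit).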
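The range query visible in the log (SELECT max(`id`) FROM (... `id` > ? ORDER BY `id` LIMIT 50000) t) suggests that split boundaries are fetched one chunk at a time. The sketch below reproduces that pattern against an in-memory SQLite table purely for illustration; the table name, chunk size, and function name are assumptions, and the actual splitter may work differently.

```python
import sqlite3


def fetch_split_boundaries(conn, table, chunk_size):
    """Return the upper primary-key bound of each chunk, one query per chunk."""
    boundaries = []
    last_id = -1  # assumes ids are non-negative integers
    while True:
        row = conn.execute(
            f"SELECT max(id) FROM (SELECT id FROM {table} "
            "WHERE id > ? ORDER BY id LIMIT ?) t",
            (last_id, chunk_size),
        ).fetchone()
        if row[0] is None:  # no rows beyond last_id: splitting is finished
            break
        last_id = row[0]
        boundaries.append(last_id)
    return boundaries


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_demo (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t_demo (id) VALUES (?)", [(i,) for i in range(1, 11)])
boundaries = fetch_split_boundaries(conn, "t_demo", 4)
print(boundaries)
```

Each boundary costs one filtered, ordered query, so a large table produces many such queries before any data is read. Removing the split key avoids this phase.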
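Symptom:
A TOS2MongoDB offline channel task syncs CSV file data. When it reads a subset of the columns, the log reports: Description:[It is an illegal text content]. java.lang.ArrayIndexOutOfBoundsException: 1.
Keyword: ArrayIndexOutOfBoundsException. The detailed log is as follows: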
2024-11-11 19:47:50,073 INFO com.bytedance.bitsail.base.messenger.checker.DirtyRecordChecker flink-akka.actor.default-dispatcher-2 [] - Group READER found 453 dirty records, threshold 0. They are: Row: [{"byteSize":144,"rawData":"\"{\"\"source\"\": \"\"实时热点榜单\"\", \"\"title\"\": \"\"乌称xxxxx\"\", \"\"date\"\": \"\"2024-11-05\"\", \"\"hour\"\": \"\"14\"\", \"\"id\"\": \"\"1\"\", \"\"created_at\"\": 1730966226}\""},null,null,null]. message: [Code:[DtsParser-07], Description:[It is an illegal text content]. - value: {"source": "实时热点榜单", "title": "乌称xxxxx", "date": "2024-11-05", "hour": "14", "id": "1", "created_at": 1730966226} - java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:79) at com.bytedance.dts.batch.file.parser.CsvBytesParser.parse(CsvBytesParser.java:128) at com.bytedance.dts.batch.file.parser.CsvBytesParser.parse(CsvBytesParser.java:104) at com.bytedance.dts.batch.common.row.TextRowBuilder.build(TextRowBuilder.java:51) at com.bytedance.bitsail.component.format.api.RowBuilder.build(RowBuilder.java:61) at com.bytedance.dts.batch.file.hadoop.mapred.HadoopInputFormat.buildRow(HadoopInputFormat.java:179) at com.bytedance.dts.core.legacy.connector.InputFormatPlugin.nextRecord(InputFormatPlugin.java:334) at com.bytedance.dts.core.legacy.connector.InputFormatPlugin.nextRecord(InputFormatPlugin.java:67) at com.bytedance.dts.core.legacy.connector.InputFormatSourceFunction.lambda$run$60(InputFormatSourceFunction.java:120) at com.bytedance.bitsail.component.format.security.kerberos.module.HadoopSecurityModule.doAs(HadoopSecurityModule.java:133) at com.bytedance.dts.core.security.kerberos.HadoopKerberosSecured.doAs(HadoopKerberosSecured.java:204) at com.bytedance.dts.core.legacy.connector.InputFormatSourceFunction.run(InputFormatSourceFunction.java:102) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:207) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:122) at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:217)
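Resolution:
Cause: The source file read by the task is in JSON format, but CSV was selected as the data type in the task configuration, which causes the task to fail.
Solution: When selecting the data type for the TOS reader, select the correct JSON format, or convert the source files under the TOS path to CSV files.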
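The failure can be reproduced outside the task. The sketch below uses a hypothetical sample line shaped like the rawData in the log (a single quoted field holding a JSON object) and Python's standard csv and json modules, not the task's own parser.

```python
import csv
import json

# Hypothetical sample line shaped like the rawData in the log:
# one CSV-quoted field whose content is a JSON object.
raw_line = '"{""source"": ""hot-list"", ""hour"": ""14"", ""id"": ""1"", ""created_at"": 1730966226}"'

# Parsed as CSV, the whole JSON object ends up in a single column.
row = next(csv.reader([raw_line]))
column_count = len(row)

# Reading a second column (index 1) fails, analogous to
# java.lang.ArrayIndexOutOfBoundsException: 1 in the task log.
try:
    row[1]
    second_column_readable = True
except IndexError:
    second_column_readable = False

# Parsed as JSON, every key is available by name.
record = json.loads(row[0])
print(column_count, second_column_readable, record["id"])
```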