数栈数据质量模块,接入客户的数据源后,一执行就报错。报错信息:
{"logInfo": {{"jobid":"1a4ebbbd","msg_info":"2021-12-20 14:23:54:submit job is success"}} "engineLog": {com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet successfully received from the server was 43 milliseconds ago. The last packet sent successfully to the server was 70 milliseconds ago.}
通常这类错误是数据库连接空闲时间超过wait_timeout,连接被服务端回收。但是这个场景下,新建立的连接立刻就中断了。
产品功能运行几秒钟就会报错。
错误日志里有大量报错
2021-12-20T01:43:32.319449Z 102 [Note] Aborted connection 102 to db: 'idatatesta' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:43:32.495966Z 103 [Note] Aborted connection 103 to db: 'idatatestb' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:43:32.898337Z 104 [Note] Aborted connection 104 to db: 'idatatesta' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:43:32.993156Z 105 [Note] Aborted connection 105 to db: 'idatatestb' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:47:09.177221Z 108 [Note] Aborted connection 108 to db: 'idatatesta' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:47:09.304512Z 109 [Note] Aborted connection 109 to db: 'idatatesta' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
2021-12-20T01:47:42.890847Z 110 [Note] Aborted connection 110 to db: 'idatatesta' user: 'acc' host: 'BF-202111221553' (Got an error reading communication packets)
使用这个数据源同步数据没有遇到问题。研发把数据质量的业务代码单独拉出来,做了一个测试案例,一执行就会报错。
import java.sql.*;
public class Test {
public static void main(String[] args) {
try {
test();
} catch (SQLException throwables) {
throwables.printStackTrace();
}
}
public static void test() throws SQLException {
String jdbcUrl = "jdbc:mysql://host:port/idatatesta?useUnicode=true&queryTimeout=2400&characterEncoding=utf-8&tinyInt1isBit=false&useSSL=false&serverTimezone=Asia/Shanghai&autoReconnect=true&failOverReadOnly=false&allowMultiQueries=true";
String username = "";
String password = "";
String processSql = "CREATE PROCEDURE xxxaabb() BEGIN\n" +
" DROP TABLE\n" +
" IF\n" +
" EXISTS idatatesta.temp_data_24c92b71;\n" +
" set @val_0 =(\n" +
" SELECT\n" +
" count( 1 ) AS null_count \n" +
" FROM\n" +
" `idatatesta`.`user` \n" +
" WHERE\n" +
" NAME IS NULL \n" +
" );\n" +
" CREATE TABLE\n" +
" IF\n" +
" NOT EXISTS idatatesta.temp_data_24c92b71 (\n" +
" `tenant_id` INT ( 11 ) NOT NULL DEFAULT '0',\n" +
" `monitor_id` INT ( 11 ) NOT NULL,\n" +
" `rule_id` INT ( 11 ) NOT NULL,\n" +
" `record_id` INT ( 11 ) NOT NULL,\n" +
" `val` VARCHAR ( 255 ) \n" +
" ) ENGINE = INNODB DEFAULT CHARSET = utf8;\n" +
" insert INTO idatatesta.temp_data_24c92b71 ( tenant_id, monitor_id, rule_id, record_id, val )\n" +
" VALUES\n" +
" ( 57, 507, 1623, 13819, @val_0 );\n" +
" drop TABLE\n" +
" IF\n" +
" EXISTS `idatatesta`.`dq_monitor_24c92b71`;\n" +
" CREATE TABLE `idatatesta`.`dq_monitor_24c92b71` (\n" +
" `id` INT ( 11 ) UNSIGNED NOT NULL COMMENT '编号',\n" +
" `name` VARCHAR ( 255 ) DEFAULT NULL COMMENT '姓名',\n" +
" `create_date` datetime DEFAULT NULL COMMENT '创建时间',\n" +
" `update_date` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP COMMENT '修改>时间',\n" +
" `sex` VARCHAR ( 4 ) DEFAULT NULL COMMENT '0:女,1:男' \n" +
" ) ENGINE = INNODB DEFAULT CHARSET = utf8 COMMENT = '用户表';\n" +
" ALTER TABLE `idatatesta`.`dq_monitor_24c92b71` ADD `dq_job_id` VARCHAR ( 100 );\n" +
" ALTER TABLE `idatatesta`.`dq_monitor_24c92b71` ADD `dq_rule_id` INT;\n" +
" ALTER TABLE `idatatesta`.`dq_monitor_24c92b71` ADD cyc_time datetime;\n" +
" ALTER TABLE `idatatesta`.`dq_monitor_24c92b71` ADD check_column VARCHAR ( 100 );\n" +
" INSERT INTO `idatatesta`.`dq_monitor_24c92b71` SELECT\n" +
" *,\n" +
" '24c92b71',\n" +
" 1623,\n" +
" '2021-12-24 13:52:26',\n" +
" 'name' \n" +
" FROM\n" +
" `idatatesta`.`user` \n" +
" WHERE\n" +
" NAME IS NULL;\n" +
"end;";
Connection connection = DriverManager.getConnection(jdbcUrl, username, password);
Statement statement = connection.createStatement();
System.out.println("XXX");
System.out.println(processSql);
statement.execute(processSql);
System.out.println("YYY");
//调用存储过程
String procCall = "call xxxaabb()";
CallableStatement callableStatement = connection.prepareCall(procCall);
System.out.println("ZZZ");
callableStatement.execute();
System.out.println("AAA");
}
}
测试代码很简单,就是创建一个存储过程,然后再调用存储过程。
代码在创建存储过程的时候就会抛异常。
使用最新版本的jdbc驱动测试,一样会报同样的错。
将创建存储过程的SQL单独复制出来,使用mysql客户端连接,创建存储过程,也会报错。
ERROR 2013 (HY000): Lost connection to MySQL server during query
用最简单的SQL测试,同样会报错。只要sql语句超过一定长度,就能复现问题。
mysql> select 'VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV' as c;
ERROR 2013 (HY000): Lost connection to MySQL server during query
客户的mysql安装在windows上,版本是5.7.35。我们在windows服务器上搭建一个类似的环境做测试,没有复现问题,排查mysql bug。
下载mysql windows软件包、
解压
初始化数据库
启动mysqld
修改root默认密码
mysqld --initialize
mysqld
默认密码在datadir下的错误日志中:
2021-12-30T07:07:58.587500Z 0 [Warning] CA certificate ca.pem is self signed.
2021-12-30T07:07:59.837500Z 1 [Note] A temporary password is generated for root@localhost: s.rntBi)D3(b
安装完成后测试,并没有复现问题。
客户一直说他的环境没有问题。我们登录客户环境,使用mysql客户端连接数据库,使用同样的SQL复现了问题。
客户之前使用navicat,没有复现问题,但是使用mysql命令行客户端很快就复现了问题。
客户环境下,使用127.0.0.1访问数据库,执行sql不会报错。
跟客户了解,使用了花生壳做内网穿透。客户基本也认可了问题出在花生壳上。
我们在自己的服务器上也安装了花生壳,尝试重现问题。然而并没有问题。
这里的映射类型是HTTP,
建议客户将映射类型改成TCP。修改后再测试,没有再出现执行长SQL就会连接断开的现象。
这是一个奇葩的问题。
我们的思路是
复现问题
用最简单的场景复现问题
对比差异(参数、网络环境、软件版本等)
定位和解决问题