我们遇到了问题使用 RDS Mysql 的 Amazon 托管 Play 应用程序 http://www.mchange.com/projects/c3p0/#maxAdministrativeTaskTime.
有时应用程序最终会无响应。今天我使用 jstack -F 检查了线程状态,发现有多个线程卡在 c3p0 连接的本机代码中:
Thread 14060: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=150 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=121 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.fill(int) @bci=262, line=114 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) @bci=176, line=161 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.read(byte[], int, int) @bci=48, line=189 (Compiled frame)
- com.mysql.jdbc.MysqlIO.readFully(java.io.InputStream, byte[], int, int) @bci=34, line=3036 (Compiled frame)
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer, int) @bci=23, line=3489 (Compiled frame)
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer) @bci=3, line=3478 (Compiled frame)
- com.mysql.jdbc.MysqlIO.checkErrorPacket(int) @bci=12, line=4019 (Compiled frame)
- com.mysql.jdbc.MysqlIO.sendCommand(int, java.lang.String, com.mysql.jdbc.Buffer, boolean, java.lang.String, int) @bci=428, line=2490 (Compiled frame)
- com.mysql.jdbc.MysqlIO.sqlQueryDirect(com.mysql.jdbc.StatementImpl, java.lang.String, java.lang.String, com.mysql.jdbc.Buffer, int, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=559, line=2651 (Compiled frame)
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[], boolean) @bci=130, line=2677 (Compiled frame)
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=17, line=2627 (Compiled frame)
- com.mysql.jdbc.StatementImpl.executeQuery(java.lang.String) @bci=534, line=1556 (Compiled frame)
- com.mysql.jdbc.DatabaseMetaData$9.forEach(java.lang.Object) @bci=174, line=5013 (Compiled frame)
- com.mysql.jdbc.IterateBlock.doForAll() @bci=25, line=51 (Compiled frame)
- com.mysql.jdbc.DatabaseMetaData.getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) @bci=280, line=4991 (Compiled frame)
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(java.sql.Connection, java.lang.Throwable[]) @bci=21, line=185 (Compiled frame)
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(java.sql.Connection, java.lang.String, java.lang.Throwable[]) @bci=7, line=62 (Compiled frame)
- com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(java.sql.Connection) @bci=4, line=67 (Compiled frame)
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(java.lang.Object) @bci=122, line=368 (Compiled frame)
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(java.lang.Object) @bci=26, line=310 (Compiled frame)
- com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run() @bci=35, line=1999 (Interpreted frame)
- com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run() @bci=41, line=255 (Interpreted frame)
如果没有设置其他连接测试,DefaultConnectionTester.activeCheckConnectionNoQuery 似乎会检查 DatabaseMetaData.getTables。
C3p0手册有说明配置正确的连接测试 http://www.mchange.com/projects/c3p0/#configuring_connection_testing and 这个答案 https://stackoverflow.com/a/10357595/28482有 Play! 的 c3p0 设置说明但我想知道:
- Is there some way to check out what's the issue behing this?
- 我之前遇到过的标准“Mysql 连接在 8 小时后被终止”结果是异常
- There seems to be some default connection testing with c3p0/Play/Hibernate since this is done?
- I have a hunch that by default c3p0 doesn't do connection testing
- 默认值idleConnectionTestPeriod = 0、testConnectionOnCheckin = false、testConnectionOnCheckout = false
- Is there a way to set timeout for connection testing query?
- 例如,即使我配置了一些合理的 PreferredTestQuery,该查询是否也存在因当前 getTables() 调用而挂起的风险?
- 还有什么我应该考虑的吗?
使用自动测试表更新(不是解决方案)
我们设置 c3p0.automaticTestTable 以进行更合理的连接测试,但这似乎并没有解决问题,只是稍微更改了线程的堆栈跟踪。
在类似的情况下,我们现在的线程堆栈如下:
"Task-Thread-for-com.mchange.v2.async.ThreadPerTaskAsynchronousRunner@6064cd08" daemon prio=10 tid=0x00007f53d403a000 nid=0x63dd runnable [0x00007f53dce4a000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
- locked <0x00000000dbecd0d8> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3036)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677)
- locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2627)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1556)
- locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection)
- locked <0x00000000d1330078> (a com.mysql.jdbc.StatementImpl)
at com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:73)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:374)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310)
at com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999)
at com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run(ThreadPerTaskAsynchronousRunner.java:255)
播放线程正在等待获取数据库连接:
"play-thread-3" prio=10 tid=0x00007f53c400b000 nid=0x5b3e in Object.wait() [0x00007f53ddd57000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:579)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
... (näitä vaihtelevat määrät)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
at org.hibernate.ejb.connection.InjectedDataSourceConnectionProvider.getConnection(InjectedDataSourceConnectionProvider.java:71)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446)
at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167)
at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:160)
at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:81)
at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1473)
at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:60)
at play.db.jpa.JPAPlugin.startTx(JPAPlugin.java:377)
at play.db.jpa.JPAPlugin.beforeInvocation(JPAPlugin.java:345)
at play.plugins.PluginCollection.beforeInvocation(PluginCollection.java:473)
at play.Invoker$Invocation.before(Invoker.java:217)
at play.Invoker$Invocation.run(Invoker.java:277)
at play.server.PlayHandler$NettyInvocation.run(PlayHandler.java:229)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
一些注意事项:
- 有可能(至少在理论上)许多连接恰好在读取堆栈转储的同时被测试,尽管我发现这不太可能。
- In another question https://stackoverflow.com/questions/9817681/a-thread-was-hanged-on-java-net-socketinputstream-socketread0native-method-an it was mentioned that data deadlocks or hardware/network problems might cause this kind of behaviour.
br、图子
回答我自己:
- No reason found for the connection hanging :(
- RDS 有 MySQL 的 error.log,不幸的是那里什么也没有
- Play has some default settings for c3p0 ComboPooledDataSource (in DBPlugin.onApplicationStart() )
- idleTimeExcessConnections :来自 application.conf db.pool.maxIdleTimeExcessConnections (默认 0)
- checkoutTimeout :来自 application.conf db.pool.timeout (默认 5000)
- 空闲连接测试周期:10
- setTestConnectionOnCheckin : true
- 另外conf/c3p0.properties可以用来设置自定义属性
- 尚未找到设置连接测试超时的方法(简短说明)默认连接测试器 http://www.mchange.com/projects/c3p0/#maxAdministrativeTaskTime源代码查找可能不支持)
我们继续设置c3p0.automaticTestTable,看看后面会不会有问题。
更新:不幸的是我们还没有摆脱问题:(同样使用automaticTestTable连接测试似乎陷入了java.net.SocketInputStream.socketRead0(Native Method)
再次回答自己:
我们最终尝试了使用 c3p0:s 的“设置连接测试查询超时”方法最大管理任务时间 http://www.mchange.com/projects/c3p0/#maxAdministrativeTaskTime。这有点麻烦,所以需要看看这是否可以成为最终的解决方案(如果有效)。
到目前为止还没有问题,我们将看看它是如何工作的。日志中也没有 c3p0 错误/堆栈跟踪,我希望日志中显示此中断?要么尚未发生超时,要么我们应该调整日志记录配置以显示 c3p0 日志。
更新(2013-04-04):此后问题(可能)不会经常发生,但时不时会发生:(