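Redis Connection Pool Management
Background
This all started when a service deployed in production reported the Redis error JedisConnectionException: Unexpected end of stream.
The service consumes messages and records access statistics with the EHINCRBY command, and it occasionally failed with the error below. The business had not officially launched yet, so the write QPS was very low (one write every few minutes), and the Redis cluster was dedicated to this service, so a performance bottleneck was unlikely. That made the error puzzling.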
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:202) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.Protocol.process(Protocol.java:154) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.Protocol.read(Protocol.java:219) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:309) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.Connection.getIntegerReply(Connection.java:260) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
at redis.clients.jedis.Jedis.ehincrBy(Jedis.java:955) ~[jedis-xmly-ext-3.1.1-SNAPSHOT.jar!/:na]
...
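Investigation
- Checking the server side
Searching Google for this exception turned up several pages that all pointed to the following causes.
- Buffer full
  - The request volume was tiny, so the buffer was never full
- Unreasonable server timeout setting
  - The timeout was set to 75s
- Unstable network

So I asked the Redis server side to investigate. The server showed that every EHINCRBY command had completed normally, so I turned to the client connection instead.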
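- Checking the client connection
This exception usually appears when a connection has failed. Given how few requests there were, the likely explanation was that the Redis server had closed an idle connection while the client was still using it. The relevant pool settings are the idle-resource detection properties:
- maxIdle / minIdle: the maximum and minimum number of idle connections. If maxIdle is 0, no idle connections are kept; every request creates a connection and destroys it when finished.
- testWhileIdle: when true, idle connections are validated during the periodic check.
- timeBetweenEvictionRunsMillis: the interval between idle-resource checks, in milliseconds. Default: -1, meaning no check is run. Set it to a reasonable value so that the check runs periodically.
- softMinEvictableIdleTimeMillis: the soft minimum idle time for a resource in the pool, in milliseconds. A resource is removed once it has been idle this long and the number of idle connections is greater than minIdle (soft eviction).
- minEvictableIdleTimeMillis: the minimum idle time for a resource in the pool, in milliseconds. Default: 30 minutes (1000L * 60L * 30L). An idle resource is removed once it reaches this value; tune it to the needs of the business.
- numTestsPerEvictionRun: the number of resources sampled in each idle check. Default: 3. Adjust it to the number of connections the application uses; -1 means every idle connection is checked.
The root cause turned out to be that the project configured timeBetweenEvictionRunsMillis as -1 with maxIdle at 20. Idle connections were allowed to exist but were never cleaned up, so the client ended up sending commands over connections the server had already closed. Changing the parameter to 30s, shorter than the 75s server timeout, fixed the problem.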
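The failure mode can be reproduced without Redis. The following is a minimal sketch using only the JDK, under stated assumptions: a local fake server (the hypothetical `IdleTimeoutDemo` class, not part of the project) stands in for Redis and closes a client after a short idle timeout, with the 75s scaled down to milliseconds. The client's next read then returns -1, the end-of-stream condition that, to my understanding, `RedisInputStream.ensureFill` at the top of the stack trace reports as "Unexpected end of stream."

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class IdleTimeoutDemo {

    /** Accepts one client and closes it after idleTimeoutMillis without incoming data. */
    static void serveOnce(ServerSocket server, int idleTimeoutMillis) {
        Thread t = new Thread(() -> {
            try (Socket client = server.accept()) {
                client.setSoTimeout(idleTimeoutMillis);
                InputStream in = client.getInputStream();
                while (in.read() != -1) {
                    // each byte received restarts the idle wait
                }
            } catch (IOException idleOrClosed) {
                // a read timeout ends up here; try-with-resources has already closed the socket
            }
        }, "fake-redis");
        t.setDaemon(true);
        t.start();
    }

    /** Connects, stays idle for clientIdleMillis, then reads one byte and returns the result. */
    static int readAfterIdle(int serverTimeoutMillis, int clientIdleMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            serveOnce(server, serverTimeoutMillis);
            try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
                socket.setSoTimeout(5_000); // safety net so the demo cannot hang
                Thread.sleep(clientIdleMillis); // the pooled connection sits unused
                return socket.getInputStream().read();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The client idles longer than the server timeout, as in the incident.
        System.out.println("read() returned " + readAfterIdle(200, 400));
    }
}
```

The client has no way to notice the close while the connection sits in the pool; it only finds out on the next read, which is why a periodic check on idle connections is needed.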
How the Redis pool manages connections
Prompted by this problem, I went through how the Redis pool manages its connections.
- Creating the pool
    public GenericObjectPool(final PooledObjectFactory<T> factory,
            final GenericObjectPoolConfig<T> config) {

        super(config, ONAME_BASE, config.getJmxNamePrefix());

        if (factory == null) {
            jmxUnregister(); // tidy up
            throw new IllegalArgumentException("factory may not be null");
        }
        this.factory = factory;

        idleObjects = new LinkedBlockingDeque<>(config.getFairness());

        setConfig(config);
    }
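- Starting the check (our problem was that the Evictor was never started here)
This method is called through setConfig -> setTimeBetweenEvictionRunsMillis in the constructor from the pool-creation step, and it uses the timeBetweenEvictionRunsMillis setting to create the detector for stale connections.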
    final void startEvictor(final long delay) {
        synchronized (evictionLock) {
            if (evictor == null) { // Starting evictor for the first time or after a cancel
                if (delay > 0) {   // Starting new evictor
                    evictor = new Evictor();
                    EvictionTimer.schedule(evictor, delay, delay);
                }
            } else {  // Stop or restart of existing evictor
                if (delay > 0) { // Restart
                    synchronized (EvictionTimer.class) { // Ensure no cancel can happen between cancel / schedule calls
                        EvictionTimer.cancel(evictor, evictorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, true);
                        evictor = null;
                        evictionIterator = null;
                        evictor = new Evictor();
                        EvictionTimer.schedule(evictor, delay, delay);
                    }
                } else { // Stopping evictor
                    EvictionTimer.cancel(evictor, evictorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, false);
                }
            }
        }
    }
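The effect of the `delay > 0` guard can be illustrated with a small stand-alone model. This is only a sketch: `EvictorScheduleSketch` and its members are hypothetical names, and a plain `ScheduledExecutorService` stands in for the library's `EvictionTimer`. It shows that a non-positive delay such as -1 leaves nothing scheduled, which is the state our pool was in.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class EvictorScheduleSketch {
    // Single daemon thread standing in for the shared eviction timer (an assumption of this sketch).
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "evictor-sketch");
        t.setDaemon(true);
        return t;
    });
    private ScheduledFuture<?> evictor; // null means no sweep is scheduled

    /** Same rule as startEvictor: only a positive delay schedules the periodic sweep. */
    synchronized boolean startEvictor(long delayMillis, Runnable sweep) {
        if (evictor != null) {
            evictor.cancel(false); // stop the previous schedule before replacing or dropping it
            evictor = null;
        }
        if (delayMillis > 0) {
            evictor = timer.scheduleWithFixedDelay(sweep, delayMillis, delayMillis, TimeUnit.MILLISECONDS);
        }
        return evictor != null;
    }

    void shutdown() {
        timer.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        EvictorScheduleSketch pool = new EvictorScheduleSketch();
        AtomicInteger sweeps = new AtomicInteger();
        System.out.println("delay -1 scheduled: " + pool.startEvictor(-1, sweeps::incrementAndGet));
        System.out.println("delay 50 scheduled: " + pool.startEvictor(50, sweeps::incrementAndGet));
        Thread.sleep(300);
        System.out.println("sweep ran at least once: " + (sweeps.get() > 0));
        pool.shutdown();
    }
}
```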
- Periodic evictor runs
The run method of the Evictor:
    @Override
    public void run() {
        final ClassLoader savedClassLoader =
                Thread.currentThread().getContextClassLoader();
        try {
            if (factoryClassLoader != null) {
                // Set the class loader for the factory
                final ClassLoader cl = factoryClassLoader.get();
                if (cl == null) {
                    // The pool has been dereferenced and the class loader
                    // GC'd. Cancel this timer so the pool can be GC'd as
                    // well.
                    cancel();
                    return;
                }
                Thread.currentThread().setContextClassLoader(cl);
            }

            // Evict from the pool
            try {
                evict();
            } catch(final Exception e) {
                swallowException(e);
            } catch(final OutOfMemoryError oome) {
                // Log problem but give evictor thread a chance to continue
                // in case error is recoverable
                oome.printStackTrace(System.err);
            }
            // Re-create idle instances.
            try {
                ensureMinIdle();
            } catch (final Exception e) {
                swallowException(e);
            }
        } finally {
            // Restore the previous CCL
            Thread.currentThread().setContextClassLoader(savedClassLoader);
        }
    }
- The actual check
evict() is implemented by GenericObjectPool and calls EvictionPolicy.evict:
    public void evict() throws Exception {
        //...
        PooledObject<T> underTest = null;
        final EvictionPolicy<T> evictionPolicy = getEvictionPolicy();

        synchronized (evictionLock) {
            final EvictionConfig evictionConfig = new EvictionConfig(
                    getMinEvictableIdleTimeMillis(),
                    getSoftMinEvictableIdleTimeMillis(),
                    getMinIdle());

            final boolean testWhileIdle = getTestWhileIdle();

            for (int i = 0, m = getNumTests(); i < m; i++) {
                if (evictionIterator == null || !evictionIterator.hasNext()) {
                    evictionIterator = new EvictionIterator(idleObjects);
                }
                if (!evictionIterator.hasNext()) {
                    // Pool exhausted, nothing to do here
                    return;
                }

                try {
                    underTest = evictionIterator.next();
                } catch (final NoSuchElementException nsee) {
                    // Object was borrowed in another thread
                    // Don't count this as an eviction test so reduce i;
                    i--;
                    evictionIterator = null;
                    continue;
                }

                if (!underTest.startEvictionTest()) {
                    // Object was borrowed in another thread
                    // Don't count this as an eviction test so reduce i;
                    i--;
                    continue;
                }

                // User provided eviction policy could throw all sorts of
                // crazy exceptions. Protect against such an exception
                // killing the eviction thread.
                boolean evict;
                try {
                    evict = evictionPolicy.evict(evictionConfig, underTest,
                            idleObjects.size());
                } catch (final Throwable t) {
                    // Slightly convoluted as SwallowedExceptionListener
                    // uses Exception rather than Throwable
                    PoolUtils.checkRethrow(t);
                    swallowException(new Exception(t));
                    // Don't evict on error conditions
                    evict = false;
                }
                //...
    }
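The loop bound `getNumTests()` is not shown in the excerpt. As far as I understand the Commons Pool 2 behavior, it derives the sample size from numTestsPerEvictionRun roughly as in the sketch below; treat the formula as an assumption to verify against the library version in use. `NumTestsSketch` is a hypothetical stand-alone class, not library code.

```java
class NumTestsSketch {

    /**
     * Assumed sampling rule: a non-negative setting is capped by the number of
     * idle objects; a negative setting -n means "one n-th of the idle objects",
     * rounded up, so -1 covers all of them.
     */
    static int numTests(int numTestsPerEvictionRun, int idleCount) {
        if (numTestsPerEvictionRun >= 0) {
            return Math.min(numTestsPerEvictionRun, idleCount);
        }
        return (int) Math.ceil(idleCount / Math.abs((double) numTestsPerEvictionRun));
    }

    public static void main(String[] args) {
        System.out.println(numTests(3, 20));  // default of 3 with 20 idle connections
        System.out.println(numTests(-1, 20)); // -1 with 20 idle connections
    }
}
```

With the default of 3 and maxIdle at 20, a single run would look at only a few idle connections, so the check interval and the sample size together decide how quickly every idle connection gets visited.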
- The actual decision
As the code shows, a connection is cleaned up when either of the following holds:
- idle time > IdleSoftEvictTime and idleCount > MinIdle
- idle time > IdleEvictTime
    @Override
    public boolean evict(final EvictionConfig config, final PooledObject<T> underTest,
            final int idleCount) {

        if ((config.getIdleSoftEvictTime() < underTest.getIdleTimeMillis() &&
                config.getMinIdle() < idleCount) ||
                config.getIdleEvictTime() < underTest.getIdleTimeMillis()) {
            return true;
        }
        return false;
    }
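To make the two conditions concrete, here is the same predicate restated over plain numbers, with a few sample inputs. `EvictionRuleSketch` and its parameter names are hypothetical, and the thresholds in `main` (60s soft, 30 minutes hard, minIdle of 2) are illustrative values, not the project's settings.

```java
class EvictionRuleSketch {

    /** Restates the policy above: soft expiry needs spare idle objects, hard expiry does not. */
    static boolean shouldEvict(long idleMillis, int idleCount,
                               long softMinEvictableIdleMillis,
                               long minEvictableIdleMillis,
                               int minIdle) {
        boolean softExpired = softMinEvictableIdleMillis < idleMillis && minIdle < idleCount;
        boolean hardExpired = minEvictableIdleMillis < idleMillis;
        return softExpired || hardExpired;
    }

    public static void main(String[] args) {
        long soft = 60_000L;             // illustrative soft threshold: 60s
        long hard = 1000L * 60L * 30L;   // illustrative hard threshold: 30 minutes
        int minIdle = 2;

        // Idle 90s with 5 idle connections: past the soft threshold and above minIdle.
        System.out.println(shouldEvict(90_000L, 5, soft, hard, minIdle));
        // Idle 90s with only 2 idle connections: kept, the pool is already at minIdle.
        System.out.println(shouldEvict(90_000L, 2, soft, hard, minIdle));
        // Idle 40 minutes: past the hard threshold, evicted regardless of idleCount.
        System.out.println(shouldEvict(2_400_000L, 1, soft, hard, minIdle));
    }
}
```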
References
[Article](https://www.modb.pro/db/21488)