一次線上mongo慢查詢問題排查處理記錄
一、簡單介紹
mongo語句查詢條件中出現(xiàn)null或空串,可能會導(dǎo)致索引失效,查詢優(yōu)化器無法選擇正確的查詢計(jì)劃,出現(xiàn)慢查詢引起服務(wù)異常
mongo查詢的執(zhí)行計(jì)劃使用了LRU緩存,在很多種情況下會失效,導(dǎo)致重新選擇執(zhí)行計(jì)劃并緩存,供后續(xù)同類查詢直接使用;
服務(wù)開始時一直運(yùn)行正常,當(dāng)執(zhí)行計(jì)劃失效后,恰好出現(xiàn)null值的查詢導(dǎo)致選擇了錯誤的執(zhí)行計(jì)劃并緩存,后續(xù)正常的查詢也會出現(xiàn)異常。
二、事件脈絡(luò)
1、起因
用戶中心收到業(yè)務(wù)方反饋,第三方登錄注冊出現(xiàn)頻繁dubbo調(diào)用超時。
org.apache.dubbo.rpc.RpcException: Failed to invoke the method loginWithThird in the service weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService. Tried 1 times of the providers [10.65.5.0:11090] (1/4) from the registry node1.zk.all.platform.wtc.hwhosts.com:2181 on the consumer 10.65.1.81 using the dubbo version 2.7.3-SNAPSHOT. Last error is: loginWithThird_2 failed and fallback failed.
at org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:113)
at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:248)
at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:78)
at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:55)
at org.apache.dubbo.common.bytecode.proxy9.loginWithThird(proxy9.java)
at weli.peanut.user.service.LoginService.loginAndRegister(LoginService.java:684)
at weli.peanut.user.service.LoginService.loginByThirdPartyV2(LoginService.java:629)
at weli.peanut.web.controller.api.UserLoginController.lambda$loginByThirdPartyV2$0(UserLoginController.java:113)
at weli.peanut.common.controller.BaseController.resultForActionWithCat(BaseController.java:71)
at weli.peanut.web.controller.api.UserLoginController.loginByThirdPartyV2(UserLoginController.java:113)
at sun.reflect.GeneratedMethodAccessor2017.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:116)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:197)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1721)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1679)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.netflix.hystrix.exception.HystrixRuntimeException: loginWithThird_2 failed and fallback failed.
at com.netflix.hystrix.AbstractCommand$22.call(AbstractCommand.java:818)
at com.netflix.hystrix.AbstractCommand$22.call(AbstractCommand.java:793)
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:140)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at com.netflix.hystrix.AbstractCommand$DeprecatedOnFallbackHookApplication$1.onError(AbstractCommand.java:1454)
at com.netflix.hystrix.AbstractCommand$FallbackHookApplication$1.onError(AbstractCommand.java:1379)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at rx.observers.Subscribers$5.onError(Subscribers.java:230)
at rx.internal.operators.OnSubscribeThrow.call(OnSubscribeThrow.java:44)
at rx.internal.operators.OnSubscribeThrow.call(OnSubscribeThrow.java:28)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:142)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at rx.internal.operators.OperatorSubscribeOn$1$1.onError(OperatorSubscribeOn.java:59)
at rx.observers.Subscribers$5.onError(Subscribers.java:230)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
at rx.observers.Subscribers$5.onError(Subscribers.java:230)
at com.netflix.hystrix.AbstractCommand$DeprecatedOnRunHookApplication$1.onError(AbstractCommand.java:1413)
at com.netflix.hystrix.AbstractCommand$ExecutionHookApplication$1.onError(AbstractCommand.java:1344)
at rx.observers.Subscribers$5.onError(Subscribers.java:230)
at rx.observers.Subscribers$5.onError(Subscribers.java:230)
at rx.internal.operators.OnSubscribeThrow.call(OnSubscribeThrow.java:44)
at rx.internal.operators.OnSubscribeThrow.call(OnSubscribeThrow.java:28)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94)
at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56)
at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47)
at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69)
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Caused by: java.lang.RuntimeException: org.apache.dubbo.rpc.RpcException: Invoke remote method timeout. method: loginWithThird, provider: dubbo://10.65.5.0:11090/weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService?anyhost=true&application=peanut-admin&bean.name=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService&check=false&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService&lazy=false&loadbalance=leastactive&methods=logout,sendSms,loginWithVisitor,loginWithThird,loginWithPhone,loginWithThirdBindPhone&pid=1®ister=true®ister.ip=10.65.1.81&release=2.7.3-SNAPSHOT&remote.application=wormhole-api&retries=0&revision=1.1.11-SNAPSHOT&side=consumer&sticky=false&timeout=5000×tamp=1647332146309, cause: org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer. start time: 2022-03-15 20:33:31.937, end time: 2022-03-15 20:33:36.968, client elapsed: 0 ms, server elapsed: 5031 ms, timeout: 5000 ms, request: Request [id=690506, version=2.0.2, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=loginWithThird, parameterTypes=[class weli.wormhole.rpc.user.center.model.LoginWithThirdRequest, class weli.wormhole.rpc.user.center.model.LoginCommonParam], arguments=[LoginWithThirdRequest(accessToken=54_y3n-YsttdKOklr_X-mbjRgKAKROLDKfx3PV23zAQHmBvmT3zVliFp9mWodHSUEcKgNnJGC4AP_bAOX-2ZSSsfCHfNM5wfjbJDyiQh-sCjFQ, openId=ofrsY6xraPjv9augA9L2kUIB0-Fk, unionId=null, code=null, type=WX, nickName=null, headImgurl=null, sex=null, skipPhoneBind=1), LoginCommonParam(channel=vivo, ip=null, verName=null, platform=null, deviceId=null, imei=null, app=peanut, subApp=peanut, dfId=DUE_59hD2L9BBgukNR3Jv2DjnW00iRtU8z0eRFVFXzU5aEQyTDlCQmd1a05SM0p2MkRqblcwMGlSdFU4ejBlc2h1, idfa=null, mac=null, oaid=null, androidId=null)], attachments={path=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, version=0.0.0, timeout=5000}]], channel: /10.65.1.81:35450 -> /10.65.5.0:11090
at suishen.esb.hystrix.dubbo.DubboHystrixCommand.run(DubboHystrixCommand.java:93)
at suishen.esb.hystrix.dubbo.DubboHystrixCommand.run(DubboHystrixCommand.java:12)
at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:301)
at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:297)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46)
... 26 more
Caused by: org.apache.dubbo.rpc.RpcException: Invoke remote method timeout. method: loginWithThird, provider: dubbo://10.65.5.0:11090/weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService?anyhost=true&application=peanut-admin&bean.name=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService&check=false&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService&lazy=false&loadbalance=leastactive&methods=logout,sendSms,loginWithVisitor,loginWithThird,loginWithPhone,loginWithThirdBindPhone&pid=1®ister=true®ister.ip=10.65.1.81&release=2.7.3-SNAPSHOT&remote.application=wormhole-api&retries=0&revision=1.1.11-SNAPSHOT&side=consumer&sticky=false&timeout=5000×tamp=1647332146309, cause: org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer. start time: 2022-03-15 20:33:31.937, end time: 2022-03-15 20:33:36.968, client elapsed: 0 ms, server elapsed: 5031 ms, timeout: 5000 ms, request: Request [id=690506, version=2.0.2, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=loginWithThird, parameterTypes=[class weli.wormhole.rpc.user.center.model.LoginWithThirdRequest, class weli.wormhole.rpc.user.center.model.LoginCommonParam], arguments=[LoginWithThirdRequest(accessToken=54_y3n-YsttdKOklr_X-mbjRgKAKROLDKfx3PV23zAQHmBvmT3zVliFp9mWodHSUEcKgNnJGC4AP_bAOX-2ZSSsfCHfNM5wfjbJDyiQh-sCjFQ, openId=ofrsY6xraPjv9augA9L2kUIB0-Fk, unionId=null, code=null, type=WX, nickName=null, headImgurl=null, sex=null, skipPhoneBind=1), LoginCommonParam(channel=vivo, ip=null, verName=null, platform=null, deviceId=null, imei=null, app=peanut, subApp=peanut, dfId=DUE_59hD2L9BBgukNR3Jv2DjnW00iRtU8z0eRFVFXzU5aEQyTDlCQmd1a05SM0p2MkRqblcwMGlSdFU4ejBlc2h1, idfa=null, mac=null, oaid=null, androidId=null)], attachments={path=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, version=0.0.0, timeout=5000}]], channel: /10.65.1.81:35450 -> /10.65.5.0:11090
at org.apache.dubbo.rpc.protocol.AsyncToSyncInvoker.invoke(AsyncToSyncInvoker.java:63)
at org.apache.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:78)
at com.alibaba.dubbo.rpc.Invoker$CompatibleInvoker.invoke(Invoker.java:55)
at suishen.esb.hystrix.dubbo.DubboHystrixCommand.run(DubboHystrixCommand.java:84)
... 30 more
Caused by: java.util.concurrent.ExecutionException: org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer. start time: 2022-03-15 20:33:31.937, end time: 2022-03-15 20:33:36.968, client elapsed: 0 ms, server elapsed: 5031 ms, timeout: 5000 ms, request: Request [id=690506, version=2.0.2, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=loginWithThird, parameterTypes=[class weli.wormhole.rpc.user.center.model.LoginWithThirdRequest, class weli.wormhole.rpc.user.center.model.LoginCommonParam], arguments=[LoginWithThirdRequest(accessToken=54_y3n-YsttdKOklr_X-mbjRgKAKROLDKfx3PV23zAQHmBvmT3zVliFp9mWodHSUEcKgNnJGC4AP_bAOX-2ZSSsfCHfNM5wfjbJDyiQh-sCjFQ, openId=ofrsY6xraPjv9augA9L2kUIB0-Fk, unionId=null, code=null, type=WX, nickName=null, headImgurl=null, sex=null, skipPhoneBind=1), LoginCommonParam(channel=vivo, ip=null, verName=null, platform=null, deviceId=null, imei=null, app=peanut, subApp=peanut, dfId=DUE_59hD2L9BBgukNR3Jv2DjnW00iRtU8z0eRFVFXzU5aEQyTDlCQmd1a05SM0p2MkRqblcwMGlSdFU4ejBlc2h1, idfa=null, mac=null, oaid=null, androidId=null)], attachments={path=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, version=0.0.0, timeout=5000}]], channel: /10.65.1.81:35450 -> /10.65.5.0:11090
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.dubbo.rpc.protocol.AsyncToSyncInvoker.invoke(AsyncToSyncInvoker.java:56)
... 33 more
Caused by: org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer. start time: 2022-03-15 20:33:31.937, end time: 2022-03-15 20:33:36.968, client elapsed: 0 ms, server elapsed: 5031 ms, timeout: 5000 ms, request: Request [id=690506, version=2.0.2, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=loginWithThird, parameterTypes=[class weli.wormhole.rpc.user.center.model.LoginWithThirdRequest, class weli.wormhole.rpc.user.center.model.LoginCommonParam], arguments=[LoginWithThirdRequest(accessToken=54_y3n-YsttdKOklr_X-mbjRgKAKROLDKfx3PV23zAQHmBvmT3zVliFp9mWodHSUEcKgNnJGC4AP_bAOX-2ZSSsfCHfNM5wfjbJDyiQh-sCjFQ, openId=ofrsY6xraPjv9augA9L2kUIB0-Fk, unionId=null, code=null, type=WX, nickName=null, headImgurl=null, sex=null, skipPhoneBind=1), LoginCommonParam(channel=vivo, ip=null, verName=null, platform=null, deviceId=null, imei=null, app=peanut, subApp=peanut, dfId=DUE_59hD2L9BBgukNR3Jv2DjnW00iRtU8z0eRFVFXzU5aEQyTDlCQmd1a05SM0p2MkRqblcwMGlSdFU4ejBlc2h1, idfa=null, mac=null, oaid=null, androidId=null)], attachments={path=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, interface=weli.wormhole.rpc.user.center.api.ILoginNeedPhoneService, version=0.0.0, timeout=5000}]], channel: /10.65.1.81:35450 -> /10.65.5.0:11090
at org.apache.dubbo.remoting.exchange.support.DefaultFuture.doReceived(DefaultFuture.java:189)
at org.apache.dubbo.remoting.exchange.support.DefaultFuture.received(DefaultFuture.java:153)
at org.apache.dubbo.remoting.exchange.support.DefaultFuture$TimeoutCheckTask.run(DefaultFuture.java:252)
at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:648)
at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:727)
at org.apache.dubbo.common.timer.HashedWheelTimer$Worker.run(HashedWheelTimer.java:449)
... 1 more
2、排查處理
a、查詢用戶中心cat監(jiān)控,確認(rèn)用戶中心是否收到請求且正常響應(yīng)。
當(dāng)前狀態(tài):

正常狀態(tài):

對比結(jié)果:
用戶中心已收到正常請求,排除客戶端調(diào)用問題;
用戶中心當(dāng)前響應(yīng)時間異常高出平常的響應(yīng)時間,判斷為用戶中心內(nèi)部業(yè)務(wù)處理出現(xiàn)問題。
b、查看kibana日志,發(fā)現(xiàn)無異常日志;
c、使用Arthas診斷工具,監(jiān)控業(yè)務(wù)調(diào)用鏈各部分耗時情況;

定位原因?yàn)閙ongo查詢慢導(dǎo)致整體業(yè)務(wù)處理超時。
d、分析mongo查詢語句,查看索引情況
public User getUserByDfId(String dfId, String app) {
Criteria criteria = Criteria.where("userBindInfo.dfId").is(dfId).and("app").is(app).and("status").is(1);
return this.mongoTemplate.findOne(Query.query(criteria), User.class);
}
db.getCollection('wormhole_user').getIndexes();
[
{
"v" : 1,
"key" : {
"app" : 1
},
"name" : "app",
"ns" : "wormhole.wormhole_user",
"background" : true
},
{
"v" : 1,
"key" : {
"userBindInfo.dfId" : 1
},
"name" : "userBindInfo.dfId",
"ns" : "wormhole.wormhole_user",
"background" : true
},
... (其余不關(guān)注索引已刪除)
]
“userBindInfo.dfId”字段存在索引,查詢語句也無明顯異常。
e、查看華為云后臺mongo慢日志
{
"op": "query",
"ns": "wormhole.wormhole_user",
"command": {
"find": "wormhole_user",
"filter": {
"userBindInfo.dfId": "DUQmN3QLpLRwC8PNe1joT_9SDRrzazT4cUc5RFVRbU4zUUxwTFJ3QzhQTmUxam9UXzlTRFJyemF6VDRjVWM1c2h1",
"app": "maybe",
"status": 1
},
"ntoreturn": -1
},
"keysExamined": 1870039,
"docsExamined": 1870039,
"cursorExhausted": true,
"numYield": 14836,
"nreturned": 0,
"locks": {
"Global": {
"acquireCount": {
"r": 14837
}
},
"Database": {
"acquireCount": {
"r": 14837
}
},
"Collection": {
"acquireCount": {
"r": 14837
}
}
},
"responseLength": 36,
"millis": 14545,
"planSummary": "IXSCAN { app: 1 }",
"execStats": {
"stage": "CACHED_PLAN",
"nReturned": 0,
"executionTimeMillisEstimate": 14552,
"works": 1,
"advanced": 0,
"needTime": 0,
"needYield": 0,
"saveState": 14836,
"restoreState": 14836,
"isEOF": 1,
"invalidates": 0,
"inputStage": {
"stage": "LIMIT",
"nReturned": 0,
"executionTimeMillisEstimate": 14145,
"works": 1870040,
"advanced": 0,
"needTime": 1870039,
"needYield": 0,
"saveState": 14836,
"restoreState": 14836,
"isEOF": 1,
"invalidates": 0,
"limitAmount": 1,
"inputStage": {
"stage": "FETCH",
"filter": {
"$and": [
{
"status": {
"$eq": 1
}
},
{
"userBindInfo.dfId": {
"$eq": "DUQmN3QLpLRwC8PNe1joT_9SDRrzazT4cUc5RFVRbU4zUUxwTFJ3QzhQTmUxam9UXzlTRFJyemF6VDRjVWM1c2h1"
}
}
]
},
"nReturned": 0,
"executionTimeMillisEstimate": 14117,
"works": 1870040,
"advanced": 0,
"needTime": 1870039,
"needYield": 0,
"saveState": 14836,
"restoreState": 14836,
"isEOF": 1,
"invalidates": 0,
"docsExamined": 1870039,
"alreadyHasObj": 0,
"inputStage": {
"stage": "IXSCAN",
"nReturned": 1870039,
"executionTimeMillisEstimate": 931,
"works": 1870040,
"advanced": 1870039,
"needTime": 0,
"needYield": 0,
"saveState": 14836,
"restoreState": 14836,
"isEOF": 1,
"invalidates": 0,
"keyPattern": {
"app": 1
},
"indexName": "app",
"isMultiKey": false,
"multiKeyPaths": {
"app": []
},
"isUnique": false,
"isSparse": false,
"isPartial": false,
"indexVersion": 1,
"direction": "forward",
"indexBounds": {
"app": [
"["maybe", "maybe"]"
]
},
"keysExamined": 1870039,
"seeks": 1,
"dupsTested": 0,
"dupsDropped": 0,
"seenInvalidated": 0
}
}
}
},
"ts": {
"$date": 1647359086553
},
"client": "10.65.5.0",
"allUsers": [
{
"user": "mongosiud",
"db": "wormhole"
}
],
"user": "mongosiud@wormhole"
}
通過慢日志發(fā)現(xiàn),mongo并未使用“userBindInfo.dfId”作為索引查詢條件,而是使用了“app”作為索引查詢條件。
初步認(rèn)為索引匹配度不夠,mongo沒正確使用索引。
f、新增聯(lián)合索引,提高索引匹配度,查詢時間恢復(fù)正常,接口恢復(fù)正常。
db.getCollection("wormhole_user").createIndex({
"userBindInfo.dfId": 1,
"app": 1,
"status": 1
}, {
name: "idx_user_app_userBindInfo.dfId",
background: true
});
3、問題分析
a、explain()語句分析
db.getCollection('wormhole_user').find({"userBindInfo.dfId": "DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1", "app": "maybe", "status": 1}).explain();
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "wormhole.wormhole_user",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"app" : {
"$eq" : "maybe"
}
},
{
"status" : {
"$eq" : 1.0
}
},
{
"userBindInfo.dfId" : {
"$eq" : "DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1"
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"app" : {
"$eq" : "maybe"
}
},
{
"status" : {
"$eq" : 1.0
}
}
]
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userBindInfo.dfId" : 1
},
"indexName" : "userBindInfo.dfId",
"isMultiKey" : false,
"multiKeyPaths" : {
"userBindInfo.dfId" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"userBindInfo.dfId" : [
"[\"DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1\", \"DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1\"]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"status" : {
"$eq" : 1.0
}
},
{
"userBindInfo.dfId" : {
"$eq" : "DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1"
}
}
]
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"app" : 1
},
"indexName" : "app",
"isMultiKey" : false,
"multiKeyPaths" : {
"app" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"app" : [
"[\"maybe\", \"maybe\"]"
]
}
}
},
{
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"app" : {
"$eq" : "maybe"
}
},
{
"userBindInfo.dfId" : {
"$eq" : "DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1"
}
},
{
"status" : {
"$eq" : 1.0
}
}
]
},
"inputStage" : {
"stage" : "AND_SORTED",
"inputStages" : [
{
"stage" : "IXSCAN",
"keyPattern" : {
"app" : 1
},
"indexName" : "app",
"isMultiKey" : false,
"multiKeyPaths" : {
"app" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"app" : [
"[\"maybe\", \"maybe\"]"
]
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"userBindInfo.dfId" : 1
},
"indexName" : "userBindInfo.dfId",
"isMultiKey" : false,
"multiKeyPaths" : {
"userBindInfo.dfId" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"userBindInfo.dfId" : [
"[\"DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1\", \"DUqTmV8lQcInjzWtuyJuAntkbRi5_YZ3yHd9RFVxVG1WOGxRY0luanpXdHV5SnVBbnRrYlJpNV9ZWjN5SGQ5c2h1\"]"
]
}
}
]
}
}
]
},
"serverInfo" : {
"host" : "host-192-168-10-163",
"port" : 27017,
"version" : "4.0.3",
"gitVersion" : "0ead8608f9d151a199b05117bcc79ccb8d5f44a0"
},
"ok" : 1.0,
"operationTime" : Timestamp(1647486920, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1647486920, 2),
"signature" : {
"hash" : { "$binary" : "ylTAtLK6mVpNwDt7PTGwNST/9z4=", "$type" : "00" },
"keyId" : NumberLong(7031512438860152835)
}
}
}
- 發(fā)現(xiàn)mongo執(zhí)行計(jì)劃選擇正確,這正好解釋了服務(wù)在之前很長的一段時間內(nèi)都是正常的原因;
- mongo慢日志中的執(zhí)行計(jì)劃卻選擇錯誤,表明有什么原因?qū)е铝薽ongo執(zhí)行計(jì)劃的選擇發(fā)生了變更。
b、文檔查詢
- MongoDB 查詢優(yōu)化器會緩存最有效的查詢計(jì)劃,關(guān)聯(lián)的計(jì)劃緩存條目會用于具有相同查詢形狀的后續(xù)查詢;
- 計(jì)劃緩存會出現(xiàn)刷新:
- MongoDB重啟;
- 索引或者集合的刪除添加更新等操作;
- 最近最少使用 (LRU) 緩存替換機(jī)制清除最近最少訪問的緩存條目。
結(jié)論:之前使用了計(jì)劃緩存服務(wù)正常,到了某一時刻,計(jì)劃緩存失效,MongoDB重新選擇計(jì)劃,此時選擇了錯誤的計(jì)劃并緩存,導(dǎo)致后續(xù)查詢?nèi)砍霈F(xiàn)問題
c、查詢首次出現(xiàn)的慢日志
{
"op":"query",
"ns":"wormhole.wormhole_user",
"command":{
"find":"wormhole_user",
"filter":{
"userBindInfo.dfId":null,
"app":"cybercat",
"status":1
},
"ntoreturn":-1
},
"keysExamined":872550,
"docsExamined":872550,
"fromMultiPlanner":true,
"replanned":true,
"cursorExhausted":true,
"numYield":13785,
"nreturned":0,
"locks":{
"Global":{
"acquireCount":{
"r":13786
}
},
"Database":{
"acquireCount":{
"r":13786
}
},
"Collection":{
"acquireCount":{
"r":13786
}
}
},
"responseLength":36,
"millis":27864,
"planSummary":"IXSCAN { app: 1 }",
"execStats":{
"stage":"LIMIT",
"nReturned":0,
"executionTimeMillisEstimate":26216,
"works":872551,
"advanced":0,
"needTime":872550,
"needYield":0,
"saveState":13785,
"restoreState":13785,
"isEOF":1,
"invalidates":0,
"limitAmount":1,
"inputStage":{
"stage":"FETCH",
"filter":{
"$and":[
{
"status":{
"$eq":1
}
},
{
"userBindInfo.dfId":{
"$eq":null
}
}
]
},
"nReturned":0,
"executionTimeMillisEstimate":26164,
"works":872551,
"advanced":0,
"needTime":872550,
"needYield":0,
"saveState":13785,
"restoreState":13785,
"isEOF":1,
"invalidates":0,
"docsExamined":872550,
"alreadyHasObj":0,
"inputStage":{
"stage":"IXSCAN",
"nReturned":872550,
"executionTimeMillisEstimate":249,
"works":872551,
"advanced":872550,
"needTime":0,
"needYield":0,
"saveState":13785,
"restoreState":13785,
"isEOF":1,
"invalidates":0,
"keyPattern":{
"app":1
},
"indexName":"app",
"isMultiKey":false,
"multiKeyPaths":{
"app":[
]
},
"isUnique":false,
"isSparse":false,
"isPartial":false,
"indexVersion":1,
"direction":"forward",
"indexBounds":{
"app":[
"["cybercat", "cybercat"]"
]
},
"keysExamined":872550,
"seeks":1,
"dupsTested":0,
"dupsDropped":0,
"seenInvalidated":0
}
}
},
"ts":{
"$date":1647309521700
},
"client":"10.65.1.25",
"allUsers":[
{
"user":"mongosiud",
"db":"wormhole"
}
],
"user":"mongosiud@wormhole"
}
發(fā)現(xiàn)查詢條件中出現(xiàn)了null值,本地再次分析:
db.getCollection('wormhole_user').find({"userBindInfo.dfId": null, "app": "maybe", "status": 1}).explain();
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "wormhole.wormhole_user",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"app" : {
"$eq" : "maybe"
}
},
{
"status" : {
"$eq" : 1.0
}
},
{
"userBindInfo.dfId" : {
"$eq" : null
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"status" : {
"$eq" : 1.0
}
},
{
"userBindInfo.dfId" : {
"$eq" : null
}
}
]
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"app" : 1
},
"indexName" : "app",
"isMultiKey" : false,
"multiKeyPaths" : {
"app" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"app" : [
"[\"maybe\", \"maybe\"]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"userBindInfo.dfId" : {
"$eq" : null
}
},
{
"app" : {
"$eq" : "maybe"
}
},
{
"status" : {
"$eq" : 1.0
}
}
]
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userBindInfo.dfId" : 1
},
"indexName" : "userBindInfo.dfId",
"isMultiKey" : false,
"multiKeyPaths" : {
"userBindInfo.dfId" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"userBindInfo.dfId" : [
"[undefined, undefined]",
"[null, null]"
]
}
}
}
]
},
"serverInfo" : {
"host" : "host-192-168-10-163",
"port" : 27017,
"version" : "4.0.3",
"gitVersion" : "0ead8608f9d151a199b05117bcc79ccb8d5f44a0"
},
"ok" : 1.0,
"operationTime" : Timestamp(1647489666, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1647489666, 1),
"signature" : {
"hash" : { "$binary" : "1KJI3aoz2QbOwTKlbkNl9SmWLzw=", "$type" : "00" },
"keyId" : NumberLong(7031512438860152835)
}
}
}
結(jié)果與線上表現(xiàn)一致
4、回顧
緊急處理時,直接創(chuàng)建了一個新的索引,導(dǎo)致了計(jì)劃緩存失效,重新選擇計(jì)劃并緩存,所以服務(wù)恢復(fù)了正常。
三、總結(jié)
- 避免空值的查詢;
- 盡可能的使用explain()分析各種不同的查詢情況;
- 可以使用mongo提供的PlanCache相關(guān)功能,查看計(jì)劃緩存情況;
- 可使用hint()推薦查詢索引。
到此這篇關(guān)于一次線上mongo慢查詢問題排查處理的文章就介紹到這了,更多相關(guān)mongo慢查詢問題排查內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家!
相關(guān)文章
MongoDB數(shù)據(jù)庫兩階段提交實(shí)現(xiàn)事務(wù)的方法詳解
這篇文章主要介紹了MongoDB數(shù)據(jù)庫兩階段提交實(shí)現(xiàn)事務(wù)的方法,結(jié)合實(shí)例形式詳細(xì)分析了MongoDB數(shù)據(jù)庫事務(wù)提交、回滾、撤銷等操作的原理、實(shí)現(xiàn)方法及相關(guān)操作注意事項(xiàng),需要的朋友可以參考下2018-08-08
基于Mongodb分布式鎖解決定時任務(wù)并發(fā)執(zhí)行問題
這篇文章主要介紹了基于Mongodb分布式鎖解決定時任務(wù)并發(fā)執(zhí)行問題,網(wǎng)上有很多分布式鎖的實(shí)現(xiàn)方案,基于redis、zk、等有很多,但是我的就是一個用了mysql和mongo的小應(yīng)用,本文給大家詳細(xì)講解,需要的朋友可以參考下2023-04-04
MongoDB添加仲裁節(jié)點(diǎn)報(bào)錯:replica set IDs do not match的解決方法
這篇文章主要給大家介紹了關(guān)于MongoDB添加仲裁節(jié)點(diǎn)報(bào)錯:replica set IDs do not match的解決方法,文中通過示例代碼介紹的非常詳細(xì),需要的朋友可以參考借鑒,下面隨著小編來一起學(xué)習(xí)學(xué)習(xí)吧2018-11-11
Windows系統(tǒng)啟動MongoDB報(bào)錯無法連接服務(wù)器的問題及解決方案
在Windows系統(tǒng)中啟動MongoDB時遇到連接拒絕的錯誤,通常是因?yàn)榉?wù)未運(yùn)行或配置問題,本文給大家分享Windows系統(tǒng)啟動MongoDB報(bào)錯無法連接服務(wù)器的問題及解決方案,一起看看吧2024-10-10
Windows系統(tǒng)安裝運(yùn)行Mongodb服務(wù)
今天小編就為大家分享一篇關(guān)于Windows系統(tǒng)安裝運(yùn)行Mongodb服務(wù),小編覺得內(nèi)容挺不錯的,現(xiàn)在分享給大家,具有很好的參考價值,需要的朋友一起跟隨小編來看看吧2018-10-10
MongoDB的基本特性與內(nèi)部構(gòu)造的講解
今天小編就為大家分享一篇關(guān)于MongoDB的基本特性與內(nèi)部構(gòu)造的講解,小編覺得內(nèi)容挺不錯的,現(xiàn)在分享給大家,具有很好的參考價值,需要的朋友一起跟隨小編來看看吧2019-03-03

