β

Druid historical/broker 节点启动失败

致一 5 阅读

部署Druid服务时遇到了启动失败的异常。相关的节点是historical和broker。异常信息如下:

-07-12T09:01:30,523 ERROR [main] io.druid.cli.CliHistorical - Error when starting up.  Failing.
com.google.inject.ProvisionException: Unable to provision, see the following errors:
) Not enough direct memory.  Please adjust -XX:MaxDirectMemorySize, druid.processing.buffer.sizeBytes, druid.processing.numThreads, or druid.processing.numMergeBuffers: maxDirectMemory[4,294,967,296], memoryNeeded[5,368,709,120] = druid.processing.buffer.sizeBytes[536,870,912] * (druid.processing.numMergeBuffers[2] + druid.processing.numThreads[7] + 1)
  at io.druid.guice.DruidProcessingModule.getIntermediateResultsPool(DruidProcessingModule.java:110) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> io.druid.guice.DruidProcessingModule)
  at io.druid.guice.DruidProcessingModule.getIntermediateResultsPool(DruidProcessingModule.java:110) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> io.druid.guice.DruidProcessingModule)
  while locating io.druid.collections.NonBlockingPool<java.nio.ByteBuffer> annotated with @io.druid.guice.annotations.Global()
    for the 2nd parameter of io.druid.query.groupby.GroupByQueryEngine.<init>(GroupByQueryEngine.java:81)
  at io.druid.guice.QueryRunnerFactoryModule.configure(QueryRunnerFactoryModule.java:88) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> io.druid.guice.QueryRunnerFactoryModule)
  while locating io.druid.query.groupby.GroupByQueryEngine
    for the 2nd parameter of io.druid.query.groupby.strategy.GroupByStrategyV1.<init>(GroupByStrategyV1.java:77)
  while locating io.druid.query.groupby.strategy.GroupByStrategyV1
    for the 2nd parameter of io.druid.query.groupby.strategy.GroupByStrategySelector.<init>(GroupByStrategySelector.java:43)
  while locating io.druid.query.groupby.strategy.GroupByStrategySelector
    for the 1st parameter of io.druid.query.groupby.GroupByQueryQueryToolChest.<init>(GroupByQueryQueryToolChest.java:104)
  at io.druid.guice.QueryToolChestModule.configure(QueryToolChestModule.java:101) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> io.druid.guice.QueryRunnerFactoryModule)
  while locating io.druid.query.groupby.GroupByQueryQueryToolChest
  while locating io.druid.query.QueryToolChest annotated with @com.google.inject.multibindings.Element(setName=,uniqueId=80, type=MAPBINDER, keyType=java.lang.Class<? extends io.druid.query.Query>)
  at io.druid.guice.DruidBinders.queryToolChestBinder(DruidBinders.java:45) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> io.druid.guice.QueryRunnerFactoryModule -> com.google.inject.multibindings.MapBinder$RealMapBinder)
  while locating java.util.Map<java.lang.Class<? extends io.druid.query.Query>, io.druid.query.QueryToolChest>
    for the 1st parameter of io.druid.query.MapQueryToolChestWarehouse.<init>(MapQueryToolChestWarehouse.java:36)
  while locating io.druid.query.MapQueryToolChestWarehouse
  while locating io.druid.query.QueryToolChestWarehouse
    for the 1st parameter of io.druid.server.QueryLifecycleFactory.<init>(QueryLifecycleFactory.java:52)
  at io.druid.server.QueryLifecycleFactory.class(QueryLifecycleFactory.java:52)
  while locating io.druid.server.QueryLifecycleFactory
    for the 1st parameter of io.druid.server.QueryResource.<init>(QueryResource.java:113)
  at io.druid.server.QueryResource.class(QueryResource.java:78)
  while locating io.druid.server.QueryResource


......
 errors
	at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1028) ~[guice-4.1.0.jar:?]
	at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1050) ~[guice-4.1.0.jar:?]
	at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:132) ~[druid-api-0.12.1.jar:0.12.1]
	at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:101) [druid-services-0.12.1.jar:0.12.1]
	at io.druid.cli.ServerRunnable.run(ServerRunnable.java:50) [druid-services-0.12.1.jar:0.12.1]
	at io.druid.cli.Main.main(Main.java:116) [druid-services-0.12.1.jar:0.12.1]

异常信息中比较关键的部分是下面这一句:

Not enough direct memory.  Please adjust -XX:MaxDirectMemorySize, druid.processing.buffer.sizeBytes, druid.processing.numThreads, or druid.processing.numMergeBuffers: maxDirectMemory[4,294,967,296], memoryNeeded[5,368,709,120] = druid.processing.buffer.sizeBytes[536,870,912] * (druid.processing.numMergeBuffers[2] + druid.processing.numThreads[7] + 1)

这一句指明了启动失败的原因:分配的直接内存不足。需要的直接内存大小是“5,368,709,120”,实际提供的大小是“4,294,967,296”。

并且指明了解决方案:在虚拟机参数中添加“-XX:MaxDirectMemorySize”这样一个指标。

以为这样就够了!?druid的开发人员做的实际上还要多一点,他(们)在这段异常信息中还解释了为什么需要这么多直接内存,看看异常信息中提供的计算公式:

memoryNeeded[5,368,709,120] 
	= druid.processing.buffer.sizeBytes[536,870,912] 
		* (druid.processing.numMergeBuffers[2] + druid.processing.numThreads[7] + 1)

为了看着方便,我手动做了下换行。

公式中使用到的配置参数如druid.processing.buffer.sizeBytes和druid.processing.numMergeBuffers等可以在historical和middleManager的runtime.properties文件中找到。方括号里的值是用户自己设置的值。

至于怎么修改这个问题,网上找到的资料多是建议在middleManager的runtime.properties的druid.indexer.runner.javaOpts配置项中进行配置,我试了一下,不中。

忘了说了,使用的druid版本是0.12.1。

成功了的做法是分别在conf/druid/historical/jvm.config和conf/druid/broker/jvm.config中添加“-XX:MaxDirectMemorySize”参数。参数值可以参考异常信息的提示酌情设置。这里是我的historical/jvm.config的配置:

-server
-Xms8g
-Xmx8g
-XX:MaxDirectMemorySize=8G
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

设置完成后,重启,可以成功,说明配置生效。就这样。

######

作者:致一
预则立,不预则废
原文地址:Druid historical/broker 节点启动失败, 感谢原作者分享。

发表评论