Spark Worker 在 Heartbeater 中与 Spark Driver 通信的超时时间为 3600 秒

2024-05-05

我没有配置任何超时值,而是使用默认设置。 在哪里配置3600秒超时?怎么解决呢?

错误信息:

18/01/10 13:51:44 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [3600 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:738)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:767)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:767)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:767)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1948)
    at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:767)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [3600 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    ... 14 more

错误消息中写道:

这个超时时间由spark.executor.heartbeatInterval控制

因此,您尝试的第一件事就是增加该值。可以通过多种方式完成,例如将值增加到 10000 秒:

  • 使用时spark-submit只需添加标志:

    --conf spark.executor.heartbeatInterval=10000s
    
  • 您可以在spark-defaults.conf中添加一行:

    spark.executor.heartbeatInterval 10000s
    
  • 当创建一个新的SparkSession在你的程序中,添加一个配置参数(Scala):

    val spark = SparkSession.builder
      .config("spark.executor.heartbeatInterval", "10000s")
      .getOrCreate()
    

如果这没有帮助,尝试增加以下值可能是个好主意spark.network.timeout以及。这也是与这些类型的超时相关的问题的常见根源。

本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

Spark Worker 在 Heartbeater 中与 Spark Driver 通信的超时时间为 3600 秒 的相关文章

随机推荐