What is the maximum size of an object that can be dispatched to all executors when using the DataFrame broadcast function http://spark.apache.org/docs/2.0.0/api/java/org/apache/spark/sql/functions.html#broadcast(org.apache.spark.sql.Dataset) or the SparkContext broadcast function https://spark.apache.org/docs/2.0.0/api/java/org/apache/spark/SparkContext.html#broadcast(T,%20scala.reflect.ClassTag)?
As of Spark 2.4, the upper limit is 8 GB. The check lives in BroadcastExchangeExec; see the source code: https://github.com/apache/spark/blob/79c66894296840cc4a5bf6c8718ecfd2b08bcca8/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L104
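To make the number concrete, here is a minimal pure-Python sketch of the kind of size check the linked source performs. It is not Spark code: the names `MAX_BROADCAST_TABLE_BYTES` and `exceeds_broadcast_limit` are chosen here for illustration, and the sketch assumes the check is a simple greater-than-or-equal comparison against 8 GB expressed as `8 << 30` bytes.

```python
# Illustrative sketch only; these names are assumptions, not part of any Spark API.
# 8 GB expressed in bytes: 8 * 2**30.
MAX_BROADCAST_TABLE_BYTES = 8 << 30


def exceeds_broadcast_limit(data_size_bytes: int) -> bool:
    """Return True if a table of this size would hit the assumed 8 GB cap."""
    return data_size_bytes >= MAX_BROADCAST_TABLE_BYTES


if __name__ == "__main__":
    one_gb = 1 << 30
    print(MAX_BROADCAST_TABLE_BYTES)            # 8589934592
    print(exceeds_broadcast_limit(one_gb))      # False
    print(exceeds_broadcast_limit(8 * one_gb))  # True
```

In practice it is usually worth estimating the size of the data before broadcasting it, since a table anywhere near this cap also has to fit in memory on the driver and on every executor.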
Update:
The 8 GB limit still holds for Spark 3.2.1. Source code: https://github.com/apache/spark/blob/b8d3da16b1bdde60b50e364a4ff98cb6bf8ccd7e/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L225
Update:
The limit still holds for Spark 3.4. Source code: https://github.com/apache/spark/blob/e17d8ecabcad6e84428752b977120ff355a4007a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L226