Why is Guava not properly shaded in my build.sbt?

2023-12-24

tl;dr: Here https://github.com/erip/shading-repro-lagom-hdfs is the repository that reproduces the problem.


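Cassandra and HDFS both use Guava internally, but for various reasons neither of them shades the dependency. Because their Guava versions are not binary compatible, I am getting NoSuchMethodErrors at runtime.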

I have tried to shade Guava myself in my build.sbt:

val HadoopVersion =  "2.6.0-cdh5.11.0"

// ...

val hadoopHdfs = "org.apache.hadoop" % "hadoop-hdfs" % HadoopVersion
val hadoopCommon = "org.apache.hadoop" % "hadoop-common" % HadoopVersion
val hadoopHdfsTest = "org.apache.hadoop" % "hadoop-hdfs" % HadoopVersion % "test" classifier "tests"
val hadoopCommonTest = "org.apache.hadoop" % "hadoop-common" % HadoopVersion % "test" classifier "tests"
val hadoopMiniDFSCluster = "org.apache.hadoop" % "hadoop-minicluster" % HadoopVersion % Test

// ...

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "shade.com.google.common.@1").inLibrary(hadoopHdfs).inProject,
  ShadeRule.rename("com.google.common.**" -> "shade.com.google.common.@1").inLibrary(hadoopCommon).inProject,
  ShadeRule.rename("com.google.common.**" -> "shade.com.google.common.@1").inLibrary(hadoopHdfsTest).inProject,
  ShadeRule.rename("com.google.common.**" -> "shade.com.google.common.@1").inLibrary(hadoopCommonTest).inProject,
  ShadeRule.rename("com.google.common.**" -> "shade.com.google.common.@1").inLibrary(hadoopMiniDFSCluster).inProject
)

assemblyJarName in assembly := s"${name.value}-${version.value}.jar"

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
  case _ => MergeStrategy.first
}

But the runtime exception persists (ha, that's a Cassandra joke, folks).

The specific exception is

[info] HdfsEntitySpec *** ABORTED ***
[info]   java.lang.NoSuchMethodError: com.google.common.base.Objects.toStringHelper(Ljava/lang/Object;)Lcom/google/common/base/Objects$ToStringHelper;
[info]   at org.apache.hadoop.metrics2.lib.MetricsRegistry.toString(MetricsRegistry.java:406)
[info]   at java.lang.String.valueOf(String.java:2994)
[info]   at java.lang.StringBuilder.append(StringBuilder.java:131)
[info]   at org.apache.hadoop.ipc.metrics.RetryCacheMetrics.<init>(RetryCacheMetrics.java:46)
[info]   at org.apache.hadoop.ipc.metrics.RetryCacheMetrics.create(RetryCacheMetrics.java:53)
[info]   at org.apache.hadoop.ipc.RetryCache.<init>(RetryCache.java:202)
[info]   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initRetryCache(FSNamesystem.java:1038)
[info]   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:949)
[info]   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:796)
[info]   at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1040)
[info]   ...

How can I properly shade Guava to stop the runtime errors?


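The shade rules apply only when you build a fat jar with the assembly task. They are not applied during other sbt tasks, such as test, which is where the HdfsEntitySpec failure above is reported.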
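One way to see this for yourself is to list the Guava jars that the test classpath actually contains. The following is only a minimal sketch: `printTestGuava` is a hypothetical task name invented for this example, and it assumes the Guava artifact's file name starts with "guava". Since no shading happens outside of assembly, whatever it prints is the original, unrenamed Guava.

```scala
// build.sbt
// Hypothetical helper task (the name is made up for this sketch):
// prints the file names of Guava jars visible to the test classpath.
lazy val printTestGuava = taskKey[Unit]("Print the Guava jars on the test classpath")

printTestGuava := {
  // fullClasspath is a sequence of attributed files; .data yields the underlying File
  val jarNames = (fullClasspath in Test).value.map(_.data.getName)
  jarNames.filter(_.startsWith("guava")).foreach(println)
}
```

Running `sbt printTestGuava` should then show which single Guava version won dependency resolution for the tests.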

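If you want to shade a library that lives inside the Hadoop dependencies, you can create a new project that contains only the Hadoop dependencies, shade the library there, and publish a fat jar containing all of the shaded Hadoop dependencies.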

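This is not a perfect solution, because all the dependencies inside the new Hadoop jar are "unknown" to whoever uses it, so you will need to handle conflicts manually.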

Here is the code you would need in your build.sbt to publish a fat Hadoop jar (based on your code and the sbt-assembly docs https://github.com/sbt/sbt-assembly#publishing-not-recommended):

val HadoopVersion =  "2.6.0-cdh5.11.0"

val hadoopHdfs = "org.apache.hadoop" % "hadoop-hdfs" % HadoopVersion
val hadoopCommon = "org.apache.hadoop" % "hadoop-common" % HadoopVersion
val hadoopHdfsTest = "org.apache.hadoop" % "hadoop-hdfs" % HadoopVersion classifier "tests"
val hadoopCommonTest = "org.apache.hadoop" % "hadoop-common" % HadoopVersion classifier "tests"
val hadoopMiniDFSCluster = "org.apache.hadoop" % "hadoop-minicluster" % HadoopVersion

lazy val fatJar = project
  .enablePlugins(AssemblyPlugin)
  .settings(
    libraryDependencies ++= Seq(
        hadoopHdfs,
        hadoopCommon,
        hadoopHdfsTest,
        hadoopCommonTest,
        hadoopMiniDFSCluster
    ),
    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.google.common.**" -> "shade.@0").inAll
    ),
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
      case _ => MergeStrategy.first
    },
    artifact in (Compile, assembly) := {
      val art = (artifact in (Compile, assembly)).value
      art.withClassifier(Some("assembly"))
    },
    addArtifact(artifact in (Compile, assembly), assembly),
    crossPaths := false, // Do not append Scala versions to the generated artifacts
    autoScalaLibrary := false, // This forbids including Scala related libraries into the dependency
    skip in publish := true
  )

lazy val shaded_hadoop = project
  .settings(
    name := "shaded-hadoop",
    packageBin in Compile := (assembly in (fatJar, Compile)).value
  )

I haven't tested it, but that's the gist of it.
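For completeness, here is one possible way the application project could pick up the shaded jar without publishing it first. This is an untested sketch of an alternative wiring, not part of the setup above: `app` is a hypothetical project name, and it assumes the `fatJar` project is defined in the same build. The idea is to add the assembled jar as an unmanaged jar instead of depending on the unshaded Hadoop artifacts.

```scala
// build.sbt (sketch, untested)
// `app` is a hypothetical name for the module that talks to HDFS.
lazy val app = project
  .settings(
    // Put the shaded fat jar produced by `fatJar` on the compile classpath.
    // The unshaded Hadoop artifacts should then NOT appear in libraryDependencies,
    // otherwise both copies of the Hadoop classes end up on the classpath.
    unmanagedJars in Compile += Attributed.blank((assembly in fatJar).value)
  )
```

With this wiring, building `app` triggers the assembly of `fatJar` first. Whether the test configuration sees the jar as well is worth verifying in your own build, for example by inspecting `fullClasspath in Test`.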


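I'd also like to point out another issue I noticed: your merge strategy may cause you problems, because you will want to apply different strategies to certain files. See the default strategy here https://github.com/sbt/sbt-assembly#merge-strategy.
I would suggest using something like the following, which keeps the original strategy for everything that is not deduplicate: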

assemblyMergeStrategy in assembly := {
  entry: String => {
    val strategy = (assemblyMergeStrategy in assembly).value(entry)
    if (strategy == MergeStrategy.deduplicate) MergeStrategy.first
    else strategy
  }
}