Where can I find a more explicit error, given a container error status code?

2024-02-28

I am running tasks through a Mesos stack that uses Docker containers.

Sometimes, some of the tasks fail.

Here are some of the related TaskStatus messages and reasons:

message: Container exited with status 1 - reason: REASON_COMMAND_EXECUTOR_FAILED
message: Container exited with status 42 - reason: REASON_COMMAND_EXECUTOR_FAILED
message: Container exited with status 137 - reason: REASON_COMMAND_EXECUTOR_FAILED

Is there a correspondence table that maps a container error status code to a more explicit error than the TaskStatus message?


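A command task can fail for many reasons and sets an exit code accordingly. For example, Docker 1.10 sets exit status codes as follows (from the documentation https://docs.docker.com/engine/reference/run/#exit-status and this answer https://stackoverflow.com/a/35410993/1387612):

The exit code from docker run gives information about why the container failed to run or why it exited. When docker run exits with a non-zero code, the exit codes follow the chroot standard, see below:

125 if the error is with the Docker daemon itself: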

$ docker run --foo busybox; echo $?
# flag provided but not defined: --foo   See 'docker run --help'.   

126 if the contained command cannot be invoked:

$ docker run busybox /etc; echo $?
# docker: Error response from daemon: Container command '/etc' could not be invoked.   

127 if the contained command cannot be found:

$ docker run busybox foo; echo $?
# docker: Error response from daemon: Container command 'foo' not found or does not exist.
# 127

Otherwise, the exit code of the contained command:

$ docker run busybox /bin/sh -c 'exit 3'; echo $?
# 3

Another set of exit code conventions can be found here: http://tldp.org/LDP/abs/html/exitcodes.html

| Code  |            Meaning             |         Example         |                                                   Comments                                                   |
|-------|--------------------------------|-------------------------|--------------------------------------------------------------------------------------------------------------|
| 1     | Catchall for general errors    | let "var1 = 1/0"        | Miscellaneous errors, such as "divide by zero" and other impermissible operations                            |
| 2     | Misuse of shell builtins       | empty_function() {}     | Missing keyword or command, or permission problem (and diff return code on a failed binary file comparison). |
| 126   | Command invoked cannot execute | /dev/null               | Permission problem or command is not an executable                                                           |
| 127   | "command not found"            | illegal_command         | Possible problem with $PATH or a typo                                                                        |
| 128   | Invalid argument to exit       | exit 3.14159            | exit takes only integer args in the range 0 - 255 (see first footnote)                                       |
| 128+n | Fatal error signal "n"         | kill -9 $PPID of script | $? returns 137 (128 + 9)                                                                                     |
| 130   | Script terminated by Control-C | Ctl-C                   | Control-C is fatal error signal 2, (130 = 128 + 2, see above)                                                |
| 255*  | Exit status out of range       | exit -1                 | exit takes only integer args in the range 0 - 255                                                            |
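The two sets of rules above can be combined into a small lookup. The following is a minimal sketch in POSIX shell, not an official mapping: the function name `explain_exit_status` is hypothetical (it is not part of Docker or Mesos), it assumes the argument is an integer, and the 125/126/127 branches assume the task was started through docker run as described above.

```shell
#!/bin/sh
# explain_exit_status: hypothetical helper, not part of Docker or Mesos.
# Assumes $1 is an integer exit status between 0 and 255.
explain_exit_status() {
  code=$1
  case "$code" in
    0)   echo "success" ;;
    125) echo "error with the Docker daemon itself" ;;
    126) echo "contained command cannot be invoked" ;;
    127) echo "contained command not found" ;;
    255) echo "exit status out of range" ;;
    *)
      if [ "$code" -gt 128 ] && [ "$code" -lt 255 ]; then
        # 128+n convention: n is the number of the fatal signal
        echo "terminated by signal $((code - 128))"
      else
        # Anything else is whatever the command itself decided to return
        echo "application-defined exit code $code"
      fi
      ;;
  esac
}

# The three statuses from the question
for status in 1 42 137; do
  printf '%s: %s\n' "$status" "$(explain_exit_status "$status")"
done
```

For 137 this reports signal 9; running `kill -l 9` in the same shell prints the name of that signal.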

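From your examples:

  • 137: 128 + 9 = 137, where 9 is the signal number of SIGKILL https://github.com/moby/moby/issues/21083#issuecomment-239578836. This often means the container ran out of memory https://en.wikipedia.org/wiki/Out_of_memory and was killed.
  • 1: the command exited with 1. This is probably caused by invalid configuration, an internal application error, or invalid input.
  • 42: this is not a reserved code, so its meaning is defined by the command that returned it. Failing that:

    Answer to the Ultimate Question of Life, the Universe, and Everything https://en.wikipedia.org/wiki/Phrases_from_The_Hitchhiker%27s_Guide_to_the_Galaxy#Answer_to_the_Ultimate_Question_of_Life.2C_the_Universe.2C_and_Everything_.2842.29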

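If you need more information to interpret a status code, check the message field https://github.com/apache/mesos/blob/1.2.0/include/mesos/mesos.proto#L1825 of the Mesos TaskStatus update; for example, Mesos puts information about OOM there. The same information can be found in the Mesos logs. To debug why a command returned a non-zero code, inspect the files stored in the executor sandbox, especially stderr/stdout or command-specific logs.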
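As a rough illustration of the last step, the sketch below prints the tail of the stderr and stdout files from a sandbox directory. The function name `show_sandbox_logs` is hypothetical and not a Mesos tool, and the sandbox path is an assumption you have to supply yourself, since its location depends on how your agents are configured.

```shell
#!/bin/sh
# show_sandbox_logs: hypothetical helper, not a Mesos tool.
# $1: path to the executor sandbox directory (assumed to be known to you)
# $2: number of trailing lines to print per file (defaults to 20)
show_sandbox_logs() {
  sandbox=$1
  lines=${2:-20}
  for name in stderr stdout; do
    file="$sandbox/$name"
    if [ -f "$file" ]; then
      echo "== last $lines lines of $name =="
      tail -n "$lines" "$file"
    else
      echo "== $name not found in $sandbox =="
    fi
  done
}
```

It could be called as `show_sandbox_logs /path/to/sandbox 50`, where the path is a placeholder. stderr is printed first because a failing command usually writes its error there.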
