Starting subshells through variable assignment and waiting for them

2023-12-21

How can I start several subshells through variable assignment and wait for all of them to finish?

#!/bin/bash

#some code about $FILE="$1"

cat "$FILE" | while read -r HOST || [[ -n $HOST ]];
do
    echo "$HOST";
    URL="http://$HOST";  QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P1=$!
    URL="https://$HOST"; QUEST2=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P2=$!
    
    echo "$P1 $P2"
    wait $P1 $P2
    R1=$( echo "$QUEST1" | grep -o " 200" );
    R2=$( echo "$QUEST2" | grep -o " 200" );
    echo "$R1 $R2"
    
    if [[ "$R1" || "$R2" ]]; then
    echo "FOUND!";
    fi

done

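This doesn't work. echo "$P1 $P2" is empty, because I'm in a subshell. I want both requests to start at the same time, so that the second does not have to wait for the first to finish.

OK, this is a basic question, but I'd like to understand how to apply it to other situations. Please, I don't want external files.

EDIT, for those who don't understand: I want to run the commands behind $QUEST1 and $QUEST2 in the background to save time, then wait for them to finish, without using extra files. I have read a lot, but nothing solved it. Thanks.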

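Here is a summary of the comments:

Assigning a subshell's output to a variable, or otherwise using a subshell's output (STDOUT), means that the parent shell waits for all of the subshell's child processes to end, even if some commands inside it are backgrounded.

For example: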

x=$( { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } & ); \
echo "Wrong child PID : $!"

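This blocks the parent shell for ten seconds. But here you get the parent shell's $!, not the one defined in the subshell. To get the expected $!, you have to transfer it somehow to your parent shell (through STDOUT, STDERR, a file, a named pipe, etc.). You can do this through STDOUT, for example: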

subpid=$( { { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } 1>&2 & } ; \
echo $!)

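Here, because your subshell sends its command output to STDERR and only prints the child PID $! on STDOUT, the assignment completes almost immediately (the parent shell does not block on I/O).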
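To see the difference between the two commands above, here is a minimal timing sketch. It assumes bash (for SECONDS), uses a 2-second sleep as a stand-in for the real command, and sends the job's output to /dev/null instead of STDERR, which is an arbitrary choice made only to keep the demonstration quiet.

```shell
#!/bin/bash
# Case 1: the background job keeps the substitution's STDOUT pipe open,
# so the assignment only returns once the job has finished.
SECONDS=0
blocked=$( { sleep 2; echo "done"; } & )
blocked_time=$SECONDS

# Case 2: the job's output goes elsewhere (/dev/null here, an assumption),
# so only the PID travels through the pipe and the assignment returns at once.
SECONDS=0
pid=$( { { sleep 2; echo "done"; } >/dev/null & } ; echo $! )
nonblocked_time=$SECONDS

echo "blocking:     got '${blocked}' after ${blocked_time}s"
echo "non-blocking: got PID ${pid} after ${nonblocked_time}s"
```

In the second case the output of the job is lost, which is why the PID alone is not enough when you also need the result.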

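Since you want to avoid I/O as much as possible, and if you only need the subshell's $! in order to wait for the child process, you can rely on the fact that the parent shell waits for all STDOUT output coming from the subshell. Your original command is then enough, with no need to know the subshell's $!: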

URL="http://$HOST";  QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" \
| head -1);

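However, if you need to know the subshell's child PID (note that this PID will here be that of a shell process, not that of the curl or head command) and also wait for the subshell commands to finish, you can do something like the following to get a near-deterministic order (this will not work if your subcommand does not contain at least one pipe):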

x=$( { spid=$( { { { /bin/sleep 10;echo out1;echo out2; }|head -1;} 1>&2 & };echo $!);} \
2>&1 ; echo "SUBPID=$spid" )

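This yields in x, after ten seconds: SUBPID=<subshell child pid> out1.

By that time, this SUBPID will no longer exist (or will no longer be "your" subshell child PID), but you can log it or do whatever you want with it.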
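As a small illustration of how the captured value can be split afterwards, here is a hedged sketch. The 1-second sleep and the single echo are stand-ins for the real pipeline, and the names first_line, payload and subpid are introduced only for this example.

```shell
#!/bin/bash
# Stand-in for the curl pipeline: one second of work, then one line of output.
x=$( { spid=$( { { { sleep 1; echo out1; } | head -1; } 1>&2 & } ; echo $! ); } 2>&1 ; echo "SUBPID=$spid" )

# The first line carries the PID marker, the remaining lines the command output.
first_line=$(printf '%s\n' "$x" | head -n 1)
payload=$(printf '%s\n' "$x" | tail -n +2)
subpid=${first_line#SUBPID=}

echo "subshell PID was: ${subpid}"
echo "command output:   ${payload}"
```

The marker line arrives first only because the command takes longer than the echo of the PID, which is what "near-deterministic" means here.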

Your command would look like:

URL="http://$HOST";  QUEST1=$( \
{ subpid=$( { { curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1; \
 } 1>&2 & } ;echo $!); } 2>&1 ; echo "SUBPID=$subpid" );

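The first entry in QUEST1 should be SUBPID=, followed by the first line of curl's output.

To make it clear that the shell waits, you can test it against google.com with a 10-second sleep inside: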

URL="http://www.google.com";  QUEST1=$( { subpid=$( \
{ { { curl -Is --connect-timeout 200 --max-time 200 "$URL"; sleep 10; } | head -1; \
 } 1>&2 & };echo $!); } 2>&1 ; echo "SUBPID=$subpid" );

Update

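After our exchange, I understand that you are looking for asynchronous, waitable child processes in subshells whose output you need to collect on completion, all without using temporary files or named pipes.

There is a solution that needs no temporary file and no disk write I/O. It is based on @htamas's solution for creating an anonymous FIFO (https://superuser.com/questions/184307/bash-create-anonymous-fifo), which uses an anonymous pipe rather than a named pipe.

First, here is a simple example of that solution, followed by an implementation for your use case (many curl calls through subshells).

Solution example: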

#!/bin/bash
# We use the bright solution from @htamas to create an anonymous pipe
# in the fds of our current shell.
# see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
#
#
# 1. Creating the anonymous pipe
#

# start a background pipeline with two processes running forever
tail -f /dev/null | tail -f /dev/null &
# save the process ids
PID2=$!
PID1=$(jobs -p %+)
# hijack the pipe's file descriptors using procfs
exec 3>/proc/"${PID1}"/fd/1 4</proc/"${PID2}"/fd/0
# kill the background processes we no longer need
# (using disown suppresses the 'Terminated' message)
disown $PID2
kill "${PID1}" "${PID2}"
# anything we write to fd 3 can be read back from fd 4

#
# 2. Launching an "asynchronous subshell" and getting its output
#

# We set a flag to trap the async subshell termination through SIGHUP
ready=0;
trap "ready=1" SIGHUP;

# We launch our subshell for the subprocess "sleep 10" with its output
# connected to the standalone anonymous pipe.
# As the sleep command has no output, we add "starting" and "finish".
# Note that as we send the output elsewhere than STDOUT, it's non blocking
# Note also that we send SIGHUP to our parent shell ($$) when the command finishes.
x=$( { echo "starting"; sleep 10; echo "finish"; echo "EOF"; kill -SIGHUP $$; } >&3 & )

# We now wait for our subshell to terminate; it will do so once the sleep command ends.
# While waiting, we can do stuff. Here we just display "waiting for subshell.." every second.
while [ "${ready}" = "0" ]; do
   echo "waiting for subshell..";
   sleep 1;
done;

# We close fd 3 early as we expect no more output from the subshell
exec 3>&-

# We recover our subshell output in y from the read end of the anonymous pipe
line=""
y=$( while [ "${line}" != "EOF" ] ; do 
      read -r -u 4 line; 
      [ "${line}" != "EOF" ] && echo "${line}"; 
     done );

# And display the output of the subshell
echo "Subshell terminate, its output : ";
echo "${y}"

# close the file descriptors when we are finished (optional)
exec 4<&-

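This solution requires the /proc filesystem, which is available on many current UNIX-like systems. The explanations are given as comments in the script.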
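For comparison, here is a hedged sketch of a different technique that does not rely on /proc: bash process substitution attached to dedicated file descriptors. This is not the anonymous-pipe method above, only an alternative under the same "no temporary file" constraint. The check_host function, its status argument and the example.invalid URLs are hypothetical stand-ins for the curl calls, chosen so the sketch runs without network access.

```shell
#!/bin/bash
# check_host is a hypothetical stand-in for "curl -Is ... | head -1":
# it sleeps to simulate latency, then prints a status line.
check_host() {
  sleep 2
  echo "HTTP/1.1 $2 status-for-$1"
}

SECONDS=0
# Both process substitutions start immediately and run concurrently.
exec 3< <(check_host "http://example.invalid" 200)
exec 4< <(check_host "https://example.invalid" 301)

# Reading blocks until each job has produced its line.
IFS= read -r -u 3 QUEST1
IFS= read -r -u 4 QUEST2
exec 3<&- 4<&-
elapsed=$SECONDS

if [[ $QUEST1 == *" 200"* || $QUEST2 == *" 200"* ]]; then
  result="FOUND!"
else
  result="not found"
fi
echo "${QUEST1} | ${QUEST2} | ${result} (${elapsed}s)"
```

With real requests, the body of check_host would presumably be the curl pipeline from the question. The two jobs overlap, so the total wait is roughly that of the slower one rather than the sum of both.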

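Small edit: better identification of the subshell processes, more progress information while waiting, and handling of potential subshell crashes.

Implementation for your use case: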

#!/bin/bash
#
# Create the anonymous pipe.
# 
# Parameters: None.
# Returns:
#   0 : Success.
#   1 : Failed to launch tails.
#   2 : Failed to exec.
#   3 : Failed to kill tails process.
function CreateAnonymousPipe() {
  # We use the bright solution from @htamas to create an anonymous pipe
  # in the fds of our current shell.
  # see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
  #
  local pid1
  local pid2
  # start a background pipeline with two processes running forever
  tail -f /dev/null | tail -f /dev/null &
  [ $? != 0 ] && return 1;
  # save the process ids
  pid2=$!
  pid1=$(jobs -p %+)
  # hijack the pipe's file descriptors using procfs
  exec 3>/proc/"${pid1}"/fd/1 4</proc/"${pid2}"/fd/0
  [ $? != 0 ] && return 2;
  # kill the background processes we no longer need
  # (using disown suppresses the 'Terminated' message)
  disown "${pid2}"
  kill "${pid1}" "${pid2}"
  [ $? != 0 ] && return 3;

  # anything we write to fd 3 can be read back from fd 4
  return 0;
}
#
# Launch a curl process asynchronously in a subshell.
# 
# Parameters: { URL } { indice }
#   URL : URL for the curl call.
#   indice : numeric identifier for this call
# Returns:
#   0 : Success.
#   1 : Missing parameters
#   2 : Failed to launch curl subprocess.
#   3 : Failed to access /proc
# STDOUT: PID of the corresponding subshell if success.
function CallCurl() {
  if [ $# != 2 ] ; then
    echo "CallCurl: URL and indice parameter are mandatory." 1>&2
    echo "          CallCurl { URL } { indice }." 1>&2
    return 1;
  fi
  [ ! -d /proc ] && return 3;
  local url="$1"
  local indice="$2"
  local subshell_PID
  # We launch our subshell for the curl subprocess with its output
  # connected to the standalone anonymous pipe.
  # The curl process output is prefixed with its indice in the URL arrays.
  # Note that the subshell first renames itself with a specific identifier,
  # curl_<indice>, and that we escape $BASHPID to use its pid for that :
  #   1) We can't use $$ to get the subshell PID: $$ always expands to the PID
  #      of the main shell, even inside a subshell, whereas $BASHPID expands
  #      to the PID of the current bash process.
  #   2) We don't rename after subshell launch using $! as its PID: by that time
  #      the subshell could have already terminated, and it's possible that
  #      another process has since been launched with this PID.
  # Note that we send its output elsewhere than STDOUT (to >&3), so it's non blocking.
  # Note also that we send USR1 signal to our parent shell ($$) when the command finishes.
  subshell_PID=$( { { local my_pid; 
                      eval my_pid="\${BASHPID}";
                      printf 'curl_%s' "${indice}">/proc/"${my_pid}"/comm 2>/dev/null;
                      curl -Is --connect-timeout 200 --max-time 200 "${url}" | head -1 |
                      { read -r line; echo "${indice}: ${line}"; };
                      kill -USR1 $$; 
                    } >&3 & 
                  } ; 
                  echo $!; )
  [ $? != 0 ] && return 2;
  echo "${subshell_PID}"
  return 0;
}
#
# Main URL processor, launches the curl subprocesses asynchronously.
# 
# Parameters: { URL ... }
#   URL : URL to call with curl.
# Returns:
#   0 : Success.
#   1 : URL parameter(s) missing
#   2 : Failed to launch curl subprocess.
#   3 : Failed to create anonymous pipe.
# STDOUT: Processing and the outputs of the curl commands
function CurlProcessor() {
  if [ $# = 0 ] ; then
    echo "CurlProcessor: URL parameter is mandatory."  1>&2
    echo "               CurlProcessor { URL ... }." 1>&2
    return 1;
  fi
  local indice=0
  local isalive=0
  local -a URLarray
  # Feed the URL array
  while [ $# -gt 0 ] ; do URLarray+=("$1"); shift; done
  # Initialize a set of flags for each URL
  local -a ready
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do ready+=(0); done
  # Initialize an array of subshell PID for each URL to monitor
  local -a pid
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do pid+=(0); done
  # Initialize an array of subshell output for each URL 
  declare -a output
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do output+=(""); done
  # We create the anonymous pipe
  CreateAnonymousPipe
  [ $? != 0 ] && return 3;
  # Set a trap to catch USR1 and check which subshell are still alive through /proc
  # Local handler for the signals
  function trap_handler() {
    for indice in "${!pid[@]}" ; do
      if [ "${pid[${indice}]}" != "0" ] ; then 
        isalive="$(cat /proc/"${pid[${indice}]}"/comm 2>/dev/null)" 2>/dev/null; 
        [ "${isalive}" != "curl_${indice}" ] && ready[${indice}]=1;
      fi
    done
  }
  trap trap_handler USR1 2>/dev/null;
  # Now launch all the subshell
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do
    pid[${indice}]=$(CallCurl "${URLarray[${indice}]}" "${indice}"); 
    [ $? != 0 ] && return 2;
  done
  # We now wait for our subshells to terminate.
  # While waiting, we can do stuff. Here we just display a progress line every second.
  local all_finished=0
  local num_finished=0
  local last_num_finished=0
  local direct_check_timer=0
  while [ "${all_finished}" = "0" ]; do
     # We check each URL subshell flag and keep looping while at least one is unfinished.
     all_finished=1
     num_finished=0
     for ((indice=0; indice < ${#ready[@]}; indice++)) ; do 
       if [ ${ready[${indice}]} = 0 ] ; then
         all_finished=0; 
       else
         ((num_finished++));
       fi
     done
     echo "waiting for subshells.. ${num_finished}/${#ready[@]} finished.";
     sleep 1;
     # In case one or more subshells have crashed and thus won't send the USR1 signal,
     # we call the handler here to check the state of the subshells directly
     # whenever 5 seconds pass without any new subshell termination.
     if [ "${all_finished}" = "0" ] ; then
       if [ "${last_num_finished}" = "${num_finished}" ] ; then
         ((direct_check_timer++))
         if [ "${direct_check_timer}" = "5" ] ; then
             echo "More than 5 seconds with no progress, doing a direct check."
             direct_check_timer=0 
             trap_handler
         fi
       else
         direct_check_timer=0 
       fi
     fi
     last_num_finished="${num_finished}"
  done;
  # All subshells have finished, we send EOF into the anonymous pipe
  echo "EOF" >&3
  # We close fd 3 early
  exec 3>&-
  # We recover our subshells' outputs from the read end of the anonymous pipe
  local line=""
  local control=""
  while [ "${line}" != "EOF" ] ; do 
    read -r -u 4 line; 
    if [ "${line}" != "EOF" ] ; then
      # Each line should have "indice: " as a prefix to identify the URL associated
      indice="${line/: */}"
      if [ "${indice}" ] ; then
        control="${indice/[0-9]*/}"
        if [ "${control}" = "" ] ; then
          if [ "${output[${indice}]}" != "" ] ; then
            # Append with a real newline; strip only the leading "indice: " prefix
            output[${indice}]+=$'\n'"${line#*: }"
          else
            output[${indice}]="${line#*: }"
          fi
        fi
      fi
    fi
  done
  # close the file descriptors when we are finished (optional)
  exec 4<&-
  # And display the output of the subshells
  echo "Subshells have all terminated, the output : ";
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do 
    echo "Output from URL ${URLarray[${indice}]} :"
    echo "${output[${indice}]}"
  done
  return 0;
}
#
# An example call of CurlProcessor
#
CurlProcessor "http://www.google.com" "http://stackoverflow.com/" "http://en.cppreference.com/"

With the example call, you get the following output:



waiting for subshells.. 0/3 finished.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output :
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 301 Moved Permanently
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found

When Fastly was down (https://downdetector.com/status/fastly/news/392592-problems-at-fastly/), you would get:



waiting for subshells.. 0/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output :
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 503 Backend unavailable, connection timeout
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found

(The best time to test the script ^^.)
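The core idea of the implementation, tagging each line with its indice and sorting the lines back into an array, can also be sketched in a few lines. This is a simplified illustration, not a replacement for the script above: it swaps the /proc-based pipe and the signal handling for a single process substitution plus wait, and fake_probe and the .invalid URLs are hypothetical stand-ins for the curl calls.

```shell
#!/bin/bash
# fake_probe is a hypothetical stand-in for "curl -Is ... | head -1".
fake_probe() { sleep 1; echo "HTTP/1.1 200 OK from $1"; }

urls=("http://a.invalid" "http://b.invalid" "http://c.invalid")
declare -a output

while IFS= read -r line; do
  idx=${line%%: *}                      # text before the first ": "
  [[ $idx =~ ^[0-9]+$ ]] || continue    # ignore lines without a numeric tag
  output[$idx]=${line#*: }              # text after the first ": "
done < <(
  # All jobs share one pipe; each writes a single tagged line.
  for i in "${!urls[@]}"; do
    { echo "$i: $(fake_probe "${urls[i]}")"; } &
  done
  wait
)

for i in "${!urls[@]}"; do
  echo "${urls[i]} -> ${output[i]}"
done
```

Each job emits its whole line in one short write, so lines from different jobs should not interleave. The order of arrival does not matter, because the indice decides where each result is stored.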

