Cassandra Bulk Loader never exits on a NullPointerException, so my calling shell script never sees the failure


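I wrote a script that loops over files and processes each one by calling an external process. I check the exit code of that call to decide whether the file should be moved (on success). The problem is that when the process fails with an exception, it never exits. How can I detect that an exception occurred so that the script can move on to the next file?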

Relevant part of the script

# Stream data
sstableloader -d $3 $tablepathfull

# On success, move data to target dir
if [[ $? != 0 ]]; then
    echo "Error: Table failed - $tablepathfull"
else
    echo "Table OK - $tablepathfull"
    trgtdir="$2/$hostname/$keyspacename/$typename/$timestamp/$keyspacename/$tablename"
    mkdir -p $trgtdir
    mv $tablepathfull/* $trgtdir
    rmdir $tablepathfull
fi

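If there is no "official" way, is it possible to capture the output of the process call (see below) and simply kill the process when an exception occurs?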

Exception output

Exception in thread "STREAM-OUT-/XX.XX.XXX.88" Exception in thread "STREAM-OUT-/XX.XX.XXX.92" java.lang.NullPointerException
    at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375)
    at java.lang.Thread.run(Thread.java:744)
java.lang.NullPointerException
    at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375)
    at java.lang.Thread.run(Thread.java:744)

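Answer 1

The only workaround I can think of is to run the loader in a background subprocess and watch its output through a file: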

TEMP_FILE='/tmp/some_file.txt'

# Start from an empty log so output from an earlier run cannot cause a false match
: > "${TEMP_FILE}"

function load_table() {
  if [ $# -lt 2 ]; then
    printf "1" > "${TEMP_FILE}"
    return 1
  fi

  local param1="$1"
  local table_full_path="$2"
  local exit_code

  # Stream data; Java stack traces go to stderr, so capture both streams
  sstableloader -d "${param1}" "${table_full_path}" >> "${TEMP_FILE}" 2>&1
  exit_code=$?

  # Record the exit code as the last line of the file
  printf "\n%s" "${exit_code}" >> "${TEMP_FILE}"
}

function is_process_running() {
  if [ $# -eq 0 ]; then
    return 1
  fi
  local process_id="$1"

  # Signal 0 only checks that the process exists; nothing is delivered
  kill -0 "${process_id}" 2>/dev/null
}

function exception_found() {
  # Succeeds (returns 0) if any of the last 10 lines mentions an exception
  tail -10 "${TEMP_FILE}" | grep -q "Exception"
}

# Enable job control so the background job gets its own process group;
# otherwise the group kill below would also hit this script
set -m
load_table "$3" "${tablepathfull}" &
load_table_job_pid=$!
set +m

while is_process_running "${load_table_job_pid}" && ! exception_found; do
  sleep 5
done

exit_code=0
if is_process_running "${load_table_job_pid}"; then
  # Still running after an exception was logged: kill the whole process group
  load_table_job_gid=$(ps -o pgid= -p "${load_table_job_pid}" | tr -d ' ')
  kill -TERM -- "-${load_table_job_gid}" >/dev/null 2>&1
  exit_code=1
else
  exit_code=$(tail -1 "${TEMP_FILE}")
fi

rm -f "${TEMP_FILE}"

# Your code
# On success, move data to target dir
if [ "${exit_code}" -ne 0 ]; then
    echo "Error: Table failed - $tablepathfull"
else
    echo "Table OK - $tablepathfull"
    trgtdir="$2/$hostname/$keyspacename/$typename/$timestamp/$keyspacename/$tablename"
    mkdir -p "$trgtdir"
    mv "$tablepathfull"/* "$trgtdir"
    rmdir "$tablepathfull"
fi

You can improve this code further, for example by adding a retry count.
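As a rough illustration of the retry idea, here is a minimal sketch of a generic retry wrapper in bash. The name `run_with_retries` and its arguments are hypothetical and not part of sstableloader or of the script above. It assumes the load-and-watch logic has been wrapped in a command or function that returns 0 on success and non-zero on failure.

```shell
#!/usr/bin/env bash

# run_with_retries MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it returns 0 or until
# MAX_ATTEMPTS attempts have been made, sleeping between attempts.
run_with_retries() {
  local max_attempts="$1"
  local delay_seconds="$2"
  shift 2

  local attempt=1
  while true; do
    # Run the command with its arguments; stop as soon as it succeeds
    if "$@"; then
      return 0
    fi

    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "Giving up after ${attempt} attempt(s): $*" >&2
      return 1
    fi

    echo "Attempt ${attempt} failed, retrying in ${delay_seconds}s: $*" >&2
    attempt=$((attempt + 1))
    sleep "${delay_seconds}"
  done
}
```

It could then be called as `run_with_retries 3 30 load_and_watch "$3" "${tablepathfull}"`, where `load_and_watch` is an assumed function of your own that starts the loader, polls the log, kills the process if needed, and returns the resulting exit code. Whether retrying is appropriate depends on why the stream failed, so the attempt count and delay are placeholders to tune.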
