Sometimes, when a Postgres pod starts in OpenShift, it shows the following error:
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
..... done
server started
=> sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
ERROR: tuple concurrently updated
Answer 1
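To fix this problem:
- Find the name of the postgres pod that is in the crash loop.
- Start an oc debug session with the pod.
- Scale the relevant Postgres deployment down to zero pods.
From the command line of the debug session:
- Run run-postgresql. This is the CMD of the docker image. As part of its startup, the script creates some files that would not otherwise exist in the pod, namely /var/lib/pgsql/openshift-custom-postgresql.conf and /var/lib/pgsql/passwd; without them you cannot run any pg_ctl commands. When you run the command you should see the same error output listed above.
- Run pg_ctl stop -D /var/lib/pgsql/data/userdata to shut Postgres down completely. You should see:
  waiting for server to shut down.... done
  server stopped
- Run pg_ctl start -D /var/lib/pgsql/data/userdata to start Postgres. You should see the following output, and it should then wait indefinitely (with no errors):
  server starting
  sh-4.2$ LOG: redirecting log output to logging collector process
  HINT: Future log output will appear in directory "pg_log".
- Press enter a few times to return to the command prompt.
- Run pg_ctl stop -D /var/lib/pgsql/data/userdata and wait for postgres to stop. This ensures a clean shutdown.
  waiting for server to shut down.... done
  server stopped
- Exit the debug session.
- Scale the deployment back up to 1 pod. Postgres should now start normally.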
After a long struggle I finally found the solution here: https://pathfinder-faq-ocio-pathfinder-prod.pathfinder.gov.bc.ca/DB/PostgresqlCrashLoopTupleError.html Credit to the author: Wade Barnes
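For the first step (finding the pod that is crash looping), a small filter over the pod listing can help. This is only a sketch: the function name crashlooping_pods is made up for illustration, and it assumes the usual column layout of oc get pods --no-headers (NAME, READY, STATUS, ...), where a crash-looping pod shows CrashLoopBackOff in the STATUS column.

```shell
# Hypothetical helper: print the names of pods whose STATUS column
# reads CrashLoopBackOff.
# Expects `oc get pods --no-headers` output on stdin, assuming the
# columns are NAME READY STATUS RESTARTS AGE.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (needs a logged-in oc client, so it is left as a comment):
#   oc get pods --no-headers | crashlooping_pods
```

The filter reads from stdin rather than calling oc itself, so it can be tried against a saved pod listing before pointing it at a live cluster.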
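Answer 2
You may want to create and run the debug pod as the user the pod was originally scheduled with; otherwise you will get permission denied errors when running commands inside the pod.
This is the sequence of steps I performed: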
oc get -o yaml pod <postgresql-pod> | grep runAsUser
runAsUser: 1000650000
oc scale deployment/<postgresql-d> --replicas=0
deployment.apps/<postgresql-d> scaled
oc debug deployment/<postgresql-d> --as-user=1000650000
Starting pod/<postgresql-debug> ...
Pod IP: 10.128.2.75
If you don't see a command prompt, try pressing enter.
sh-4.2$ run-postgresql
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2021-11-17 09:09:46.428 UTC [25] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-11-17 09:09:46.429 UTC [25] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-11-17 09:09:46.445 UTC [25] LOG: redirecting log output to logging collector process
2021-11-17 09:09:46.445 UTC [25] HINT: Future log output will appear in directory "log".
. done
server started
/var/run/postgresql:5432 - accepting connections
=> sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
ERROR: tuple concurrently updated
sh-4.2$ pg_ctl stop -D /var/lib/pgsql/data/userdata
waiting for server to shut down.... done
server stopped
sh-4.2$ pg_ctl start -D /var/lib/pgsql/data/userdata
waiting for server to start....2021-11-17 09:10:19.359 UTC [45] LOG: listening on IPv4 address "0.0.0.0", port 5432
2021-11-17 09:10:19.359 UTC [45] LOG: listening on IPv6 address "::", port 5432
2021-11-17 09:10:19.369 UTC [45] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-11-17 09:10:19.377 UTC [45] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-11-17 09:10:19.558 UTC [45] LOG: redirecting log output to logging collector process
2021-11-17 09:10:19.558 UTC [45] HINT: Future log output will appear in directory "log".
done
server started
sh-4.2$
sh-4.2$
sh-4.2$
sh-4.2$ pg_ctl stop -D /var/lib/pgsql/data/userdata
waiting for server to shut down.... done
server stopped
sh-4.2$ exit
exit
Removing debug pod ...
oc scale deployment/<postgresql-d> --replicas=1
deployment.apps/<postgresql-d> scaled
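The oc commands from the transcript above, run outside the debug session, can be condensed into two small helpers. This is a sketch under assumptions: run_as_user and print_recovery_commands are hypothetical names, the first one simply takes the first runAsUser line it finds in the pod YAML, and the second only prints the commands instead of running them. The pg_ctl steps inside the debug session still have to be done by hand.

```shell
# Hypothetical helper: print the first runAsUser value found in pod YAML
# read from stdin (the same line the grep in the transcript shows).
run_as_user() {
  awk '$1 == "runAsUser:" { print $2; exit }'
}

# Hypothetical helper: print, without executing, the oc commands used in
# the transcript for a given deployment name and user id.
print_recovery_commands() {
  dep_name=$1
  run_uid=$2
  printf 'oc scale deployment/%s --replicas=0\n' "$dep_name"
  printf 'oc debug deployment/%s --as-user=%s\n' "$dep_name" "$run_uid"
  printf 'oc scale deployment/%s --replicas=1\n' "$dep_name"
}

# Usage (needs a logged-in oc client, so it is left as a comment):
#   uid=$(oc get -o yaml pod <postgresql-pod> | run_as_user)
#   print_recovery_commands <postgresql-d> "$uid"
```

Printing the commands rather than executing them keeps the scale-down and scale-up under manual control, which matters here because the debug session in between is interactive.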