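I have a Postgres database running in my cluster, based on the bitnami/postgres Helm chart.
However, the pod is only 18 days old and has already restarted 39 times in that period.
Checking the logs of the previous container, I see the following: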
postgresql 09:06:19.75
postgresql 09:06:19.84 Welcome to the Bitnami postgresql container
postgresql 09:06:19.86 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
postgresql 09:06:19.88 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
postgresql 09:06:19.90
postgresql 09:06:20.30 INFO ==> ** Starting PostgreSQL setup **
postgresql 09:06:20.36 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql 09:06:20.38 INFO ==> Loading custom pre-init scripts...
postgresql 09:06:20.49 INFO ==> Initializing PostgreSQL database...
postgresql 09:06:30.04 INFO ==> Cleaning stale /bitnami/postgresql/data/postmaster.pid file
postgresql 09:06:30.17 INFO ==> pg_hba.conf file not detected. Generating it...
postgresql 09:06:30.18 INFO ==> Generating local authentication configuration
postgresql 09:06:30.28 INFO ==> Deploying PostgreSQL with persisted data...
postgresql 09:06:30.48 INFO ==> Configuring replication parameters
postgresql 09:06:30.69 INFO ==> Configuring fsync
postgresql 09:06:30.94 INFO ==> Loading custom scripts...
postgresql 09:06:30.96 INFO ==> Enabling remote connections
postgresql 09:06:31.06 INFO ==> ** PostgreSQL setup finished! **
postgresql 09:06:31.18 INFO ==> ** Starting PostgreSQL **
2021-03-27 09:06:31.811 GMT [1] LOG: pgaudit extension initialized
2021-03-27 09:06:31.813 GMT [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2021-03-27 09:06:31.813 GMT [1] LOG: listening on IPv6 address "::", port 5432
2021-03-27 09:06:32.015 GMT [1] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-03-27 09:06:32.285 GMT [97] LOG: database system was interrupted; last known up at 2021-03-26 19:29:14 GMT
2021-03-27 09:06:33.428 GMT [103] FATAL: the database system is starting up
2021-03-27 09:06:34.350 GMT [97] LOG: database system was not properly shut down; automatic recovery in progress
2021-03-27 09:06:34.433 GMT [97] LOG: redo starts at 0/32A7EE0
2021-03-27 09:06:34.433 GMT [97] LOG: invalid record length at 0/32A7FC0: wanted 24, got 0
2021-03-27 09:06:34.434 GMT [97] LOG: redo done at 0/32A7F88
2021-03-27 09:06:34.782 GMT [1] LOG: database system is ready to accept connections
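As far as I can tell from the logs, there is no specific error. In fact, this log is essentially identical to the log of the current container. It does say that the database system was interrupted, but I'm not sure why.
Inspecting the pod shows no events that explain why it restarted.
So why is the pod restarting so often?
EDIT
Describing the pod shows exit code 255.
The liveness and readiness probes are the Helm chart defaults, as shown below: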
Liveness: exec [/bin/sh -c exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [/bin/sh -c -e exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432
[ -f /opt/bitnami/postgresql/tmp/.initialized ] || [ -f /bitnami/postgresql/.initialized ] ] delay=5s timeout=5s period=10s #success=1 #failure=6
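As a rough sanity check on whether the liveness probe could be killing the container during startup, here is a small Python sketch that compares the startup time seen in the log with the liveness settings above. The formula is only my approximation of how the kubelet counts consecutive failures (it ignores jitter and the probe timeout), and the function names are made up for illustration.

```python
from datetime import datetime

# Liveness settings copied from the probe description above.
INITIAL_DELAY_S = 30
PERIOD_S = 10
FAILURE_THRESHOLD = 6


def earliest_liveness_restart(delay, period, failures):
    """Approximate seconds after container start at which the last of
    `failures` consecutive failed probes would run.
    Assumption: ignores kubelet jitter and the per-probe timeout."""
    return delay + (failures - 1) * period


def seconds_between(start, end, fmt="%H:%M:%S.%f"):
    """Elapsed seconds between two timestamps on the same day."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# First Bitnami log line and the "ready to accept connections" line.
startup_s = seconds_between("09:06:19.75", "09:06:34.782")
restart_s = earliest_liveness_restart(INITIAL_DELAY_S, PERIOD_S, FAILURE_THRESHOLD)

print(f"startup took about {startup_s:.1f}s")
print(f"liveness restart no earlier than about {restart_s}s after start")
```

If this estimate is roughly right, the database was accepting connections well before the first liveness probe would even have run, so a slow startup alone does not seem to explain the restarts.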