PostgreSQL + deleting an old host from the database fails

We have an HDP cluster with 152 worker machines, worker1.duplex.com through worker152.duplex.com, all running RHEL 7.9.

We are trying to remove the last host, worker152.duplex.com, from the Ambari server, or more precisely from its PostgreSQL database, as follows.

First we need to find the host_id:

select host_id from hosts where host_name='worker152.duplex.com';

The host_id is:

 host_id
---------
      51
(1 row)

Now we delete the records for this host_id (51):

delete from execution_command where task_id in (select task_id from host_role_command where host_id in (51));
delete from host_version where host_id in (51);
delete from host_role_command where host_id in (51);
delete from serviceconfighosts where host_id in (51);
delete from hoststate where host_id in (51);
delete from kerberos_principal_host WHERE host_id='worker152.duplex.com';
delete from hosts where host_name in ('worker152.duplex.com');
delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('worker152.duplex.com'));
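
The statements above run as separate commands, so a failure partway through can leave the host half-removed. A minimal sketch of one way to make the cleanup all-or-nothing is to generate the same statements inside a single transaction and review them before running them. The function name `gen_host_delete_sql` is hypothetical, and the table and column names are taken only from this post, not checked against any particular Ambari schema version. Two deliberate differences from the original list: the `alert_current` cleanup is moved before the `hosts` delete and resolves the host name from the id, and the `kerberos_principal_host` statement uses the numeric id (the original compares `host_id` to the host name), which assumes that column is numeric.

```shell
#!/bin/sh
# Hypothetical helper: print the cleanup statements for one host id as a
# single transaction, so they can be reviewed before being piped to psql.
# Table and column names come from the post; they are not verified here.
gen_host_delete_sql() {
    host_id="$1"
    # Refuse anything that is not a plain non-negative integer.
    case "$host_id" in
        ''|*[!0-9]*)
            echo "usage: gen_host_delete_sql <numeric host_id>" >&2
            return 1
            ;;
    esac
    cat <<EOF
BEGIN;
DELETE FROM execution_command WHERE task_id IN (SELECT task_id FROM host_role_command WHERE host_id = ${host_id});
DELETE FROM host_version WHERE host_id = ${host_id};
DELETE FROM host_role_command WHERE host_id = ${host_id};
DELETE FROM serviceconfighosts WHERE host_id = ${host_id};
DELETE FROM hoststate WHERE host_id = ${host_id};
DELETE FROM kerberos_principal_host WHERE host_id = ${host_id};
DELETE FROM alert_current WHERE history_id IN (SELECT alert_id FROM alert_history WHERE host_name IN (SELECT host_name FROM hosts WHERE host_id = ${host_id}));
DELETE FROM hosts WHERE host_id = ${host_id};
-- Expected to return 0 if the host row is gone.
SELECT count(*) FROM hosts WHERE host_id = ${host_id};
COMMIT;
EOF
}

# Example (not executed here). The database name "ambari" matches the psql
# prompt in the post; the user name is an assumption:
#   gen_host_delete_sql 51 | psql -v ON_ERROR_STOP=1 -U ambari -d ambari
```

With `ON_ERROR_STOP=1`, psql exits at the first failing statement, so the `COMMIT` is never sent and the open transaction is rolled back when the session closes.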

Now we verify that host_id 51, which represents host worker152.duplex.com, no longer exists:

ambari=> select host_name,  public_host_name  from hosts;
        host_name         |     public_host_name
--------------------------+--------------------------
worker1.duplex.com
.
.
.
worker151.duplex.com

As we can see above, host worker152.duplex.com is no longer listed, which is good, and it does appear that worker152.duplex.com has been deleted from the PostgreSQL DB.

Now we restart the Ambari server for the change to take effect (this also restarts the PostgreSQL service):

ambari-server restart
Using python  /usr/bin/python
Restarting ambari-server
Waiting for server stop...
Ambari Server stopped
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start.........................
Server started listening on 8080

DB configs consistency check: no errors and warnings were found.

After the Ambari server started, we were surprised to find that host_id 51, that is, host worker152.duplex.com, still exists, as shown below:

ambari=> select host_name,  public_host_name  from hosts;
        host_name         |     public_host_name
--------------------------+--------------------------
worker1.duplex.com
.
.
.
worker152.duplex.com

We do not understand why this host reappears even though we deleted its record.

We also tried purging the historical data as follows, but that did not help:

ambari-server db-purge-history --cluster-name hadoop7 --from-date 2024-01-01

Using python  /usr/bin/python
Purge database history...
Ambari Server configured for Embedded Postgres. Confirm you have made a backup of the Ambari Server database [y/n]yes
ERROR: The database purge historical data cannot proceed while Ambari Server is running. Please shut down Ambari first.
Ambari Server 'db-purge-history' completed successfully.
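
The ERROR line in the output above says the purge cannot proceed while Ambari Server is running, so despite the final "completed successfully" line this run most likely purged nothing. A minimal sketch of the sequence the error message asks for is below. The function name `purge_history_offline` and the `AMBARI_CMD` override are hypothetical (the override exists only so the sequence can be dry-run with `echo`); the `ambari-server` subcommands and flags are the ones shown in this post plus `stop` and `start`.

```shell
#!/bin/sh
# Hypothetical wrapper: stop Ambari, purge history, then start Ambari again.
# AMBARI_CMD is an illustrative override; it defaults to the real CLI.
purge_history_offline() {
    cluster="$1"
    from_date="$2"
    ambari="${AMBARI_CMD:-ambari-server}"

    # The purge refuses to run while the server is up, so stop it first.
    "$ambari" stop || return 1

    # Same invocation as in the post; it still prompts for backup confirmation.
    "$ambari" db-purge-history --cluster-name "$cluster" --from-date "$from_date"
    status=$?

    # Bring the server back even if the purge failed, then report its status.
    "$ambari" start || return 1
    return "$status"
}

# Dry run that only prints the three subcommands in order:
#   AMBARI_CMD=echo purge_history_offline hadoop7 2024-01-01
```

This only addresses the purge step; whether it affects the reappearing host is not established by anything in this post.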

  1. Why does the host come back after the Ambari server restart?

  2. What is wrong with our deletion procedure?

PostgreSQL version:

postgres=# SHOW server_version;
 server_version
----------------
 9.2.24
(1 row)

Links:

https://www.andruffsolutions.com/removing-old-host-data-from-ambari-server-and-tuning-the-database/

https://community.cloudera.com/t5/Support-Questions/how-to-remove-old-registered-hosts-from-DB/mp/217524/highlight/true
