我正在恢复大小为 1200GB 的 Oracle 数据文件。数据文件从大小为 4000GB (4TB) 的备份片段中读取。
注意:据我监控,总共处理了 4TB 以读取我的数据文件的所有块。
恢复过程中我遇到了几个问题:
情况 1. 有一段时间,2 小时后恢复失败(此时 Netbackup 数据库控制台中的操作列为空)
情况 2. 驱动器在 2 小时前安装,恢复过程在 2 小时前未失败。但在处理 55GB 后突然恢复失败。
我没有旧日志。现在我遇到以下问题:
- 处理 3TB 后恢复失败(已过去 23 小时)...我将发布一些日志条目,请帮忙:
---------------详细信息选项卡显示
2012 年 8 月 23 日下午 12:56:40 - 已恢复映像 r2n1_1345046880 -(文件读取失败(13));恢复时间 23:36:17 2012 年 8 月 23 日下午 12:56:44 - 结束恢复;已用时间:23:36:26 恢复无法恢复请求的文件(5)
------------------------------bpbrm 日志显示
12:56:35.561 [7344.4804] <2> bpbrm brm_child_done: child 4852 exited with status 13: file read failed
12:56:35.561 [7344.4804] <2> bpbrm send_status_to_parent: bpbrm child is done, but the media manager child is not.
12:56:35.561 [7344.4804] <2> bpbrm tell_mm: sending media manager msg: STOP RESTORE r2n1_1345046880
12:56:40.709 [7344.4804] <2> bpbrm read_media_msg: read from media manager: EXIT r2n1_1345046880 150
12:56:40.709 [7344.4804] <2> bpbrm process_media_msg: media manager for backup id r2n1_1345046880 exited with status 150: termination requested by administrator
12:56:40.709 [7344.4804] <2> bpbrm send_status_to_parent: EXIT r2n1_1345046880 13 sent to parent process.
12:56:40.709 [7344.4804] <2> bpbrm read_media_msg: read from media manager: MEDIA NOT READY
12:56:40.771 [7344.4804] <2> bpbrm read_parent_msg: read from parent TERMINATE
12:56:40.771 [7344.4804] <2> bpbrm tell_mm: sending media manager msg: TERMINATE
--------------bprd
12:56:40.725 [3292.7188] <2> read_bpbrm_msg: (pid 7344) EXIT r2n1_1345046880 13
12:56:40.725 [3292.7188] <2> handle_image_status: r2n1_1345046880 restfiles pid 6696 bpbrm pid 7344 status = 13
12:56:40.725 [3292.7188] <2> remove_completed_restore: unlink record with restfiles pid 6696
12:56:40.725 [3292.7188] <2> remove_completed_restore: no more active restores
12:56:40.725 [3292.7188] <2> done_with_bpbrm: bpbrm pid 7344 has no more images to process
12:56:40.725 [3292.7188] <2> send_bpbrm_msg: (pid 7344) TERMINATE
12:56:40.725 [3292.7188] <2> start_indep_grp: active_mpx_groups = 0
12:56:40.725 [3292.7188] <2> start_indep_grp: p_active_restore_for_first_grp = NULL
12:56:40.725 [3292.7188] <2> start_indep_grp: p_first_mpx_group_for_restore = NULL
12:56:40.725 [3292.7188] <2> next_image_using_first_blks: returning NULL
12:56:40.725 [3292.7188] <2> run_restores: brm pid 0 has no active restores and none to start
12:56:40.725 [3292.7188] <2> start_indep_grp: active_mpx_groups = 0
12:56:40.725 [3292.7188] <2> start_indep_grp: p_active_restore_for_first_grp = NULL
12:56:40.725 [3292.7188] <2> start_indep_grp: p_first_mpx_group_for_restore = NULL
12:56:40.725 [6696.7284] <2> restorefiles: Done with port 0 backup_id r2n1_1345046880. Status = 13
12:56:40.725 [6696.7284] <2> add_msgs_to_progress_file: close and re-open the progress log file
12:56:40.725 [6696.7284] <2> log_disconnect: more than 30 mins since the last progress msg, closing sockets
12:56:40.725 [6696.7284] <2> put_short: (10) network write() error: An operation was attempted on something that is not a socket. ; socket = -1
12:56:40.725 [6696.7284] <2> bpcr_disconnect_rqst: bpcr protocol error - couldn't send request type
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't disconnect from bpcd
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_CMD_SOCK: An operation was attempted on something that is not a socket.
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_STAT_SOCK: An operation was attempted on something that is not a socket.
12:56:40.756 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:40.756 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:40.803 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52068 TO 192.168.34.150.13724
12:56:40.803 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52069 TO 192.168.34.150.13724 fd = 808
12:56:40.834 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:40.896 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32625345712215021290000000000-ERMDYb
12:56:41.754 [6696.7284] <2> get_short: (2) premature end of file (byte 1)
12:56:41.754 [6696.7284] <2> readclientstatus: bpcr short int read failed
12:56:41.754 [6696.7284] <2> readclientstatus: Access is denied.
12:56:41.754 [6696.7284] <2> log_disconnect: Couldn't close progress file /usr/openv/netbackup/logs/user_ops/dbext/logs/17121.0.1345627225 on orcl.justice.gov.ge
12:56:41.754 [6696.7284] <2> log_disconnect: Client close errno = 1
12:56:41.754 [6696.7284] <2> put_short: (10) network write() error: An existing connection was forcibly closed by the remote host. ; socket = 780
12:56:41.754 [6696.7284] <2> bpcr_disconnect_rqst: bpcr protocol error - couldn't send request type
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't disconnect from bpcd
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_CMD_SOCK: An operation was attempted on something that is not a socket.
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_STAT_SOCK: An operation was attempted on something that is not a socket.
12:56:41.754 [6696.7284] <2> restorefiles: Received finished from MPX main bprd. Stat = 0
12:56:41.754 [6696.7284] <2> mail_msg_and_set_exit_status: entered; status = 0
12:56:41.754 [6696.7284] <2> mail_msg_and_set_exit_status: Attempting to send mail to oracle on orcl.justice.gov.ge
12:56:41.770 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:41.770 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:41.817 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52070 TO 192.168.34.150.13724
12:56:41.817 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52071 TO 192.168.34.150.13724 fd = 780
12:56:41.863 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:41.879 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32628345712216006586000000000-fgV51f
12:56:42.285 [6696.7284] <2> mail_msg_and_set_exit_status: CLIENT_CMD_SOCK from bpcr = 808
12:56:42.285 [6696.7284] <2> mail_msg_and_set_exit_status: CLIENT_STAT_SOCK from bpcr = 780
12:56:42.690 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:42.909 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:43.330 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:43.767 [6696.7284] <2> mail_msg_and_set_exit_status: RESTORE EXIT STATUS = 5
12:56:43.782 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:43.782 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:43.829 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52092 TO 192.168.34.150.13724
12:56:43.829 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52093 TO 192.168.34.150.13724 fd = 808
12:56:43.860 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:43.907 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32635345712218044217000000000-EOpovo
12:56:44.750 [6696.7284] <2> job_monitoring_exex: ACK disconnect
12:56:44.750 [6696.7284] <2> job_disconnect: Disconnected
12:56:44.750 [6696.7284] <2> get_short: (2) premature end of file (byte 1)
12:56:44.750 [6696.7284] <2> readclientstatus: bpcr short int read failed
12:56:44.750 [6696.7284] <2> readclientstatus: Access is denied.
12:56:44.750 [6696.7284] <2> log_disconnect: Couldn't close progress file /usr/openv/netbackup/logs/user_ops/dbext/logs/17121.0.1345627225 on orcl.justice.gov.ge
12:56:44.750 [6696.7284] <2> log_disconnect: Client close errno = 1
12:56:44.750 [6696.7284] <2> bpcr_disconnect_rqst: bpcr received the following message:
12:56:44.750 [6696.7284] <2> terminate_restore: EXIT STATUS 5
12:56:44.750 [6696.7284] <2> bprd: C:\Program Files\Veritas\NetBackup\bin\userreq_notify.cmd restore orcl.justice.gov.ge oracle
----------------------RMAN
failure of IRESTORE command on ch1_tape channel at 08/23/2012 12:57:17
RMAN-10038: database session for channel ch1_tape terminated unexpectedly
请注意,恢复并未被手动停止。
答案1
RMAN-10038 通常意味着负责处理恢复的 Oracle 进程(与 Netbackup 的库链接)中止,可能出现核心转储。检查警报*.log 并在 Oracle 诊断目录中查找 *.trc 文件。
这很可能是一个已知问题http://www.symantec.com/business/support/index?page=content&id=TECH77071
可在以下链接获取用于为 NetBackup 6.5.5 启用 Oracle 11.2 的紧急工程二进制文件 (EEB): http://support.veritas.com/docs/346033
对于 NetBackup 6.5.4: http://support.veritas.com/docs/346592
尝试在未进行这些修复的情况下备份 Oracle 版本 11.2 将导致以下失败:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch00 channel at 07/27/2010 09:23:23
RMAN-10038: database session for channel ch00 terminated unexpectedly
除此之外,我会首先联系 Oracle 支持,然后联系 Netbackup 支持。