Thursday, November 19, 2015

How to handle RMAN DUPLICATE: Errors in krbm_getDupCopy during duplicate

I recently cloned an 11.2.0.4 database using tape backups as the source.

The clone failed right before "open resetlogs". In other words, the database was restored and recovered, but DUPLICATE didn't succeed in opening it. So I opened it manually: I recreated the controlfile from a template taken from the source database, adjusted for the file layout on the auxiliary. The database opened nicely.

However, later that day, the application team found that a new clone was needed because of a synchronization issue caused by a third-party replication tool. In other words, the entire database had to be restored all over again, this time from our latest incremental level 1 backups.

No problem.

As usual, I started by saving away the spfile, password file and the block change tracking file located in $ORACLE_HOME/dbs. Following that, I mounted and dropped the database. All the files were automatically removed from disk by Oracle, and the instance terminated, as expected.

As I started to clone, I noticed some new messages in the alert log of the auxiliary database:

RMAN DUPLICATE: Errors in krbm_getDupCopy
Errors in file /u01/oracle/product/11204/admin/testdb01/diag/rdbms/testdb01/testdb01/trace/testdb01_ora_22544486.trc:
ORA-19625: error identifying file /u02/oradata/testdb01/system01.dbf
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3

This error was repeated for every data file in the database and appended to the same trace file as in the message above.
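With one ORA-19625 per data file, the list gets long quickly. A minimal sketch of how the affected file names could be pulled out of the alert log (or trace file) text and checked against the disk is shown here. The function names are made up for illustration, and the pattern assumes the message format shown in the excerpt above.

```python
import os
import re

# Matches lines such as:
#   ORA-19625: error identifying file /u02/oradata/testdb01/system01.dbf
# (assumption: the format is the same as in the alert log excerpt above)
ORA_19625 = re.compile(r"^ORA-19625: error identifying file (.+)$")


def files_in_ora_19625(log_text):
    """Return the distinct file paths named in ORA-19625 lines,
    in order of first appearance."""
    paths = []
    for line in log_text.splitlines():
        match = ORA_19625.match(line.strip())
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def split_by_existence(paths):
    """Split paths into (present, missing) depending on whether they exist on disk."""
    present = [p for p in paths if os.path.exists(p)]
    missing = [p for p in paths if not os.path.exists(p)]
    return present, missing
```

If every path ends up in the "missing" list after a dropped database, the messages are consistent with the explanation that follows.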

I found the errors worth checking.

Fortunately, these errors can be ignored. MOS Doc ID 1476641.1 "RMAN DUPLICATE: Errors In Krbm_getDupCopy found in alert.log" explains:

"This happens after a previous failed duplicate trial, if the files copied to auxiliary destination have been deleted."


and


"The messages in alert.log indicate that rman cannot use a previous datafile copy for those files and that the files must be copied again to auxiliary destination. Duplicate checks if there are datafile copies already created by a duplicate to avoid restoring the files again."


So this is part of Oracle's restore optimization concept:

"...from 11.2 RMAN always checks if the file copy exists at destination hosts to avoid copying it again"


Great, but how does RMAN do that?

"If duplicate fails between the first restore/copy and the zeroing of the dbid, a second duplicate will find the _rm_dup_@.dat file and the clone default channel will read it into memory, once determined the name of the datafilecopy to be created by the second duplicate it is compared with the existing datafilecopy from the previous duplicate. If the datafilecopy still exists and matches the vital information of the datafile (file number, database id, creation scn, database name) and its checkpoint scn is behind the until scn then the datafilecopy can be used by this new duplicate and restore/copy is not necessary"
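To make the rule in that quote concrete, here is a small sketch of the decision as I read it. This is not RMAN's actual code: the class, the function and the field names are hypothetical, and treating "behind" as "strictly lower than" is my own assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileIdentity:
    """The "vital information" listed in the MOS note (hypothetical structure)."""
    file_number: int
    database_id: int
    creation_scn: int
    database_name: str


def can_reuse_copy(expected: FileIdentity,
                   existing: Optional[FileIdentity],
                   existing_checkpoint_scn: int,
                   until_scn: int) -> bool:
    """Decide whether a datafile copy left by an earlier duplicate can be reused."""
    # The copy was deleted from disk: this is the case that produces
    # ORA-19625 / ORA-27037 in the alert log, and the file is restored again.
    if existing is None:
        return False
    # File number, database id, creation SCN and database name must all match.
    if existing != expected:
        return False
    # The copy's checkpoint SCN must be behind the until SCN
    # (assumption: "behind" means strictly lower).
    return existing_checkpoint_scn < until_scn
```

In my case every data file had been removed when I dropped the database, so each check would have ended in the first branch.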


The solution is simple:

"As files have already been deleted from auxiliary destination, ignore those messages."

Alternatively, I could have followed this advice:

"If you don't want to see those messages in alert.log but datafiles have already been deleted, on Auxiliary host, delete the file $ORACLE_HOME/dbs/_rm_dup_<dup_db>.dat where dup_db is the name of the clone instance."
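A minimal sketch of that cleanup step, to be run on the auxiliary host, could look like this. The function name is hypothetical, and the file name pattern (the clone instance name placed between "_rm_dup_" and ".dat") is my reading of the note, so check what is actually in $ORACLE_HOME/dbs before deleting anything.

```python
import os


def remove_rm_dup_file(oracle_home, dup_db):
    """Delete the leftover duplicate bookkeeping file for clone instance dup_db.

    Returns True if a file was removed, False if there was nothing to remove.
    Assumption: the file is named _rm_dup_<dup_db>.dat under <oracle_home>/dbs.
    """
    path = os.path.join(oracle_home, "dbs", f"_rm_dup_{dup_db}.dat")
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True
```

For the clone in this post the call would be something like remove_rm_dup_file(os.environ["ORACLE_HOME"], "testdb01").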


1 comment:

  1. Saved my ass on a long and busy Saturday shift. Thanks!
