Showing posts with label RMAN. Show all posts
Showing posts with label RMAN. Show all posts

Wednesday, October 9, 2024

Workaround for ORA-27069 attempt to do I/O beyond the range of the file during RMAN clone from active database

This is the message I received when starting an active database duplication using RMAN:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 10/09/2024 10:40:19
RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file
The rman script:
connect target sys/pwd@mydb1
connect auxiliary sys/pwd@mydb2
run{
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate auxiliary channel aux1 type disk;
allocate auxiliary channel aux2 type disk;
configure device type disk parallelism 2;
debug io;
DUPLICATE TARGET DATABASE TO mydb2
FROM ACTIVE DATABASE
USING COMPRESSED BACKUPSET;
debug off;
}
Solution:
  • Usa a pfile to start the auxiliary instance, and *not* an spfile. If you do not have one, log onto the not-yet-mounted instance and create one:
    sqlplus / as sysdba
    create spfile from pfile;
    
  • Set the environment variable export ORA_RMAN_SGA_TARGET to the same (or a little more) as the auxiliary database's total SGA:
    export ORA_RMAN_SGA_TARGET=7900
    Run the rman script again and it should proceed past the mount-stage and start restoring the files.
  • Thursday, July 11, 2024

    How to solve RMAN-08137: warning: archived log not deleted, needed for standby or upstream capture process

    In one of my cloned databases, the FRA was filling up and pushing towards the limit of 2 TB.

    Since it was cloned from a production database using Golden Gate, the capture processes followed along, but was not dropped in the cloned database.

    This caused the deletion policy in RMAN to be overriden because Oracle thinks it will need the archivelogs for Golden Gate, even though no Golden Gate is configured for this particular database.

    When deleting an archivelog, RMAN would throw an error:
    RMAN> delete archivelog sequence 47447;
    
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=1352 device type=DISK
    allocated channel: ORA_DISK_2
    channel ORA_DISK_2: SID=1354 device type=DISK
    RMAN-08137: warning: archived log not deleted, needed for standby or upstream capture process
    archived log file name=/u04/fra/CDB/archivelog/2024_03_13/o1_mf_1_47447__j5wx4r32_.arc thread=1 sequence=47447
    
    The query below revealed the name of the blocking capture processes:
    sqlplus / as sysdba --> log on to root container
    SYS@_container_name SQL>  SELECT CAPTURE_NAME,
                CAPTURE_TYPE, STATUS,
                to_char(REQUIRED_CHECKPOINT_SCN,'999999999999999')as REQ_SCN ,
                to_char(OLDEST_SCN,'999999999999999')as OLDEST_SCN
         FROM DBA_CAPTURE;   2    3    4    5
    
    CAPTURE_NAME         CAPTURE_TY STATUS   REQ_SCN          OLDEST_SCN
    -------------------- ---------- -------- ---------------- ----------------
    OGG$CAP_MYDB2ABC     LOCAL      DISABLED     426508588124     426508588124
    OGG$CAP_MYDB2DEF     LOCAL      DISABLED     426508561845     426508561845
    
    Solution was to use the package dbms_capture_adm and drop the captures:
    sqlplus / as sysdba --> log on to root container
    
    SYS@_container_name SQL> exec dbms_capture_adm.drop_capture('OGG$CAP_MYDB2ABC');
    SYS@_container_name SQL> exec dbms_capture_adm.drop_capture('OGG$CAP_MYDB2DEF');
    
    Thanks to Bobby Curtis for pointing me in the righ direction with his post about the same topic

    Tuesday, October 3, 2023

    What to do if your flash recovery area is filled with "foreign archive logs"?

    One of my databases came to a halt when the flash recovery area filled completely.

    It wasn't a problem with space preassure, but merely a condition that made the flash recovery area *seem* full, when it really wasn't.

    To clear the archiver stuck condition is easy enough, simply increase the value of db_recovery_file_dest_size, but how to fix the source of the problem?

    My flash recovery area was, at the time, filled up with foreign archive logs:
    SYS@CDB$ROOT SQL> select * from v$flash_recovery_area_usage;
    
    FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
    ----------------------- ------------------ ------------------------- --------------- ----------
    CONTROL FILE                             0                         0               0          0
    REDO LOG                                 0                         0               0          0
    ARCHIVED LOG                         14.04                         0             235          0
    BACKUP PIECE                           .06                         0              44          0
    IMAGE COPY                               0                         0               0          0
    FLASHBACK LOG                            0                         0               0          0
    FOREIGN ARCHIVED LOG                  85.8                         0            1558          0
    AUXILIARY DATAFILE COPY                  0                         0               0          0
    
    8 rows selected.
    
    To fix this, log on to the database using rman:
    rman target / nocatalog
    
    Here you can list the foreign archivelogs:
    list foreign archivelogs all;
    
    They were all from may 2023, and they have ended up here since the database was cloned using storage snapshots.

    To clear them out of the flash recovery area, you need to first crosscheck them:
    RMAN> crosscheck foreign archivelog all;
    
    Then, delete them:
    RMAN> delete noprompt foreign archivelog all ;
    
    The crosscheck is important. Without it, Oracle cannot delete the entries from the controlfile, which means that tha v$flash_recovery_area will still be reported as full.
    The flash recovery area is not full anymore:
    SYS@CDB$ROOT SQL> select * from v$flash_recovery_area_usage;
    
    FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
    ----------------------- ------------------ ------------------------- --------------- ----------
    CONTROL FILE                             0                         0               0          0
    REDO LOG                                 0                         0               0          0
    ARCHIVED LOG                         18.21                         0             280          0
    BACKUP PIECE                           .06                         0              44          0
    IMAGE COPY                               0                         0               0          0
    FLASHBACK LOG                            0                         0               0          0
    FOREIGN ARCHIVED LOG                     0                         0               0          0
    AUXILIARY DATAFILE COPY                  0                         0               0          0
    
    8 rows selected.
    

    Documented in Doc ID 1617965.1 "What commands may be used to remove foreign archivelog files?" at Oracle Support.

    Thursday, February 3, 2022

    Observation: rman saves files according to end date of backup

    I noticed the following: An archivelog file backed up at 1.01.2022 23:57:18 will NOT be saved in the folder for 31.01.2022. Instead, it will be saved in the folder for 01.02.2022.
    Output from RMAN:
     list archivelog from time '31.01.2022' until time '01.02.2022';
     
    Output (excerpt)
    450706  1    450701  A 31.01.2022 23:56:46
            Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450701__vwojnvs6_.arc
    
    450707  1    450702  A 31.01.2022 23:57:16
            Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450702__vwokkx0p_.arc
    
    450708  1    450703  A 31.01.2022 23:57:18
            Name: /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc
    

    The file /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc has the timestamp Feb 1 00:05
    So in this case, the last file generated on 31.01 actually ended up in the folder for files generated on the 01.02

    Tuesday, April 13, 2021

    How to solve "RMAN-04014: startup failed: ORA-01261: Parameter db_recovery_file_dest destination string cannot be translated" during cloning

    During an attempt to clone for active database I had created pfile with my desired value for the parameter db_recovery_file_dest, which was /fradata. Excerpt from init.ora used to start the instance (nomount):
    cat initproddb01.ora |grep db_recovery_file_dest
    
    *.db_recovery_file_dest='/fradata'
    
    This directory exists on my auxilliary server and is indeed writable by the oracle software installation owner (for most installation, this user is called "oracle") After having started the instance, and executed the clone script:
    DUPLICATE TARGET DATABASE
    FOR STANDBY
    FROM ACTIVE DATABASE
    DORECOVER
    SPFILE
    SET DB_UNIQUE_NAME='STBY01' COMMENT 'Stanby db 1'
    USING COMPRESSED BACKUPSET
    NOFILENAMECHECK;
    
    This message appears:
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of Duplicate Db command at 04/13/2021 08:07:31
    RMAN-05501: aborting duplication of target database
    RMAN-03015: error occurred in stored script Memory Script
    RMAN-04014: startup failed: ORA-01261: Parameter db_recovery_file_dest destination string cannot be translated
    ORA-01262: Stat failed on a file destination directory
    Linux-x86_64 Error: 2: No such file or directory
    
    Cause: The spfile that is created contains the target's db_recovery_file_dest setting, not the auxiliary's:
    strings spfileproddb01.ora | grep db_recovery_file_dest
    *.db_recovery_file_dest='/FRA' <--- this is inherited from the target database, and doesn't exist on the auxiliary server
    
    Solution: Add a new spfile directive to your clone script:
    SET DB_RECOVERY_FILE_DEST='/fradata'
    
    and rerun.

    Wednesday, December 16, 2020

    How to fix error SYS.DBMS_BACKUP_RESTORE version 18.11.00.00 in TARGET database is not current

    Even if you're not using a recovery catalog, you may encounter this issue if you have recently patched your database. Simply by connecting to the target database as sysdba, you can verify that you need to update your internal packages:
    oracle@myserver.mydomain.com:[proddb01]# rman target / nocatalog
    
    Recovery Manager: Release 18.0.0.0.0 - Production on Wed Dec 16 13:51:43 2020
    Version 18.12.0.0.0
    
    Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.
    
    PL/SQL package SYS.DBMS_BACKUP_RESTORE version 18.11.00.00 in TARGET database is not current
    PL/SQL package SYS.DBMS_RCVMAN version 18.11.00.00 in TARGET database is not current
    connected to target database: PRODDB01 (DBID=1234567890)
    using target database control file instead of recovery catalog
    
    Solution: Follow Doc ID 888818.1: "Rman 06190 Connecting to target database after upgrade" In short, connect to the database as sysdba, and run the following scripts:
    SQL> @$ORACLE_HOME/rdbms/admin/dbmsrman.sql
    SQL> @$ORACLE_HOME/rdbms/admin/prvtrmns.plb   
    SQL> @$ORACLE_HOME/rdbms/admin/dbmsbkrs.sql 
    SQL> @$ORACLE_HOME/rdbms/admin/prvtbkrs.plb
    

    Tuesday, July 21, 2020

    How to fix RMAN-04006 when cloning from active database




    Background:
    During an attempt to clone database using active duplication, I received an error when testing my connectivity to the target and auxiliary server. I normally do a testrun first, to see if I can connect without problems, before I start my cloning script.

    My script to test the connection is simple and called "testconnection.cmd".
    The connections are made through a wallet, so that you do not have to expose your passwords in any scripts or any shell prompts. See my previous post for details about setting up a wallet.
    connect target /@proddb01.oric.no
    connect auxiliary /@testdb01.oric.no
    run{
    allocate channel c1 type disk;
    allocate channel c2 type disk;
    allocate auxiliary channel aux1 type disk;
    allocate auxiliary channel aux2 type disk;
    configure device type disk parallelism 2;
    }
    exit
    

    Execute it as follows:
    rman cmdfile=testconnection.cmd
    

    connected to target database: prodb01 (DBID=2078894010, not open)
    connected to auxiliary database (not started)
    
    RMAN-04006: error from auxiliary database: ORA-01034: ORACLE not available
    ORA-27101: shared memory realm does not exist
    

    Cause:
    Incorrect password in the auxiliary database

    Solution:
    Copy the password file from your target database server to your auxiliary database server and retry the operation.

    When all is good, the output from the auxiliary database should be
    connected to auxiliary database: testdb01 (not mounted)
    

    Friday, June 19, 2020

    How to run RMAN with debug information


    Tested with Oracle 12.2

    For example, to trace a duplicate database session, I have put the following into a file called "run_duplication.cmd":

    spool trace to run_duplication.trc
    spool log to run_duplication.log
    set echo on;
    debug on;


    connect target sys/password@sourcedb
    connect auxiliary sys/password@auxdb

    run{
    allocate channel c1 type disk;
    allocate channel c2 type disk;
    allocate auxiliary channel aux1 type disk;
    allocate auxiliary channel aux2 type disk;
    configure device type disk parallelism 2;
    DUPLICATE DATABASE "sourcedb.mydomain.com" TO auxdb
    FROM ACTIVE DATABASE
    TABLESPACE TOOLS
    USING COMPRESSED BACKUPSET;
    }
    exit

    When you execute your rman script:

    rman cmdfile='run_duplication.cmd'

    your log and trace files will be generated in your working directory.

    Friday, April 3, 2020

    How to work around hung rman process when duplicating from active database


    Applicable for Oracle 12.1.

    During a "clone from active database" operation, I had a seemingly stuck restore process. It used a lot of time on restoring the controlfiles.

    My clonescript was simple enough:
    connect target /@mysrcdb
    connect auxiliary /@myauxdb
    run{
    allocate channel c1 type disk;
    allocate channel c2 type disk;
    allocate auxiliary channel aux1 type disk;
    allocate auxiliary channel aux2 type disk;
    configure device type disk parallelism 2;
    debug io;
    DUPLICATE TARGET DATABASE TO myauxdb
    FROM ACTIVE DATABASE
    ;
    debug off;
    }
    exit
    

    I noticed that the logfile wouldn't move beoynd this point:
    DBGANY:     No untilscn in effect [17:44:29.012] (krmkicat)
    RMAN-08016: channel aux1: starting datafile backup set restore
    RMAN-08169: channel aux1: using network backup set from service mysrcdb
    RMAN-08021: channel aux1: restoring control file
    

    Solution:
    On the axiliary server, set the parameter
    DISABLE_OOB=on
    

    in your $TNS_ADMIN/sqlnet.ora, and reexecute the duplicate-command again.

    This is caused by a bug mentioned in Doc ID 2073604.1: "RMAN active duplicate hanging on restore control file" found on support.oracle.com

    Monday, August 5, 2019

    How to work around RMAN-08120: warning: archived log not deleted, not yet applied by standby





    Problem scenario:

    * You have a primary database with two standby databases.
    * One of them is functional, the other one is about to be rebuilt.
    * Archived redo logs are piling up on the primary.

    You have already set the log_archive_dest_n to DEFERRED on the primary site, and this will prevent logs from being shipped to the standby database that is about to be rebuilt.

    The reason for this is explained in Doc ID 1380368.1 on My Oracle Support: Oracle will, by default, also consider deferred locations, before deleting archive logs from the primary.

    Solution:
    This can be changed by setting a hidden parameter, _deferred_log_dest_is_valid, from TRUE to FALSE.

    To verify that this worked as intended, I listed a couple of the archivelogs on the primary:

    RMAN> list archivelog sequence between 110880 and 110881;
    
    List of Archived Log Copies for database with db_unique_name PRODDB01
    =====================================================================
    
    Key     Thrd Seq     S Low Time
    ------- ---- ------- - ---------
    173550  1    110880  A 19-JUL-19
            Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc
    
    173551  1    110881  A 19-JUL-19
            Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc
    
    

    Now try to delete them from the primary server:
    RMAN> delete archivelog sequence between 110880 and 110881;
    
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=78 device type=DISK
    allocated channel: ORA_DISK_2
    channel ORA_DISK_2: SID=723 device type=DISK
    allocated channel: ORA_DISK_3
    channel ORA_DISK_3: SID=14 device type=DISK
    allocated channel: ORA_DISK_4
    channel ORA_DISK_4: SID=295 device type=DISK
    RMAN-08120: warning: archived log not deleted, not yet applied by standby
    archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc thread=1 sequence=110880
    RMAN-08120: warning: archived log not deleted, not yet applied by standby
    archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc thread=1 sequence=110881
    
    Oracle doesn't allow you to delete them, since they haven't been applied to all standby destinations.

    To work around this, set the hidden parameter:
    alter system set "_deferred_log_dest_is_valid" = FALSE scope=both;
    

    Try deleting them once more shows that Oracle now allows you to do so:
    RMAN>  delete archivelog sequence between 110880 and 110881;
    
    List of Archived Log Copies for database with db_unique_name proddb01
    =====================================================================
    
    Key     Thrd Seq     S Low Time
    ------- ---- ------- - ---------
    173550  1    110880  A 19-JUL-19
            Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc
    
    173551  1    110881  A 19-JUL-19
            Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc
    
    
    Do you really want to delete the above objects (enter YES or NO)? YES
    deleted archived log
    archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc RECID=173550 STAMP=1014076575
    Deleted 1 objects
    
    deleted archived log
    archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc RECID=173551 STAMP=1014076710
    Deleted 1 objects
    

    Use
    delete noprompt archivelog sequence between 110880 and 110881;
    
    to avoid having to confirm the delete operation.

    Thursday, August 1, 2019

    How to avoid RMAN-10015 when using sql-directives in an RMAN script



    When attempting to use an sql directive in RMAN, you will sometimes be instructed to use the following syntax:
    sql "alter session set events ' '1110 trace name errorstack level 3' '";
    

    This will throw an error upon execution:
    RMAN-03009: failure of sql command on default channel at 08/01/2019 11:20:02
    RMAN-10015: error compiling PL/SQL program
    

    Solution: do not use white spaces between the single quotes.
    This will work:

    sql "alter session set events ''1110 trace name errorstack level 3'' ";
    

    Wednesday, February 6, 2019

    A workaround for RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file after a clone from active database



    My clone from active database failed at the very end of the procedure with
    RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file
    
    This happened after media recovery was complete, and after the oracle instance was started, as can be seen from the log file below:
    RMAN-08181: media recovery complete, elapsed time: 00:00:21
    RMAN-03091: Finished recover at 06-FEB-19
    RMAN-08031: released channel: c1
    RMAN-08031: released channel: c2
    RMAN-08031: released channel: aux1
    RMAN-08031: released channel: aux2
    RMAN-06196: Oracle instance started
    
    Total System Global Area   22749904896 bytes
    
    Fixed Size                     8632928 bytes
    Variable Size               3758097824 bytes
    Database Buffers           18924699648 bytes
    Redo Buffers                  58474496 bytes
    
    RMAN-08161: contents of Memory Script:
    {
       sql clone "alter system set  db_name =
     ''TESTDB01'' comment=
     ''Reset to original value by RMAN'' scope=spfile";
       sql clone "alter system reset  db_unique_name scope=spfile";
    }
    RMAN-08162: executing Memory Script
    
    RMAN-06162: sql statement: alter system set  db_name =  ''TESTDB01'' comment= ''Reset to original value by RMAN'' scope=spfile
    
    RMAN-06162: sql statement: alter system reset  db_unique_name scope=spfile
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of Duplicate Db command at 02/06/2019 04:39:51
    RMAN-05501: aborting duplication of target database
    RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file
    Additional information: 51
    Additional information: 1
    Additional information: 48
    

    This is a bummer, espescially if you've cloned a very large database and have waited for a long time for it to finish.
    There is very little information about this error, but a workaround to save the freshly cloned auxiliary database, is to recreate the controlfile from script

    The workaround is outlined in one of my older posts, but basically it boils down to the following steps:

    1. Go from the source database, generate a "create controlfile" script
    2. Transfer it to the auxiliary server
    3. Edit it to suit the auxiliary database data files
    4. Remove the controlfile files from the auxiliary server
    5. Startup the instance in nomount mode and recreate the controlfile
    6. Open the database with the resetlog option

    Monday, November 19, 2018

    How to use rman to create a backup-based clone on a remote server


    It has been a while since I needed to use this technique, since we rely on storage snapshot clones these days. Sometimes though, a good old-fashioned rman clone based on backups is the only way to solve a problem.
    Since I always avoid working on the production server during cloning, most steps are done logged onto the auxiliary server.

    The method I used is based on "Backup-Based Duplication Without a Target Database and Recovery Catalog Connection"
    Here is what I did:

    1. On the production server, backup of the source database using RMAN:
    rman target / nocatalog
    backup database plus archivelog;

    From this point and onwards, everything is done while logged onto the destination server.

    2. Get the files from the source server:
    ssh testserver
    cd /u05
    mkdir bup
    cd bup
    scp -r prodserver:/u05/flash_recovery_area/PRODDB01/2018_11_16/* .

    3. Prepare the auxiliar pfile. Note that the parameters I use exceeds the actual required parameters.
    But since the auxiliary instance will replace an already existing database instance, which is already tuned and has the correct memory parameters, I choose to include them. Notice also the db_file_name_convert and log_file_name_convert parameters. They control where rman will place the files during the restore process.
    cd $ORACLE_HOME/dbs
    vi inittest1.ora

    Add the following:

    *.db_name='test1'
    *.db_unique_name='test1'
    *.audit_file_dest='/u01/oracle/admin/test1/adump'
    *.audit_trail='DB'
    *.compatible='12.2.0'
    *.control_files='/u02/oradata/test1/control01.ctl','/u04/fra/test1/control02.ctl'
    *.db_block_size=8192
    *.db_file_name_convert='proddb01','test1'
    *.log_file_name_convert='proddb01','test1'
    *.db_recovery_file_dest='/u05/flash_recovery_area/test1'
    *.db_recovery_file_dest_size=1000G
    *.diagnostic_dest='/u01/oracle'
    *.nls_language='NORWEGIAN'
    *.nls_territory='NORWAY'
    *.open_cursors=300
    *.optimizer_adaptive_plans=FALSE
    *.optimizer_dynamic_sampling=0
    *.optimizer_mode='ALL_ROWS'
    *.pga_aggregate_target=7222M
    *.processes=1500
    *.remote_login_passwordfile='EXCLUSIVE'
    *.sga_target=21696M
    *.shared_pool_size=2624M
    *.streams_pool_size=256M
    *.undo_tablespace='UNDOTBS1'
    *.diagnostic_dest=/u01/oracle

    4. I prefer using an spfile over a pfile. Therefore I take the time here to create an spfile:
    sqlplus / as sysdba
    startup nomount pfile=inittest1.ora
    create spfile from pfile;

    5. Startup the auxiliary instance in nomount-mode using the spfile:
    shutdown abort
    startup nomount

    6. Start the duplication process:
    rman auxiliary /
    duplicate database to test1
    backup location '/u05/fra/bup';

    RMAN went through the normal restore, name switching and recovery phases. Finally, the database was opened with the resetlog option.

    Thursday, November 19, 2015

    How to handle RMAN DUPLICATE: Errors in krbm_getDupCopy during duplicate

    I recently cloned a 11.2.0.4 database using tape backups as source.

    The clone failed right before "open resetlogs". In other words, the database was restored and recovered, but DUPLICATE didn't succeed in opening it. So I did it manually, by recreating the controlfile using a template from the source database, and adjusted for the file layout on the auxiliary. The database opened nicely.

    However, later during the day, the application team found that a new clone was needed due to some synch issue caused by a third-party replication tool. In other words, the entire database had to be copied from the latest version of our incremental level 1 backups and restored all over again.

    No problem.

    As usual, I started by saving away the spfile, password file and the block change tracking file located in $ORACLE_HOME/dbs. Following that, I mounted and dropped the database. All the files were automatically removed from disk by Oracle, and the instance terminated, as expected.

    As I started to clone, I noticed some new messages in the alert log of the auxiliary database:

    RMAN DUPLICATE: Errors in krbm_getDupCopy
    Errors in file /u01/oracle/product/11204/admin/testdb01/diag/rdbms/testdb01/testdb01/trace/testdb01_ora_22544486.trc:
    ORA-19625: error identifying file /u02/oradata/testdb01/system01.dbf
    ORA-27037: unable to obtain file status
    IBM AIX RISC System/6000 Error: 2: No such file or directory
    Additional information: 3
    

    This error was repeated for every data file in the database and appended to the same trace file as in the message above.

    I found the errors worth checking.

    Fortunately, these errors can be ignored. MOS Doc ID 1476641.1 "RMAN DUPLICATE: Errors In Krbm_getDupCopy found in alert.log" explains

    "This happens after a previous failed duplicate trial, if the files copied to auxiliary destination have been deleted."


    and


    "The messages in alert.log indicate that rman cannot use a previous datafile copy for those files and that the files must be copied again to auxiliary destination. Duplicate checks if there are datafile copies already created by a duplicate to avoid restoring the files again.


    So this is a part of Oracle restore optimization concept:

    "...from 11.2 RMAN always checks if the file copy exists at destination hosts to avoid copying it again"


    Great, but how does RMAN do that?

    "If duplicate fails between the first restore/copy and the zeroing of the dbid, a second duplicate will find the _rm_dup_@.dat file and the clone default channel will read it into memory, once determined the name of the datafilecopy to be created by the second duplicate it is compared with the existing datafilecopy from the previous duplicate. If the datafilecopy still exists and matches the vital information of the datafile (file number, database id, creation scn, database name) and its checkpoint scn is behind the until scn then the datafilecopy can be used by this new duplicate and restore/copy isnot necessary"


    The solution is simple:

    "As files have already been deleted from auxiliary destination, ignore those messages."

    I could also have followed the following advice:

    "If you don't want to see those messages in alert.log but datafiles have already been deleted, on Auxiliary host, delete the file $ORACLE_HOME/dbs/_rm_dup_.dat where dup_db is the name of the clone instance."


    Tuesday, July 21, 2015

    How to solve ORA-17628: Oracle error 19505 returned by remote Oracle server during clone from active database

    After some time, my RMAN "duplicate from active database" script threw the error below:

    RMAN-03009: failure of backup command on c1 channel at 07/21/2015 12:13:15
    ORA-17627: ORA-12577: Message 12577 not found;  product=RDBMS; facility=ORA
    RMAN-12019: continuing other job steps, job failed will not be re-run
    

    When I checked the alert log, it was clear

    IBM AIX RISC System/6000 Error: 28: No space left on device
    

    Solution: obvious.

    How to solve ORA-17627: ORA-12154: TNS:could not resolve the connect identifier specified during clone from active database

    During an attempt to clone from active database, the following error was thrown:

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of Duplicate Db command at 07/21/2015 10:06:15
    RMAN-05501: aborting duplication of target database
    RMAN-03015: error occurred in stored script Memory Script
    RMAN-03009: failure of backup command on c1 channel at 07/21/2015 10:06:15
    ORA-17629: Cannot connect to the remote database server
    ORA-17627: ORA-12154: TNS:could not resolve the connect identifier specified
    ORA-17629: Cannot connect to the remote database server
    

    This seems pretty obvious, since the error stack states that there is a connection problem, but it's easy to overlook if you run your command from the destination server, and like in my case, checked that all connections work, that the passwords are identical etc etc.

    The solution is simply to add the source and the destination connect descriptor in the $TNS_ADMIN/tnsnames.ora files on both the source and auxiliary server.


    See my other article on active database cloning for a more detailed description of the required setup.

    Tuesday, February 17, 2015

    How cloning from backup became a little more secure in Oracle 11g

    From version 11, Oracle supplied another method to use during cloning, namely "Backup-based duplication without a target connection".

    From the Oracle 11g Documentation:

    RMAN can perform the duplication in any of the following supported modes:
    
    1.       Active duplication
    2.       Backup-based duplication without a target connection
    3.       Backup-based duplication with a target connection
    4.       Backup-based duplication without connection to target and recovery catalog
    

    If you choose method number 2, you need to use the following syntax:

    DUPLICATE DATABASE <target database> TO <auxiliary database>;
    

    In the 10g documentation you will need to use the "target" keyword
    DUPLICATE TARGET DATABASE TO <auxiliary database>;
    

    Oracle points out:
    «This mode is useful when the target database is not available or a connection to it is not desirable». 
    

    In other words, cloning has become more secure since version 11g, where we can totally avoid connecting to the target database, which is often in production, during cloning from backup.

    An example of an incident where a connection to the target could potentially jeopardize production, is when you are scripting jobs for RMAN and accidently issue "shutdown", which will bring down your target database and not your auxiliary database, which was what you intended.

    For the record, the keyword "target" is an unfortunate choice of syntax during cloning. In IT, a "target" would generally be interpreted as a synonym for "destination", which is where you want to clone *TO*.

    Wednesday, January 21, 2015

    How to relocate the block change tracking file

    To relocate the block change tracking file you have two options:

    1) shutdown database, mount database, update control file, open database
    sqlplus / as sysdba
    shutdown immediate
    exit
     -- Move the block change tracking file to the new location using the appropriate os utility. --
    sqlplus / as sysdba
    startup mount
    ALTER DATABASE RENAME FILE 'ora_home/dbs/change_trk.f' TO '/new_disk/change_trk.f'; 
    ALTER DATABASE OPEN;
    

    OR

    2) disable and re-enable block change tracking, and point to the new location when re-enabling.
    ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
    ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE 'new_location';
    

    See Section "4.4.4.3 Moving the Change Tracking File" in the Oracle Documentation regarding this feature.

    Monday, December 15, 2014

    How to solve "ORA-06502: PL/SQL: numeric or value error" when cloning a database

    While attempting to clone an Oracle 11gR2 production database, Oracle returned "ORA-06502: PL/SQL: numeric or value error".

    The clone proceeded nicely up until RMAN attempted to set the db_unique_name. After this point, the clone failed, see error message output below:
    Oracle instance started
    
    Total System Global Area    8551575552 bytes
    
    Fixed Size                     2263488 bytes
    Variable Size               1107297856 bytes
    Database Buffers            7415529472 bytes
    Redo Buffers                  26484736 bytes
    
    contents of Memory Script:
    {
       sql clone "alter system set  db_name =
       ''testdb01'' comment=
       ''Reset to original value by RMAN'' scope=spfile";
       sql clone "alter system reset  db_unique_name scope=spfile";
       shutdown clone immediate;
    }
    
    executing Memory Script
    
    sql statement: alter system set  db_name =  ''testdb01'' comment= ''Reset to original value by RMAN'' scope=spfile
    
    sql statement: alter system reset  db_unique_name scope=spfile
    
    Oracle instance shut down
    
    
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of Duplicate Db command at 12/11/2014 13:20:45
    RMAN-05501: aborting duplication of target database
    RMAN-06136: ORACLE error from auxiliary database: ORA-06502: PL/SQL: numeric or value error
    
    The trace file also adds the following line to the previous lines:
    ORA-06512: at "SYS.X$DBMS_BACKUP_RESTORE", line 5567
    

    The culprit seems to be that my production database and the auxiliary database had at some point in the past been part of a Data Guard setup.
    By looking at the registered databases in the recovery catalog, this could indeed be verified:

    rman target / catalog uid/pwd@rmancat 
    
    Recovery Manager: Release 11.2.0.4.0 - Production on Mon Dec 15 11:33:55 2014
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
     
    connected to target database: proddb01 (DBID=3046916437)
    connected to recovery catalog database
    
    RMAN> set echo on;
    2> list db_unique_name of database;
    4> exit
    echo set on
    List of Databases
    
    DB Key  DB Name  DB ID            Database Role    Db_unique_name
    ------- ------- ----------------- ---------------  ------------------
    133048143 proddb01     3046916437       PRIMARY          proddb01
    133048143 proddb01     3046916437       STANDBY          testdb01
    

    Solution:
    In this case, this old association to an obsolete standby database could be removed altogether.
    I therefore unregistered the unique database from the recovery catalog:
    RMAN> unregister db_unique_name 'testdb01';
    
    database db_unique_name is "testdb01", db_name is "proddb01" and DBID is 3046916437
    
    Want to unregister the database with target db_unique_name (enter YES or NO)? yes
    database with db_unique_name testdb01 unregistered from the recovery catalog
    

    After this was done, I restarted the clone, and the error was no longer thrown and the script continued as expected.

    If you are in a slightly different position, and have several primary databases with the same unique name, you need a different approach. Consider the following example:
    RMAN> list db_unique_name of database;
    
    List of Databases
    
    DB Key  DB Name  DB ID            Database Role    Db_unique_name
    ------- ------- ----------------- ---------------  ------------------
    376254740 PDB01    3895530761       PRIMARY          PDB01
    380374509 PDB01    3902464629       PRIMARY          PDB01
    383100641 PDB01    3906391689       PRIMARY          PDB01
    
    RMAN> SET DBID 3895530761;
    
    executing command: SET DBID
    database name is "PDB01" and DBID is 3895530761
    
    RMAN> UNREGISTER DB_UNIQUE_NAME PDB01;
    
    database db_unique_name is "PDB01", db_name is "PDB01" and DBID is 3895530761
    
    Want to unregister the database with target db_unique_name (enter YES or NO)? yes
    
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-06004: ORACLE error from recovery catalog database: RMAN-20244: can not change currently connected database db_unique_name
    RMAN> exit
    

    Solution is to still explicitly set DBID, but then execute "unregister database" instead, like this:

    RMAN> unregister database;
    
    database name is "PDB01" and DBID is 3895530761
    
    Do you really want to unregister the database (enter YES or NO)? yes
    
    database unregistered from the recovery catalog
    
    RMAN> SET DBID 3902464629;
    executing command: SET DBID
    database name is "PDB01" and DBID is 3902464629 
    
    RMAN> unregister database;
    database name is "PDB01" and DBID is 3902464629
    
    Do you really want to unregister the database (enter YES or NO)? YES
    
    RMAN> list db_unique_name of database;
    
    List of Databases
    
    DB Key  DB Name  DB ID            Database Role    Db_unique_name
    
    ------- ------- ----------------- ---------------  ------------------
    383100641 PDB01    3906391689       PRIMARY          PDB01
    

    We are left with only one unique db name, which is what we want.

    Thursday, November 20, 2014

    How to flash the database back to a guaranteed restore point

    After a week of testing their new release, my customer wanted their database reset to the state it was previously in.
    Earlier this week, I created a restore point of type GUARANTEED FLASHBACK.

    Here is how I restored a database using flashback database technology. It was, as expected, very quick. Only a few seconds to flash back 14 GB of changes.

    Connect to Recovery Manager:
    oracle@myserver:[TESTDB01]# rman target / catalog uid/pwd@rmancat
    
    Recovery Manager: Release 11.2.0.4.0 - Production on Thu Nov 20 16:06:44 2014
    
    connected to target database: TESTDB01 (DBID=411134280, not open)
    connected to recovery catalog database
    
    RMAN> list restore point all;
    
    SCN              RSP Time  Type          Time         Name
    ---------------- --------- ----------   ---------     ----
    153050263689                GUARANTEED  17-NOV-14     AKSEPT_TEST_START
    

    Start the flashback operation:
    RMAN> FLASHBACK DATABASE TO RESTORE POINT AKSEPT_TEST_START;
    Starting flashback at 20-NOV-14
    allocated channel: ORA_SBT_TAPE_1
    channel ORA_SBT_TAPE_1: SID=585 device type=SBT_TAPE
    channel ORA_SBT_TAPE_1: Data Protection for Oracle: version 6.3.0.0
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=683 device type=DISK
    
    starting media recovery
    
    archived log for thread 1 with sequence 4651 is already on disk as file /u04/fast_recovery_area/TESTDB01/archivelog/2014_11_17/o1_mf_1_4651_b6m91zog_.arc
    media recovery complete, elapsed time: 00:00:01
    Finished flashback at 20-NOV-14 
    
    RMAN> ALTER DATABASE OPEN RESETLOGS;
    
    database opened
    new incarnation of database registered in recovery catalog
    starting full resync of recovery catalog
    full resync complete
    RMAN> exit
    


    -- all done --

    From alert.log:
    Completed: ALTER DATABASE   MOUNT
    Thu Nov 20 16:10:37 2014
    alter database recover datafile list clear
    Completed: alter database recover datafile list clear
    RMAN flashback database to before scn 153050263690 in incarnation 2
    Flashback Restore Start
    Thu Nov 20 16:12:11 2014
    Flashback Restore Complete
    Flashback Media Recovery Start
    started logmerger process
    Parallel Media Recovery started with 8 slaves
    Thu Nov 20 16:12:22 2014
    Flashback Media Recovery Log /u04/fast_recovery_area/TESTDB01/archivelog/2014_11_17/o1_mf_1_4651_b6m91zog_.arc
    Thu Nov 20 16:12:22 2014
    Incomplete Recovery applied until change 153050263690 time 11/17/2014 07:53:33
    Flashback Media Recovery Complete
    Completed: RMAN flashback database to before scn 153050263690 in incarnation 2
    Thu Nov 20 16:13:29 2014
    alter database open resetlogs
    RESETLOGS after incomplete recovery UNTIL CHANGE 153050263690
    Archived Log entry 4827 added for thread 1 sequence 4826 ID 0x18811648 dest 1:
    Archived Log entry 4828 added for thread 1 sequence 4823 ID 0x18811648 dest 1:
    Archived Log entry 4829 added for thread 1 sequence 4822 ID 0x18811648 dest 1:
    Archived Log entry 4830 added for thread 1 sequence 4827 ID 0x18811648 dest 1:
    Archived Log entry 4831 added for thread 1 sequence 4824 ID 0x18811648 dest 1:
    Archived Log entry 4832 added for thread 1 sequence 4825 ID 0x18811648 dest 1:
    Clearing online redo logfile 1 /u03/oradata/TESTDB01/redo01.log
    Clearing online log 1 of thread 1 sequence number 4826
    Clearing online redo logfile 1 complete
    Clearing online redo logfile 2 /u03/oradata/TESTDB01/redo02.log
    Clearing online log 2 of thread 1 sequence number 4823
    Clearing online redo logfile 2 complete
    Clearing online redo logfile 3 /u03/oradata/TESTDB01/redo03.log
    Clearing online log 3 of thread 1 sequence number 4822
    Thu Nov 20 16:13:41 2014
    Clearing online redo logfile 3 complete
    Clearing online redo logfile 4 /u03/oradata/TESTDB01/redo04.log
    Clearing online log 4 of thread 1 sequence number 4827
    Clearing online redo logfile 4 complete
    Clearing online redo logfile 5 /u03/oradata/TESTDB01/redo05.log
    Clearing online log 5 of thread 1 sequence number 4824
    Clearing online redo logfile 5 complete
    Clearing online redo logfile 6 /u03/oradata/TESTDB01/redo06.log
    Clearing online log 6 of thread 1 sequence number 4825
    Thu Nov 20 16:13:52 2014
    Clearing online redo logfile 6 complete
    Resetting resetlogs activation ID 411113032 (0x18811648)
    Online log /u03/oradata/TESTDB01/redo01.log: Thread 1 Group 1 was previously cleared
    Online log /u03/oradata/TESTDB01/redo02.log: Thread 1 Group 2 was previously cleared
    Online log /u03/oradata/TESTDB01/redo03.log: Thread 1 Group 3 was previously cleared
    Online log /u03/oradata/TESTDB01/redo04.log: Thread 1 Group 4 was previously cleared
    Online log /u03/oradata/TESTDB01/redo05.log: Thread 1 Group 5 was previously cleared
    Online log /u03/oradata/TESTDB01/redo06.log: Thread 1 Group 6 was previously cleared
    Thu Nov 20 16:13:52 2014
    Setting recovery target incarnation to 3