Showing posts with label RMAN. Show all posts
Showing posts with label RMAN. Show all posts

Tuesday, October 3, 2023

What to do if your flash recovery area is filled with "foreign archive logs"?

One of my databases came to a halt when the flash recovery area filled completely.

It wasn't a problem with space preassure, but merely a condition that made the flash recovery area *seem* full, when it really wasn't.

To clear the archiver stuck condition is easy enough, simply increase the value of db_recovery_file_dest_size, but how to fix the source of the problem?

My flash recovery area was, at the time, filled up with foreign archive logs:
SYS@CDB$ROOT SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE                             0                         0               0          0
REDO LOG                                 0                         0               0          0
ARCHIVED LOG                         14.04                         0             235          0
BACKUP PIECE                           .06                         0              44          0
IMAGE COPY                               0                         0               0          0
FLASHBACK LOG                            0                         0               0          0
FOREIGN ARCHIVED LOG                  85.8                         0            1558          0
AUXILIARY DATAFILE COPY                  0                         0               0          0

8 rows selected.
To fix this, log on to the database using rman:
rman target / nocatalog
Here you can list the foreign archivelogs:
list foreign archivelogs all;
They were all from may 2023, and they have ended up here since the database was cloned using storage snapshots.

To clear them out of the flash recovery area, you need to first crosscheck them:
RMAN> crosscheck foreign archivelog all;
Then, delete them:
RMAN> delete noprompt foreign archivelog all ;
The crosscheck is important. Without it, Oracle cannot delete the entries from the controlfile, which means that tha v$flash_recovery_area will still be reported as full.
The flash recovery area is not full anymore:
SYS@CDB$ROOT SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE                             0                         0               0          0
REDO LOG                                 0                         0               0          0
ARCHIVED LOG                         18.21                         0             280          0
BACKUP PIECE                           .06                         0              44          0
IMAGE COPY                               0                         0               0          0
FLASHBACK LOG                            0                         0               0          0
FOREIGN ARCHIVED LOG                     0                         0               0          0
AUXILIARY DATAFILE COPY                  0                         0               0          0

8 rows selected.

Documented in Doc ID 1617965.1 "What commands may be used to remove foreign archivelog files?" at Oracle Support.

Thursday, February 3, 2022

Observation: rman saves files according to end date of backup

I noticed the following: An archivelog file backed up at 1.01.2022 23:57:18 will NOT be saved in the folder for 31.01.2022. Instead, it will be saved in the folder for 01.02.2022.
Output from RMAN:
 list archivelog from time '31.01.2022' until time '01.02.2022';
 
Output (excerpt)
450706  1    450701  A 31.01.2022 23:56:46
        Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450701__vwojnvs6_.arc

450707  1    450702  A 31.01.2022 23:57:16
        Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450702__vwokkx0p_.arc

450708  1    450703  A 31.01.2022 23:57:18
        Name: /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc

The file /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc has the timestamp Feb 1 00:05
So in this case, the last file generated on 31.01 actually ended up in the folder for files generated on the 01.02

Tuesday, April 13, 2021

How to solve "RMAN-04014: startup failed: ORA-01261: Parameter db_recovery_file_dest destination string cannot be translated" during cloning

During an attempt to clone for active database I had created pfile with my desired value for the parameter db_recovery_file_dest, which was /fradata. Excerpt from init.ora used to start the instance (nomount):
cat initproddb01.ora |grep db_recovery_file_dest

*.db_recovery_file_dest='/fradata'
This directory exists on my auxilliary server and is indeed writable by the oracle software installation owner (for most installation, this user is called "oracle") After having started the instance, and executed the clone script:
DUPLICATE TARGET DATABASE
FOR STANDBY
FROM ACTIVE DATABASE
DORECOVER
SPFILE
SET DB_UNIQUE_NAME='STBY01' COMMENT 'Stanby db 1'
USING COMPRESSED BACKUPSET
NOFILENAMECHECK;
This message appears:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 04/13/2021 08:07:31
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-04014: startup failed: ORA-01261: Parameter db_recovery_file_dest destination string cannot be translated
ORA-01262: Stat failed on a file destination directory
Linux-x86_64 Error: 2: No such file or directory
Cause: The spfile that is created contains the target's db_recovery_file_dest setting, not the auxiliary's:
strings spfileproddb01.ora | grep db_recovery_file_dest
*.db_recovery_file_dest='/FRA' <--- this is inherited from the target database, and doesn't exist on the auxiliary server
Solution: Add a new spfile directive to your clone script:
SET DB_RECOVERY_FILE_DEST='/fradata'
and rerun.

Wednesday, December 16, 2020

How to fix error SYS.DBMS_BACKUP_RESTORE version 18.11.00.00 in TARGET database is not current

Even if you're not using a recovery catalog, you may encounter this issue if you have recently patched your database. Simply by connecting to the target database as sysdba, you can verify that you need to update your internal packages:
oracle@myserver.mydomain.com:[proddb01]# rman target / nocatalog

Recovery Manager: Release 18.0.0.0.0 - Production on Wed Dec 16 13:51:43 2020
Version 18.12.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

PL/SQL package SYS.DBMS_BACKUP_RESTORE version 18.11.00.00 in TARGET database is not current
PL/SQL package SYS.DBMS_RCVMAN version 18.11.00.00 in TARGET database is not current
connected to target database: PRODDB01 (DBID=1234567890)
using target database control file instead of recovery catalog
Solution: Follow Doc ID 888818.1: "Rman 06190 Connecting to target database after upgrade" In short, connect to the database as sysdba, and run the following scripts:
SQL> @$ORACLE_HOME/rdbms/admin/dbmsrman.sql
SQL> @$ORACLE_HOME/rdbms/admin/prvtrmns.plb   
SQL> @$ORACLE_HOME/rdbms/admin/dbmsbkrs.sql 
SQL> @$ORACLE_HOME/rdbms/admin/prvtbkrs.plb

Tuesday, July 21, 2020

How to fix RMAN-04006 when cloning from active database




Background:
During an attempt to clone database using active duplication, I received an error when testing my connectivity to the target and auxiliary server. I normally do a testrun first, to see if I can connect without problems, before I start my cloning script.

My script to test the connection is simple and called "testconnection.cmd".
The connections are made through a wallet, so that you do not have to expose your passwords in any scripts or any shell prompts. See my previous post for details about setting up a wallet.
connect target /@proddb01.oric.no
connect auxiliary /@testdb01.oric.no
run{
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate auxiliary channel aux1 type disk;
allocate auxiliary channel aux2 type disk;
configure device type disk parallelism 2;
}
exit

Execute it as follows:
rman cmdfile=testconnection.cmd

connected to target database: prodb01 (DBID=2078894010, not open)
connected to auxiliary database (not started)

RMAN-04006: error from auxiliary database: ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist

Cause:
Incorrect password in the auxiliary database

Solution:
Copy the password file from your target database server to your auxiliary database server and retry the operation.

When all is good, the output from the auxiliary database should be
connected to auxiliary database: testdb01 (not mounted)

Friday, June 19, 2020

How to run RMAN with debug information


Tested with Oracle 12.2

For example, to trace a duplicate database session, I have put the following into a file called "run_duplication.cmd":

spool trace to run_duplication.trc
spool log to run_duplication.log
set echo on;
debug on;


connect target sys/password@sourcedb
connect auxiliary sys/password@auxdb

run{
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate auxiliary channel aux1 type disk;
allocate auxiliary channel aux2 type disk;
configure device type disk parallelism 2;
DUPLICATE DATABASE "sourcedb.mydomain.com" TO auxdb
FROM ACTIVE DATABASE
TABLESPACE TOOLS
USING COMPRESSED BACKUPSET;
}
exit

When you execute your rman script:

rman cmdfile='run_duplication.cmd'

your log and trace files will be generated in your working directory.

Friday, April 3, 2020

How to work around hung rman process when duplicating from active database


Applicable for Oracle 12.1.

During a "clone from active database" operation, I had a seemingly stuck restore process. It used a lot of time on restoring the controlfiles.

My clonescript was simple enough:
connect target /@mysrcdb
connect auxiliary /@myauxdb
run{
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate auxiliary channel aux1 type disk;
allocate auxiliary channel aux2 type disk;
configure device type disk parallelism 2;
debug io;
DUPLICATE TARGET DATABASE TO myauxdb
FROM ACTIVE DATABASE
;
debug off;
}
exit

I noticed that the logfile wouldn't move beoynd this point:
DBGANY:     No untilscn in effect [17:44:29.012] (krmkicat)
RMAN-08016: channel aux1: starting datafile backup set restore
RMAN-08169: channel aux1: using network backup set from service mysrcdb
RMAN-08021: channel aux1: restoring control file

Solution:
On the axiliary server, set the parameter
DISABLE_OOB=on

in your $TNS_ADMIN/sqlnet.ora, and reexecute the duplicate-command again.

This is caused by a bug mentioned in Doc ID 2073604.1: "RMAN active duplicate hanging on restore control file" found on support.oracle.com

Monday, August 5, 2019

How to work around RMAN-08120: warning: archived log not deleted, not yet applied by standby





Problem scenario:

* You have a primary database with two standby databases.
* One of them is functional, the other one is about to be rebuilt.
* Archived redo logs are piling up on the primary.

You have already set the log_archive_dest_n to DEFERRED on the primary site, and this will prevent logs from being shipped to the standby database that is about to be rebuilt.

The reason for this is explained in Doc ID 1380368.1 on My Oracle Support: Oracle will, by default, also consider deferred locations, before deleting archive logs from the primary.

Solution:
This can be changed by setting a hidden parameter, _deferred_log_dest_is_valid, from TRUE to FALSE.

To verify that this worked as intended, I listed a couple of the archivelogs on the primary:

RMAN> list archivelog sequence between 110880 and 110881;

List of Archived Log Copies for database with db_unique_name PRODDB01
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
173550  1    110880  A 19-JUL-19
        Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc

173551  1    110881  A 19-JUL-19
        Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc


Now try to delete them from the primary server:
RMAN> delete archivelog sequence between 110880 and 110881;

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=78 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=723 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=14 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=295 device type=DISK
RMAN-08120: warning: archived log not deleted, not yet applied by standby
archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc thread=1 sequence=110880
RMAN-08120: warning: archived log not deleted, not yet applied by standby
archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc thread=1 sequence=110881
Oracle doesn't allow you to delete them, since they haven't been applied to all standby destinations.

To work around this, set the hidden parameter:
alter system set "_deferred_log_dest_is_valid" = FALSE scope=both;

Try deleting them once more shows that Oracle now allows you to do so:
RMAN>  delete archivelog sequence between 110880 and 110881;

List of Archived Log Copies for database with db_unique_name proddb01
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
173550  1    110880  A 19-JUL-19
        Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc

173551  1    110881  A 19-JUL-19
        Name: /fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc


Do you really want to delete the above objects (enter YES or NO)? YES
deleted archived log
archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110880__2rnqx83t_.arc RECID=173550 STAMP=1014076575
Deleted 1 objects

deleted archived log
archived log file name=/fra/proddb01/archivelog/2019_07_19/o1_mf_1_110881__2rrrp5ml_.arc RECID=173551 STAMP=1014076710
Deleted 1 objects

Use
delete noprompt archivelog sequence between 110880 and 110881;
to avoid having to confirm the delete operation.

Thursday, August 1, 2019

How to avoid RMAN-10015 when using sql-directives in an RMAN script



When attempting to use an sql directive in RMAN, you will sometimes be instructed to use the following syntax:
sql "alter session set events ' '1110 trace name errorstack level 3' '";

This will throw an error upon execution:
RMAN-03009: failure of sql command on default channel at 08/01/2019 11:20:02
RMAN-10015: error compiling PL/SQL program

Solution: do not use white spaces between the single quotes.
This will work:

sql "alter session set events ''1110 trace name errorstack level 3'' ";

Wednesday, February 6, 2019

A workaround for RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file after a clone from active database



My clone from active database failed at the very end of the procedure with
RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file
This happened after media recovery was complete, and after the oracle instance was started, as can be seen from the log file below:
RMAN-08181: media recovery complete, elapsed time: 00:00:21
RMAN-03091: Finished recover at 06-FEB-19
RMAN-08031: released channel: c1
RMAN-08031: released channel: c2
RMAN-08031: released channel: aux1
RMAN-08031: released channel: aux2
RMAN-06196: Oracle instance started

Total System Global Area   22749904896 bytes

Fixed Size                     8632928 bytes
Variable Size               3758097824 bytes
Database Buffers           18924699648 bytes
Redo Buffers                  58474496 bytes

RMAN-08161: contents of Memory Script:
{
   sql clone "alter system set  db_name =
 ''TESTDB01'' comment=
 ''Reset to original value by RMAN'' scope=spfile";
   sql clone "alter system reset  db_unique_name scope=spfile";
}
RMAN-08162: executing Memory Script

RMAN-06162: sql statement: alter system set  db_name =  ''TESTDB01'' comment= ''Reset to original value by RMAN'' scope=spfile

RMAN-06162: sql statement: alter system reset  db_unique_name scope=spfile
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 02/06/2019 04:39:51
RMAN-05501: aborting duplication of target database
RMAN-04014: startup failed: ORA-27069: attempt to do I/O beyond the range of the file
Additional information: 51
Additional information: 1
Additional information: 48

This is a bummer, espescially if you've cloned a very large database and have waited for a long time for it to finish.
There is very little information about this error, but a workaround to save the freshly cloned auxiliary database, is to recreate the controlfile from script

The workaround is outlined in one of my older posts, but basically it boils down to the following steps:

1. Go from the source database, generate a "create controlfile" script
2. Transfer it to the auxiliary server
3. Edit it to suit the auxiliary database data files
4. Remove the controlfile files from the auxiliary server
5. Startup the instance in nomount mode and recreate the controlfile
6. Open the database with the resetlog option

Monday, November 19, 2018

How to use rman to create a backup-based clone on a remote server


It has been a while since I needed to use this technique, since we rely on storage snapshot clones these days. Sometimes though, a good old-fashioned rman clone based on backups is the only way to solve a problem.
Since I always avoid working on the production server during cloning, most steps are done logged onto the auxiliary server.

The method I used is based on "Backup-Based Duplication Without a Target Database and Recovery Catalog Connection"
Here is what I did:

1. On the production server, backup of the source database using RMAN:
rman target / nocatalog
backup database plus archivelog;

From this point and onwards, everything is done while logged onto the destination server.

2. Get the files from the source server:
ssh testserver
cd /u05
mkdir bup
cd bup
scp -r prodserver:/u05/flash_recovery_area/PRODDB01/2018_11_16/* .

3. Prepare the auxiliar pfile. Note that the parameters I use exceeds the actual required parameters.
But since the auxiliary instance will replace an already existing database instance, which is already tuned and has the correct memory parameters, I choose to include them. Notice also the db_file_name_convert and log_file_name_convert parameters. They control where rman will place the files during the restore process.
cd $ORACLE_HOME/dbs
vi inittest1.ora

Add the following:

*.db_name='test1'
*.db_unique_name='test1'
*.audit_file_dest='/u01/oracle/admin/test1/adump'
*.audit_trail='DB'
*.compatible='12.2.0'
*.control_files='/u02/oradata/test1/control01.ctl','/u04/fra/test1/control02.ctl'
*.db_block_size=8192
*.db_file_name_convert='proddb01','test1'
*.log_file_name_convert='proddb01','test1'
*.db_recovery_file_dest='/u05/flash_recovery_area/test1'
*.db_recovery_file_dest_size=1000G
*.diagnostic_dest='/u01/oracle'
*.nls_language='NORWEGIAN'
*.nls_territory='NORWAY'
*.open_cursors=300
*.optimizer_adaptive_plans=FALSE
*.optimizer_dynamic_sampling=0
*.optimizer_mode='ALL_ROWS'
*.pga_aggregate_target=7222M
*.processes=1500
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=21696M
*.shared_pool_size=2624M
*.streams_pool_size=256M
*.undo_tablespace='UNDOTBS1'
*.diagnostic_dest=/u01/oracle

4. I prefer using an spfile over a pfile. Therefore I take the time here to create an spfile:
sqlplus / as sysdba
startup nomount pfile=inittest1.ora
create spfile from pfile;

5. Startup the auxiliary instance in nomount-mode using the spfile:
shutdown abort
startup nomount

6. Start the duplication process:
rman auxiliary /
duplicate database to test1
backup location '/u05/fra/bup';

RMAN went through the normal restore, name switching and recovery phases. Finally, the database was opened with the resetlog option.

Thursday, November 19, 2015

How to handle RMAN DUPLICATE: Errors in krbm_getDupCopy during duplicate

I recently cloned a 11.2.0.4 database using tape backups as source.

The clone failed right before "open resetlogs". In other words, the database was restored and recovered, but DUPLICATE didn't succeed in opening it. So I did it manually, by recreating the controlfile using a template from the source database, and adjusted for the file layout on the auxiliary. The database opened nicely.

However, later during the day, the application team found that a new clone was needed due to some synch issue caused by a third-party replication tool. In other words, the entire database had to be copied from the latest version of our incremental level 1 backups and restored all over again.

No problem.

As usual, I started by saving away the spfile, password file and the block change tracking file located in $ORACLE_HOME/dbs. Following that, I mounted and dropped the database. All the files were automatically removed from disk by Oracle, and the instance terminated, as expected.

As I started to clone, I noticed some new messages in the alert log of the auxiliary database:

RMAN DUPLICATE: Errors in krbm_getDupCopy
Errors in file /u01/oracle/product/11204/admin/testdb01/diag/rdbms/testdb01/testdb01/trace/testdb01_ora_22544486.trc:
ORA-19625: error identifying file /u02/oradata/testdb01/system01.dbf
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3

This error was repeated for every data file in the database and appended to the same trace file as in the message above.

I found the errors worth checking.

Fortunately, these errors can be ignored. MOS Doc ID 1476641.1 "RMAN DUPLICATE: Errors In Krbm_getDupCopy found in alert.log" explains

"This happens after a previous failed duplicate trial, if the files copied to auxiliary destination have been deleted."


and


"The messages in alert.log indicate that rman cannot use a previous datafile copy for those files and that the files must be copied again to auxiliary destination. Duplicate checks if there are datafile copies already created by a duplicate to avoid restoring the files again.


So this is a part of Oracle restore optimization concept:

"...from 11.2 RMAN always checks if the file copy exists at destination hosts to avoid copying it again"


Great, but how does RMAN do that?

"If duplicate fails between the first restore/copy and the zeroing of the dbid, a second duplicate will find the _rm_dup_@.dat file and the clone default channel will read it into memory, once determined the name of the datafilecopy to be created by the second duplicate it is compared with the existing datafilecopy from the previous duplicate. If the datafilecopy still exists and matches the vital information of the datafile (file number, database id, creation scn, database name) and its checkpoint scn is behind the until scn then the datafilecopy can be used by this new duplicate and restore/copy isnot necessary"


The solution is simple:

"As files have already been deleted from auxiliary destination, ignore those messages."

I could also have followed the following advice:

"If you don't want to see those messages in alert.log but datafiles have already been deleted, on Auxiliary host, delete the file $ORACLE_HOME/dbs/_rm_dup_.dat where dup_db is the name of the clone instance."


Tuesday, July 21, 2015

How to solve ORA-17628: Oracle error 19505 returned by remote Oracle server during clone from active database

After some time, my RMAN "duplicate from active database" script threw the error below:

RMAN-03009: failure of backup command on c1 channel at 07/21/2015 12:13:15
ORA-17627: ORA-12577: Message 12577 not found;  product=RDBMS; facility=ORA
RMAN-12019: continuing other job steps, job failed will not be re-run

When I checked the alert log, it was clear

IBM AIX RISC System/6000 Error: 28: No space left on device

Solution: obvious.

How to solve ORA-17627: ORA-12154: TNS:could not resolve the connect identifier specified during clone from active database

During an attempt to clone from active database, the following error was thrown:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 07/21/2015 10:06:15
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of backup command on c1 channel at 07/21/2015 10:06:15
ORA-17629: Cannot connect to the remote database server
ORA-17627: ORA-12154: TNS:could not resolve the connect identifier specified
ORA-17629: Cannot connect to the remote database server

This seems pretty obvious, since the error stack states that there is a connection problem, but it's easy to overlook if you run your command from the destination server, and like in my case, checked that all connections work, that the passwords are identical etc etc.

The solution is simply to add the source and the destination connect descriptor in the $TNS_ADMIN/tnsnames.ora files on both the source and auxiliary server.


See my other article on active database cloning for a more detailed description of the required setup.

Tuesday, February 17, 2015

How cloning from backup became a little more secure in Oracle 11g

From version 11, Oracle supplied another method to use during cloning, namely "Backup-based duplication without a target connection".

From the Oracle 11g Documentation:

RMAN can perform the duplication in any of the following supported modes:

1.       Active duplication
2.       Backup-based duplication without a target connection
3.       Backup-based duplication with a target connection
4.       Backup-based duplication without connection to target and recovery catalog

If you choose method number 2, you need to use the following syntax:

DUPLICATE DATABASE <target database> TO <auxiliary database>;

In the 10g documentation you will need to use the "target" keyword
DUPLICATE TARGET DATABASE TO <auxiliary database>;

Oracle points out:
«This mode is useful when the target database is not available or a connection to it is not desirable». 

In other words, cloning has become more secure since version 11g, where we can totally avoid connecting to the target database, which is often in production, during cloning from backup.

An example of an incident where a connection to the target could potentially jeopardize production, is when you are scripting jobs for RMAN and accidently issue "shutdown", which will bring down your target database and not your auxiliary database, which was what you intended.

For the record, the keyword "target" is an unfortunate choice of syntax during cloning. In IT, a "target" would generally be interpreted as a synonym for "destination", which is where you want to clone *TO*.

Wednesday, January 21, 2015

How to relocate the block change tracking file

To relocate the block change tracking file you have two options:

1) shutdown database, mount database, update control file, open database
sqlplus / as sysdba
shutdown immediate
exit
 -- Move the block change tracking file to the new location using the appropriate os utility. --
sqlplus / as sysdba
startup mount
ALTER DATABASE RENAME FILE 'ora_home/dbs/change_trk.f' TO '/new_disk/change_trk.f'; 
ALTER DATABASE OPEN;

OR

2) disable and re-enable block change tracking, and point to the new location when re-enabling.
ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE 'new_location';

See Section "4.4.4.3 Moving the Change Tracking File" in the Oracle Documentation regarding this feature.

Monday, December 15, 2014

How to solve "ORA-06502: PL/SQL: numeric or value error" when cloning a database

While attempting to clone an Oracle 11gR2 production database, Oracle returned "ORA-06502: PL/SQL: numeric or value error".

The clone proceeded nicely up until RMAN attempted to set the db_unique_name. After this point, the clone failed, see error message output below:
Oracle instance started

Total System Global Area    8551575552 bytes

Fixed Size                     2263488 bytes
Variable Size               1107297856 bytes
Database Buffers            7415529472 bytes
Redo Buffers                  26484736 bytes

contents of Memory Script:
{
   sql clone "alter system set  db_name =
   ''testdb01'' comment=
   ''Reset to original value by RMAN'' scope=spfile";
   sql clone "alter system reset  db_unique_name scope=spfile";
   shutdown clone immediate;
}

executing Memory Script

sql statement: alter system set  db_name =  ''testdb01'' comment= ''Reset to original value by RMAN'' scope=spfile

sql statement: alter system reset  db_unique_name scope=spfile

Oracle instance shut down


RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 12/11/2014 13:20:45
RMAN-05501: aborting duplication of target database
RMAN-06136: ORACLE error from auxiliary database: ORA-06502: PL/SQL: numeric or value error
The trace file also adds the following line to the previous lines:
ORA-06512: at "SYS.X$DBMS_BACKUP_RESTORE", line 5567

The culprit seems to be that my production database and the auxiliary database had at some point in the past been part of a Data Guard setup.
By looking at the registered databases in the recovery catalog, this could indeed be verified:

rman target / catalog uid/pwd@rmancat 

Recovery Manager: Release 11.2.0.4.0 - Production on Mon Dec 15 11:33:55 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
 
connected to target database: proddb01 (DBID=3046916437)
connected to recovery catalog database

RMAN> set echo on;
2> list db_unique_name of database;
4> exit
echo set on
List of Databases

DB Key  DB Name  DB ID            Database Role    Db_unique_name
------- ------- ----------------- ---------------  ------------------
133048143 proddb01     3046916437       PRIMARY          proddb01
133048143 proddb01     3046916437       STANDBY          testdb01

Solution:
In this case, this old association to an obsolete standby database could be removed altogether.
I therefore unregistered the unique database from the recovery catalog:
RMAN> unregister db_unique_name 'testdb01';

database db_unique_name is "testdb01", db_name is "proddb01" and DBID is 3046916437

Want to unregister the database with target db_unique_name (enter YES or NO)? yes
database with db_unique_name testdb01 unregistered from the recovery catalog

After this was done, I restarted the clone, and the error was no longer thrown and the script continued as expected.

If you are in a slightly different position, and have several primary databases with the same unique name, you need a different approach. Consider the following example:
RMAN> list db_unique_name of database;

List of Databases

DB Key  DB Name  DB ID            Database Role    Db_unique_name
------- ------- ----------------- ---------------  ------------------
376254740 PDB01    3895530761       PRIMARY          PDB01
380374509 PDB01    3902464629       PRIMARY          PDB01
383100641 PDB01    3906391689       PRIMARY          PDB01

RMAN> SET DBID 3895530761;

executing command: SET DBID
database name is "PDB01" and DBID is 3895530761

RMAN> UNREGISTER DB_UNIQUE_NAME PDB01;

database db_unique_name is "PDB01", db_name is "PDB01" and DBID is 3895530761

Want to unregister the database with target db_unique_name (enter YES or NO)? yes

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20244: can not change currently connected database db_unique_name
RMAN> exit

Solution is to still explicitly set DBID, but then execute "unregister database" instead, like this:

RMAN> unregister database;

database name is "PDB01" and DBID is 3895530761

Do you really want to unregister the database (enter YES or NO)? yes

database unregistered from the recovery catalog

RMAN> SET DBID 3902464629;
executing command: SET DBID
database name is "PDB01" and DBID is 3902464629 

RMAN> unregister database;
database name is "PDB01" and DBID is 3902464629

Do you really want to unregister the database (enter YES or NO)? YES

RMAN> list db_unique_name of database;

List of Databases

DB Key  DB Name  DB ID            Database Role    Db_unique_name

------- ------- ----------------- ---------------  ------------------
383100641 PDB01    3906391689       PRIMARY          PDB01

We are left with only one unique db name, which is what we want.

Thursday, November 20, 2014

How to flash the database back to a guaranteed restore point

After a week of testing their new release, my customer wanted their database reset to the state it was previously in.
Earlier this week, I created a restore point of type GUARANTEED FLASHBACK.

Here is how I restored a database using flashback database technology. It was, as expected, very quick. Only a few seconds to flash back 14 GB of changes.

Connect to Recovery Manager:
oracle@myserver:[TESTDB01]# rman target / catalog uid/pwd@rmancat

Recovery Manager: Release 11.2.0.4.0 - Production on Thu Nov 20 16:06:44 2014

connected to target database: TESTDB01 (DBID=411134280, not open)
connected to recovery catalog database

RMAN> list restore point all;

SCN              RSP Time  Type          Time         Name
---------------- --------- ----------   ---------     ----
153050263689                GUARANTEED  17-NOV-14     AKSEPT_TEST_START

Start the flashback operation:
RMAN> FLASHBACK DATABASE TO RESTORE POINT AKSEPT_TEST_START;
Starting flashback at 20-NOV-14
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=585 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Data Protection for Oracle: version 6.3.0.0
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=683 device type=DISK

starting media recovery

archived log for thread 1 with sequence 4651 is already on disk as file /u04/fast_recovery_area/TESTDB01/archivelog/2014_11_17/o1_mf_1_4651_b6m91zog_.arc
media recovery complete, elapsed time: 00:00:01
Finished flashback at 20-NOV-14 

RMAN> ALTER DATABASE OPEN RESETLOGS;

database opened
new incarnation of database registered in recovery catalog
starting full resync of recovery catalog
full resync complete
RMAN> exit


-- all done --

From alert.log:
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 16:10:37 2014
alter database recover datafile list clear
Completed: alter database recover datafile list clear
RMAN flashback database to before scn 153050263690 in incarnation 2
Flashback Restore Start
Thu Nov 20 16:12:11 2014
Flashback Restore Complete
Flashback Media Recovery Start
started logmerger process
Parallel Media Recovery started with 8 slaves
Thu Nov 20 16:12:22 2014
Flashback Media Recovery Log /u04/fast_recovery_area/TESTDB01/archivelog/2014_11_17/o1_mf_1_4651_b6m91zog_.arc
Thu Nov 20 16:12:22 2014
Incomplete Recovery applied until change 153050263690 time 11/17/2014 07:53:33
Flashback Media Recovery Complete
Completed: RMAN flashback database to before scn 153050263690 in incarnation 2
Thu Nov 20 16:13:29 2014
alter database open resetlogs
RESETLOGS after incomplete recovery UNTIL CHANGE 153050263690
Archived Log entry 4827 added for thread 1 sequence 4826 ID 0x18811648 dest 1:
Archived Log entry 4828 added for thread 1 sequence 4823 ID 0x18811648 dest 1:
Archived Log entry 4829 added for thread 1 sequence 4822 ID 0x18811648 dest 1:
Archived Log entry 4830 added for thread 1 sequence 4827 ID 0x18811648 dest 1:
Archived Log entry 4831 added for thread 1 sequence 4824 ID 0x18811648 dest 1:
Archived Log entry 4832 added for thread 1 sequence 4825 ID 0x18811648 dest 1:
Clearing online redo logfile 1 /u03/oradata/TESTDB01/redo01.log
Clearing online log 1 of thread 1 sequence number 4826
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u03/oradata/TESTDB01/redo02.log
Clearing online log 2 of thread 1 sequence number 4823
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /u03/oradata/TESTDB01/redo03.log
Clearing online log 3 of thread 1 sequence number 4822
Thu Nov 20 16:13:41 2014
Clearing online redo logfile 3 complete
Clearing online redo logfile 4 /u03/oradata/TESTDB01/redo04.log
Clearing online log 4 of thread 1 sequence number 4827
Clearing online redo logfile 4 complete
Clearing online redo logfile 5 /u03/oradata/TESTDB01/redo05.log
Clearing online log 5 of thread 1 sequence number 4824
Clearing online redo logfile 5 complete
Clearing online redo logfile 6 /u03/oradata/TESTDB01/redo06.log
Clearing online log 6 of thread 1 sequence number 4825
Thu Nov 20 16:13:52 2014
Clearing online redo logfile 6 complete
Resetting resetlogs activation ID 411113032 (0x18811648)
Online log /u03/oradata/TESTDB01/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u03/oradata/TESTDB01/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u03/oradata/TESTDB01/redo03.log: Thread 1 Group 3 was previously cleared
Online log /u03/oradata/TESTDB01/redo04.log: Thread 1 Group 4 was previously cleared
Online log /u03/oradata/TESTDB01/redo05.log: Thread 1 Group 5 was previously cleared
Online log /u03/oradata/TESTDB01/redo06.log: Thread 1 Group 6 was previously cleared
Thu Nov 20 16:13:52 2014
Setting recovery target incarnation to 3

Tuesday, October 21, 2014

How to view stored scripts in RMAN

Three different ways to print scripts stored in your recovery catalog:

list all script names;
list global script names;
print script my_script;

Friday, August 1, 2014

Why are there multiple rows in V$SESSION_LONGOPS when running an RMAN backup?

Answer:
When performing an RMAN backup, V$SESSION_LONGOPS will be populated with two types of rows:

Detailed rows: Captures the progress of individual job steps.

Aggregated rows: Captures and summarizes the progress for the individual step of the backup job, as they complete. They are updated only after each detailed step has finished.

Example:
Below I am querying the V$SESSION_LONGOPS after an incremental backup level 0 has started, using 6 channels. Notice that there are both individual rows and an aggregated row present:
SELECT SID,SERIAL#,OPNAME,TARGET_DESC,TOTALWORK,SOFAR,UNITS,START_TIME,TIME_REMAINING,ELAPSED_SECONDS,MESSAGE
FROM V$SESSION_LONGOPS 
WHERE  TOTALWORK <> 0
AND SOFAR <> TOTALWORK
ORDER BY SERIAL# ASC;

SID SERIAL# OPNAME TARGET_DESC TOTALWORK SOFAR UNITS START_TIME TIME_REMAINING ELAPSED_SECONDS MESSAGE
203
7409
RMAN: incremental datafile backup Set Count
4096000
1004926
Blocks 01.08.2014 12:28:28
557
181
RMAN: incremental datafile backup: Set Count 137085: 1004926 out of 4096000 Blocks done
398
15079
RMAN: incremental datafile backup Set Count
3018368
1122302
Blocks 01.08.2014 12:28:28
306
181
RMAN: incremental datafile backup: Set Count 137088: 1122302 out of 3018368 Blocks done
439
17451
RMAN: incremental datafile backup Set Count
2560000
1127038
Blocks 01.08.2014 12:28:29
229
180
RMAN: incremental datafile backup: Set Count 137089: 1127038 out of 2560000 Blocks done
357
31627
RMAN: incremental datafile backup Set Count
3044608
1115518
Blocks 01.08.2014 12:28:28
313
181
RMAN: incremental datafile backup: Set Count 137087: 1115518 out of 3044608 Blocks done
476
55467
RMAN: incremental datafile backup Set Count
2557440
940542
Blocks 01.08.2014 12:28:29
309
180
RMAN: incremental datafile backup: Set Count 137090: 940542 out of 2557440 Blocks done
512
55527
RMAN: aggregate input backup
76514816
4360436
Blocks 01.08.2014 12:28:28
2085
126
RMAN: aggregate input: backup 33: 4360436 out of 76514816 Blocks done
281
56085
RMAN: incremental datafile backup Set Count
3251200
969214
Blocks 01.08.2014 12:28:28
426
181
RMAN: incremental datafile backup: Set Count 137086: 969214 out of 3251200 Blocks done

Let's join V$SESSION_LONGOPS with V$SESSION and (if desirable) V$PROCESS, to view only the detailed rows, and estimate the progress in percent. This query is very useful to quickly get an overview of your backup, whether or not it is progressing according to expected speed etc:

SELECT S.CLIENT_INFO "Client Info", SL.OPNAME "Operation" ,SL.MESSAGE, SL.SID, SL.SERIAL#, P.SPID "OS Process ID", SL.SOFAR "So Far", SL.TOTALWORK "Totalwork", ROUND(SL.SOFAR/SL.TOTALWORK*100,2) "% complete"
FROM V$SESSION_LONGOPS SL INNER JOIN V$SESSION S ON SL.SID = S.SID 
                          INNER JOIN V$PROCESS P ON P.ADDR = S.PADDR
AND OPNAME LIKE 'RMAN%'
AND OPNAME NOT LIKE '%aggregate%'
AND TOTALWORK != 0
AND SOFAR <> TOTALWORK
/
Client Info Operation MESSAGE SID SERIAL# OS Process ID So Far Totalwork % complete
rman channel=tsm_channel_0 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137096: 1173118 out of 2252800 Blocks done
203
7409
42926864
1173118
2252800
52,07
rman channel=tsm_channel_1 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137095: 2029054 out of 2393600 Blocks done
281
56085
52101750
2029054
2393600
84,77
rman channel=tsm_channel_2 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137097: 293246 out of 2096640 Blocks done
357
31627
56361104
293246
2096640
13,99
rman channel=tsm_channel_3 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137099: 174206 out of 2048000 Blocks done
398
15079
52756692
174206
2048000
8,51
rman channel=tsm_channel_4 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137098: 460286 out of 2048000 Blocks done
439
17451
29032454
460286
2048000
22,47
rman channel=tsm_channel_5 RMAN: incremental datafile backup RMAN: incremental datafile backup: Set Count 137100: 196094 out of 2048000 Blocks done
476
55467
9700724
196094
2048000
9,57

Notice how the V$RMAN_STATUS only hold information on aggregated level (2 rows, one for each operation although the first spawns the other):
SELECT SID,OPERATION,STATUS,MBYTES_PROCESSED, START_TIME, END_TIME, OBJECT_TYPE, OUTPUT_DEVICE_TYPE 
FROM V$RMAN_STATUS WHERE STATUS = 'RUNNING';

SID OPERATION STATUS MBYTES_PROCESSED START_TIME OBJECT_TYPE OUTPUT_DEVICE_TYPE
512
BACKUP RUNNING
226641,953125
01.08.2014 12:28:28 DB INCR SBT_TAPE
512
RMAN RUNNING
0
01.08.2014 12:28:28