Minimalistic Oracle: Data Guard

Wednesday, May 29, 2024

How I removed a data guard config completely

Here is my config:

DGMGRL> show configuration;

Configuration - DGConfig1

  Protection Mode: MaxPerformance
  Members:
  kej01       - Primary database
    kej01_stby1 - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 66 seconds ago)

1. Stop the redo apply process:

DGMGRL>  edit database kej01_stby1 set state='APPLY-OFF';
Succeeded.

2. Remove the standby database from the configuration:

DGMGRL> remove database kej01_stby1;
Removed database "kej01_stby1" from the configuration

3. Remove the configuration itself:

DGMGRL> remove configuration;
Removed configuration
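
Steps 1 to 3 can be collected into one small script. The sketch below only prints the three DGMGRL commands for a given standby name; the helper name dg_removal_commands is made up for this example and is not an Oracle tool. Piping the result into dgmgrl assumes that OS authentication works on the host where you run it.

```shell
# Print the DGMGRL commands from steps 1 to 3 for a given standby name.
# dg_removal_commands is a hypothetical helper name, not part of Oracle.
dg_removal_commands() {
  standby=$1
  printf "edit database %s set state='APPLY-OFF';\n" "$standby"
  printf 'remove database %s;\n' "$standby"
  printf 'remove configuration;\n'
}

# Possible usage (assumption: OS authentication on the primary host):
#   dg_removal_commands kej01_stby1 | dgmgrl /
```

Printing the commands first makes it easy to review them before anything is removed.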

4. On both the primary and standby server, edit the $TNS_ADMIN/listener.ora by removing these entries:


(SID_DESC =
      (GLOBAL_DBNAME = kej01_DGMGRL)
      (ORACLE_HOME = /orasw/19c)
      (SID_NAME = kej01)
    )
   (SID_DESC =
      (GLOBAL_DBNAME = kej01_DGMGRL.skead.no)
      (ORACLE_HOME = /orasw/19c)
      (SID_NAME = kej01)
    )

Make sure to stop/start the listener afterwards.
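
To double-check that no broker entries were left behind, something like the following could be run on each server. It is a minimal sketch: count_dgmgrl_entries is a hypothetical helper name, and it simply counts the GLOBAL_DBNAME lines that still end in _DGMGRL (with or without a domain).

```shell
# Count the lines in a listener.ora that still reference a *_DGMGRL service.
# count_dgmgrl_entries is a hypothetical helper name.
count_dgmgrl_entries() {
  # grep -c prints 0 and returns non-zero when nothing matches, so ignore the status.
  grep -ci 'GLOBAL_DBNAME *= *[^ )]*_DGMGRL' "$1" || true
}

# Possible usage:
#   count_dgmgrl_entries "$TNS_ADMIN/listener.ora"
```

A result of 0 means the static broker registrations are gone from the file.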
5. On the previous standby, reset db_unique_name to the plain database name:

SYS@kej01>SQL>alter system set db_unique_name='kej01' scope=spfile;

6. On both servers, stop the broker processes:

SYS@kej01>SQL>alter system set dg_broker_start=false scope=both;

System altered.

6. On the standby, finish database recovery:

SYS@kej01>alter database recover managed standby database finish;

Database altered.

The database is still mounted as a physical standby:

SYS@kej01>SQL>select open_mode, database_role from v$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
MOUNTED              PHYSICAL STANDBY

If you open it now, it will be in READ ONLY status:

SYS@kej01>SQL>alter database open;

Database altered.

SYS@kej01>select open_mode, database_role from v$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ ONLY            PHYSICAL STANDBY

7. Instruct the former standby database that it is now indeed a normal ("primary") database:

SYS@kej01>ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;

Database altered.

The alert log will show that we are cutting all ties to the previous role:

2024-05-29T15:55:09.123174+02:00
ALTER DATABASE SWITCHOVER TO PRIMARY (kej01)
2024-05-29T15:55:09.125019+02:00
Maximum wait for role transition is 15 minutes.
TMI: kcv_commit_to_so_to_primary wait for MRP to finish BEGIN 2024-05-29 15:55:09.125651
TMI: kcv_commit_to_so_to_primary wait for MRP to finish END 2024-05-29 15:55:09.126990
TMI: kcv_commit_to_so_to_primary Switchover from physical BEGIN 2024-05-29 15:55:09.130097

Standby terminal recovery start SCN: 4695193
RESETLOGS after incomplete recovery UNTIL CHANGE 4695565 time 05/29/2024 15:28:15
ET  (PID:1614518): ORL pre-clearing operation disabled by switchover
Online log /log/oradata/kej01/redo1.log: Thread 1 Group 1 was previously cleared
Online log /log/oradata/kej01/redo2.log: Thread 1 Group 2 was previously cleared
Online log /log/oradata/kej01/redo3.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 4695192
2024-05-29T15:55:16.588617+02:00
Setting recovery target incarnation to 3
2024-05-29T15:55:16.662719+02:00
NET  (PID:1614518): Database role cleared from PHYSICAL STANDBY [kcvs.c:1133]
Switchover: Complete - Database mounted as primary
TMI: kcv_commit_to_so_to_primary Switchover from physical END 2024-05-29 15:55:16.667784
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN
2024-05-29T15:55:27.547369+02:00
alter database open

At this point, my standby database is once again mounted, but it has now assumed the PRIMARY role, which is what I want:

SYS@kej01>SQL>select open_mode, database_role from v$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
MOUNTED              PRIMARY

8. Open it now, and it will be in read write mode:

SYS@kej01>SQL>alter database open;

Database altered.

SYS@kej01>SQL>select open_mode, database_role from v$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ WRITE           PRIMARY

According to the Oracle documentation, the syntax above is pre-12.1 style. It is still supported, but DBAs are encouraged to use

ALTER DATABASE SWITCHOVER TO target_db_name [FORCE] [VERIFY];
and
ALTER DATABASE FAILOVER TO target_db_name;

instead.

Wednesday, January 11, 2023

More on setting redo routes property when adding a terminal standby database to your data guard broker configuration

As described in an earlier post, it is possible to set up a "terminal standby database" which fetches its redo information from another standby database, rather than directly from the primary database.

It is quite fascinating to see how the Data Guard broker simplifies this setup for the DBA.

In a recent exercise at work, I had a Data Guard configuration consisting of 1 primary and 3 physical standby databases.

A fourth was to be added as a terminal standby database.

First, clone the new database for standby.

You can use any of the databases in the configuration as the source for the terminal standby database: either the primary or any of the mounted physical standby databases.

When the clone is finished, add the new database to the broker config:

dgmgrl -echo sys/password@primdb_dgmgrl.oric.no as sysdba

Welcome to DGMGRL, type "help" for information.
Connected to "primdb"
Connected as SYSDBA.

Add the new database:

DGMGRL> add database 'tstby' as connect identifier is tstby.oric.no maintained as physical;
add database 'tstby' as connect identifier is tstby.oric.no maintained as physical;
Database "tstby" added
DGMGRL> show configuration;
show configuration;

Configuration - dgconfig1

  Protection Mode: MaxPerformance
  Members:
  primdb     - Primary database
    stby1     - Physical standby database
    stby2     - Physical standby database
    stby3     - Physical standby database
    tstby    - Physical standby database (disabled)

Fast-Start Failover: DISABLED

Notice how all the standby databases are indented directly underneath the primary database, indicating that they receive their redo information directly from the primary database.

I then add the redoroutes property to the primary:

DGMGRL> edit database primdb set property redoroutes='(LOCAL:stby1,stby2,stby3 ASYNC)(stby1:tstby ASYNC)';
edit database primdb set property redoroutes='(LOCAL:stby1,stby2,stby3 ASYNC)(stby1:tstby ASYNC)';
Property "redoroutes" updated

Add the redoroutes property for when the primary and the chosen standby switch roles:

DGMGRL> edit database stby1 set property redoroutes='(LOCAL:primdb,stby2,stby3 ASYNC)(primdb:tstby ASYNC)';
edit database stby1 set property redoroutes='(LOCAL:primdb,stby2,stby3 ASYNC)(primdb:tstby ASYNC)';
Property "redoroutes" updated

Note that both of these rules must be set, otherwise your terminal standby database will not receive logs. You will see messages like the following:

Configuration - dgconfig1

  Protection Mode: MaxPerformance
  Members:
  primdb     - Primary database
    stby1  - Physical standby database
    stby2  - Physical standby database
    stby3  - Physical standby database

  Members Not Receiving Redo:
  tstby  - Physical standby database
    Error: ORA-16685: database does not receive redo data
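
Since the two rules mirror each other, they can be generated together so that neither is forgotten. This is only a sketch: redoroutes_pair is a hypothetical helper, and it assumes ASYNC transport everywhere, as in the commands above.

```shell
# Print the pair of RedoRoutes commands for a primary, the standby that
# cascades redo, the other direct standbys and the terminal standby.
# redoroutes_pair is a hypothetical helper; all names are passed as arguments.
redoroutes_pair() {
  primary=$1; cascader=$2; others=$3; terminal=$4
  # ${others:+,$others} adds the comma only when there are other standbys.
  printf "edit database %s set property redoroutes='(LOCAL:%s%s ASYNC)(%s:%s ASYNC)';\n" \
    "$primary" "$cascader" "${others:+,$others}" "$cascader" "$terminal"
  printf "edit database %s set property redoroutes='(LOCAL:%s%s ASYNC)(%s:%s ASYNC)';\n" \
    "$cascader" "$primary" "${others:+,$others}" "$primary" "$terminal"
}

# Possible usage, with the names from this post:
#   redoroutes_pair primdb stby1 stby2,stby3 tstby
```

The output can be pasted into a DGMGRL session after review.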

View the configuration again.
You will see that the broker has understood that tstby is acting as a terminal standby database for the physical standby database "stby1":

DGMGRL> show configuration;
show configuration;

Configuration - dgconfig1

  Protection Mode: MaxPerformance
  Members:
  primdb     - Primary database
    stby1  - Physical standby database
      tstby  - Physical standby database (disabled)
    stby2 - Physical standby database
    stby3 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 27 seconds ago)

Finally, enable the database:

DGMGRL> enable database 'tstby';
enable database 'tstby';
Enabled.

The output you should see at the end is:

DGMGRL> show configuration;
show configuration;

Configuration - dgconfig1

  Protection Mode: MaxPerformance
  Members:
  primdb     - Primary database
    stby1  - Physical standby database
      tstby  - Physical standby database (receiving current redo)
    stby2 - Physical standby database
    stby3 - Physical standby database

Tuesday, December 6, 2022

SWITCHOVER VERIFY WARNING: switchover target has no standby database defined in LOG_ARCHIVE_DEST_n parameter.

When performing a switchover verification from the primary database in your Data Guard setup, you may see the following:

sqlplus / as sysdba

SQL>  ALTER DATABASE SWITCHOVER TO stb verify;
 ALTER DATABASE SWITCHOVER TO stb verify
*
ERROR at line 1:
ORA-16475: succeeded with warnings, check alert log for more details

Alert log reports:
2022-12-06T09:56:34.020025+01:00
ALTER DATABASE SWITCHOVER TO stb verify
2022-12-06T09:56:34.192599+01:00
SWITCHOVER VERIFY: Send VERIFY request to switchover target STB
SWITCHOVER VERIFY WARNING: switchover target has no standby database defined in LOG_ARCHIVE_DEST_n parameter.
If the switchover target is converted to a primary database, the new primary database will not be protected.
ORA-16475 signalled during: ALTER DATABASE SWITCHOVER TO stb verify...

Solution:
Update the standby database log_archive_dest_n parameter, to prepare it for a future primary role.

In the standby database, update one of the log_archive_dest_n parameters. I picked the next available one from the list, log_archive_dest_2:

alter system set log_archive_dest_2='service=primary.oric.no LGWR ASYNC VALID_FOR=(ONLINE_LOGFILE,PRIMARY_ROLE) DB_UNIQUE_NAME=primary';

Run the verification again:

SQL>  ALTER DATABASE SWITCHOVER TO stb verify;

Database altered.

Check the alert log and it will confirm that the database stb can now be turned into a primary database:

2022-12-06T10:03:34.605309+01:00
ALTER DATABASE SWITCHOVER TO stb verify
2022-12-06T10:03:34.773710+01:00
SWITCHOVER VERIFY: Send VERIFY request to switchover target STB
SWITCHOVER VERIFY COMPLETE: READY FOR SWITCHOVER
Completed: ALTER DATABASE SWITCHOVER TO stb verify
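
Because the details only appear in the alert log, a small check like the one below could save some scrolling. It is a sketch under assumptions: switchover_verify_status is a hypothetical helper, and it relies only on the three message texts shown in the alert log excerpts above.

```shell
# Report the outcome of the most recent SWITCHOVER VERIFY found in an alert log.
# switchover_verify_status is a hypothetical helper name.
switchover_verify_status() {
  awk '
    /SWITCHOVER VERIFY: Send VERIFY request/ { status = "UNKNOWN" }  # a new verify run starts
    /SWITCHOVER VERIFY WARNING/              { status = "WARNING" }
    /SWITCHOVER VERIFY COMPLETE/             { status = "READY" }
    END { print (status == "" ? "NOT FOUND" : status) }
  ' "$1"
}

# Possible usage (the path is an example, adjust it to your diagnostic destination):
#   switchover_verify_status /path/to/alert_<SID>.log
```

Resetting the status at every "Send VERIFY request" line means an old warning does not mask a later successful run.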

Thursday, April 15, 2021

How to set up "RedoRoutes" in a Data Guard Broker configuration

In this example, the following members participate in my Data Guard Configuration:

Database Name  Role                        Open Mode             Function
pksprod        Primary                     OPEN                  Primary database
pks_stb        Cascading Physical Standby  MOUNTED               Used for failover
pks_ro         Active Data Guard           READ ONLY WITH APPLY  Used for reporting
pks_tstb       Terminal Physical Standby   MOUNTED               Used for migration to a new geographical location

I am using the concept of a "Terminal Standby Database" to move the database from one geographical location to another.
In order for the cascading database to send its redo log stream to the terminal standby database, I had to configure the Data Guard Broker attribute "redoroutes", like this:

edit database "pksprod" set property redoroutes='(LOCAL : pks_stb, pks_ro ASYNC) (pks_stb : pks_tstb ASYNC)';
edit database "pks_stb" set property redoroutes =' (LOCAL :pksprod ASYNC, pks_ro ASYNC)(pksprod : pks_tstb ASYNC)';

which means

* When pksprod is primary, it shall send redo to pks_stb and pks_ro, while pks_stb shall send its redo to pks_tstb
* When pks_stb is primary, it shall send redo to pksprod and pks_ro, while pksprod shall send its redo to pks_tstb
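
The same reading can be spelled out mechanically for any RedoRoutes value. The sketch below reflects my understanding of the rule format (a LOCAL rule applies when the database itself is primary, a named source applies when that database is primary); explain_redoroutes is a hypothetical helper, not a broker command.

```shell
# Turn a RedoRoutes value into one readable sentence per rule.
# explain_redoroutes is a hypothetical helper name.
explain_redoroutes() {
  # Put each "(source : destinations)" rule on its own line, then split it at the colon.
  printf '%s\n' "$1" | tr ')' '\n' | sed -e 's/^[ (]*//' -e '/^$/d' |
  while IFS=: read -r src dests; do
    src=$(printf '%s' "$src" | tr -d ' ')
    dests=$(printf '%s' "$dests" | sed -e 's/^ *//' -e 's/ *$//')
    if [ "$src" = "LOCAL" ]; then
      printf 'when this database is primary, it ships redo to: %s\n' "$dests"
    else
      printf 'when %s is primary, this database forwards redo to: %s\n' "$src" "$dests"
    fi
  done
}

# Possible usage, with the value set on pksprod:
#   explain_redoroutes '(LOCAL : pks_stb, pks_ro ASYNC) (pks_stb : pks_tstb ASYNC)'
```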

When done, check the outcome like this:

DGMGRL> show database "pksprod" redoroutes
  RedoRoutes = '(LOCAL : pks_stb,pks_ro ASYNC)(pks_stb : pks_tstb ASYNC)'
DGMGRL> show database "pks_stb" redoroutes
  RedoRoutes = '(LOCAL : pksprod ASYNC, pks_ro ASYNC)(pksprod : pks_tstb ASYNC)'

The 12.2 documentation for the RedoRoutes attribute can be found here. You should familiarize yourself with how you can set up the redoroutes to suit your needs. In my case it was the only way I was able to get the DG configuration to work the way I intended. It was set up using the 12.2 version of the Oracle database software.

Thursday, March 25, 2021

How to view a specific property for a database using dgmgrl

DGMGRL> show database 'PROD_STB' 'DbFileNameConvert';

Friday, January 29, 2021

How to manually register missing logsequences on a standby database

I have previously documented how to identify gaps in your standby database's log sequence. See these posts:

How to check if your physical standby database is applying logs or not
How to find out if your standby database lags behind the primary database
How to find the last archivelog received and applied in a standby database

After a successful rescue operation of my standby database, I had a 3-day lag behind the primary. It's easy to identify the affected logs using the Data Guard broker command below:

show database "prod_stby" RecvQEntries

Output from this command was (abbreviated):

STANDBY_RECEIVE_QUEUE
              STATUS     RESETLOGS_ID  THREAD   LOG_SEQ       TIME_GENERATED       TIME_COMPLETED        FIRST_CHANGE#         NEXT_CHANGE#       SIZE (KBs)
         NOT_APPLIED        894886266  1    372460  01/28/2021 16:46:01  01/28/2021 16:47:35        6196037493227        6196037506094            20840
         NOT_APPLIED        894886266  1    372462  01/28/2021 17:02:35  01/28/2021 18:31:30        6196037647447        6196038652945          1350187
         NOT_APPLIED        894886266  1    372463  01/28/2021 18:31:30  01/28/2021 19:09:06        6196038652945        6196039875468          1051704
         .
         .
         .
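
With a long queue it can help to list the sequence numbers that are absent between consecutive rows. This is a rough sketch, assuming a single redo thread and rows sorted by LOG_SEQ as in the listing above; missing_sequences is a hypothetical helper that reads the saved output on standard input.

```shell
# Print log sequence numbers that are missing between consecutive NOT_APPLIED rows.
# Assumes one redo thread and rows sorted by LOG_SEQ (column 4).
# missing_sequences is a hypothetical helper name.
missing_sequences() {
  awk '$1 == "NOT_APPLIED" {
         seq = $4 + 0
         if (prev != "" && seq > prev + 1)
           for (i = prev + 1; i < seq; i++) print i
         prev = seq
       }'
}

# Possible usage (recvq.txt is an example file holding the RecvQEntries output):
#   missing_sequences < recvq.txt
```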

The logfiles were physically present in the Fast Recovery Area:

cd /fra/PROD_STBY/archivelog
find . -name "*372108*"
./2021_01_25/o1_mf_1_372108_j0x8j1fc_.arc

There are two ways to inform the standby database about the presence of the logfile:

1. RMAN.
On the standby database:

rman target /

Verify that the standby database does not recognize the archivelog:

list archivelog sequence between 372106 and 372107;

using target database control file instead of recovery catalog
List of Archived Log Copies for database with db_unique_name PROD_STBY
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
371883  1    372106  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372106_j0x6x0cj_.arc

The output above confirms that sequence 372106 exists, and that sequence 372107 does not.

To catalog the missing file:

RMAN> catalog start with '/fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc';

using target database control file instead of recovery catalog
searching for all files that match the pattern /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

List of Files Unknown to the Database
=====================================
File Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

Confirm again, and you'll see that the new file is registered:

RMAN> list archivelog sequence between 372106 and 372107;

List of Archived Log Copies for database with db_unique_name PROD_STBY
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
371883  1    372106  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372106_j0x6x0cj_.arc

371956  1    372107  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

If the number of logfiles missing is large, use a shortcut to register them all:

RMAN> catalog start with '/fra/PROD_STBY/archivelog/2021_01_25';

The above command will register all logfiles in the directory /fra/PROD_STBY/archivelog/2021_01_25

2. sqlplus:

SQL> alter database register logfile '/fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc';

If you tail the alert log of the database, you'll see that the standby database quickly picks up the missing logfiles.
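
For the sqlplus route with many files, the statements can be generated rather than typed. A minimal sketch, assuming the archived logs end in .arc as in the examples above; register_statements is a hypothetical helper name.

```shell
# Print one "alter database register logfile" statement per *.arc file in a directory.
# register_statements is a hypothetical helper name.
register_statements() {
  for f in "$1"/*.arc; do
    [ -e "$f" ] || continue   # no .arc files: the glob stays unexpanded, so skip it
    printf "alter database register logfile '%s';\n" "$f"
  done
}

# Possible usage: write the statements to a file, review it, then run it in sqlplus.
#   register_statements /fra/PROD_STBY/archivelog/2021_01_25 > register_logs.sql
```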

Thursday, September 3, 2020

Potential solution to dgmgrl error ORA-16665: time out waiting for the result from a member

After having added a terminal standby database to an existing configuration, the Data Guard Broker configuration seemed unhappy with communicating with the new member. The output from "show configuration" showed the following:

DGMGRL> show configuration;

Configuration - DB01

  Protection Mode: MaxPerformance
  Members:
  DB01      - Primary database
    DB01_STB  - Physical standby database
      DB01_TSTB - Physical standby database (receiving current redo)
        Error: ORA-16665: time out waiting for the result from a member

    DB01_RO   - Physical standby database

When looking at the details by using

show database verbose "DB01_TSTB"

the entire operation would take very long, and at the end, the following message is displayed:

Database Status:
DGM-17016: failed to retrieve status for database "DB01_TSTB"
ORA-16665: time out waiting for the result from a member

The broker log file showed:

09/02/2020 15:08:52
Data Guard Broker Status Summary:
  Type                        Name                            Severity  Status
  Configuration               DB01                            Warning  ORA-16607
  Primary Database            DB01                            Success  ORA-0
  Physical Standby Database   DB01_STB                        Success  ORA-0
  Physical Standby Database   DB01_RO                         Success  ORA-0
  Physical Standby Database   DB01_TSTB                       Error  ORA-16665

The root cause here was firewalls. The terminal standby database could not reach the primary database. Although the terminal standby database isn't set up to receive redo data from the primary database directly, in a broker configuration all members must be able to communicate with each other. A good tool for troubleshooting issues dealing with ports and firewalls is nmap. I installed it on the terminal standby server and issued:

[root@db04_server ~]# nmap -n -p 1511 db01_sever.oric.no

Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-02 14:23 CEST
Nmap scan report for db01_sever.oric.no (xxx.xxx.xxx.xxx)
Host is up (0.016s latency).
PORT     STATE    SERVICE
1511/tcp filtered 3l-l1

Nmap done: 1 IP address (1 host up) scanned in 0.49 seconds
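
When several members and ports need checking, the STATE column of the nmap output above can be pulled out with a one-line filter. A small sketch; nmap_port_state is a hypothetical helper that reads nmap output on standard input.

```shell
# Print the state column (open, closed, filtered, ...) for one TCP port
# from nmap output read on standard input.
# nmap_port_state is a hypothetical helper name.
nmap_port_state() {
  awk -v port="$1/tcp" '$1 == port { print $2 }'
}

# Possible usage, with the port and host from this post:
#   nmap -n -p 1511 db01_sever.oric.no | nmap_port_state 1511
```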

A filtered port means that it is not possible to determine whether the port is open or closed, most often due to firewalls along the way. Further checks in the firewall log files showed

action=Drop service=1511 dst=xxx.xxx.xxx.xxx src=yyy.yyy.yyy.yyy

where xxx.xxx.xxx.xxx matched the IP address of the terminal standby server, while yyy.yyy.yyy.yyy matched the IP address of the primary server. The network admin opened the port, and the ORA-16665 immediately disappeared from the dgmgrl output.