推陈出新:12C 推进 SCN 新方法实践
简介:
黄廷忠,云和恩墨西区交付总监,超过10年专职电信行业 Oracle 数据库管理和运维经验,熟悉电信行业的业务、数据库及硬件架构。擅长数据库各种迁移方法、优化、疑难故障排除、数据库异常恢复等。
在数据库异常恢复中,经常需要修改数据库的 SCN 值,在 12C 之前,我们常用的方法有如下几个:
oradebug poke 直接修改内存中的值;
event 10015 来增加 scn 的值;
_minimum_giga_scn 来增加 scn 的值;
gdb/dbx 来直接修改内存中的值;
修改控制文件来修改 scn 的值;
修改数据文件头来修改 scn 的值;
adjust_scn 来增加scn。
在这几种方法中,2、3和4方法在2012年中,Oracle 已经通过补丁更新,导致方法失效,就只有1,4,5,6方法。在这几种方法中,我们常常使用方法1来修改 scn 的值,但是此方法在 12C 中,Oracle 也已经屏蔽,还可以继续使用4,5,6 的方法来实现修改 SCN。在12.2 中,Oracle 多了一个新的 EVENT 21307096 也可以实现增加 SCN 的值。
下面在自己的测试环境模拟一下 EVENT 21307096,请勿在生产环境中操作;如果需要在生产环境中操作,提前做好备份。其它的方法,请 GOOGLE 都可以找到方法。
1EVENT 21307096 描述
下面描述来至 Oracle 官方文档
The SCN delta in million units is with the range of values from 1 to 4095 which increases
the scn by:
lowest_scn + event level * 1000000
Example: if the lowest datafile checkpoint scn in the database is 990396
and the highest is 992660 then SCN delta is 1; given by (992660 - 990396) / 1000000
event="21307096 trace name context forever, level 1
Here are some tests in 12.1.0.2 using each level for alter database open resetlogs:
level 1 Elapsed: 00:01:02.35
level 2 Elapsed: 00:02:16.23
level 6 Elapsed: 00:06:08.05
In general: based on a 16k per second scn rate (16K/sec) , the open resetlogs time
would be at least (event level * 1000000 / 16000) seconds. Then level 1 would be at least
62+ seconds and level 4095 would be 71+ hours !.
2测试修改 SCN 的值
2.1 查看当前 SCN 的值
这里查询当前 SCN 的值,主要用于与后面修改后的值做对比。
SQL> startup
ORACLE instance started.
Total System Global Area 1157627904 bytes
Fixed Size 8619936 bytes
Variable Size 436209760 bytes
Database Buffers 704643072 bytes
Redo Buffers 8155136 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
4422688
2.2 修改参数文件
通过 PFILE 文件来增加 EVENT。
SQL> !cat /tmp/1234.ora
event="21307096 trace name context forever, level 3"
2.3 修改后数据库 SCN 的值
数据库使用的 shutdown immedaite 关闭的,
SQL> startup mount pfile='/tmp/1234.ora';
ORACLE instance started.
Total System Global Area 1157627904 bytes
Fixed Size 8619936 bytes
Variable Size 436209760 bytes
Database Buffers 704643072 bytes
Redo Buffers 8155136 bytes
Database mounted.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 4422836 generated at 03/18/2017 07:58:31 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/product/12.2.0/dbhome_1/dbs/arch1_1_938937035.dbf
ORA-00280: change 4422836 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
Database altered.
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
7423640 看到 SCN 的值增加了 300W。
2.4 查看 alert 日志内容
查看 ALERT 日志,可以发现数据库在 OPEN 过程中,消耗了几分钟的时间。此时间可以通过 EVENT 描述中的计算公式来计算。
Completed: ALTER DATABASE MOUNT
2017-03-18T07:59:40.236101+08:00
ALTER DATABASE RECOVER database using backup controlfile until cancel
2017-03-18T07:59:40.236186+08:00
Media Recovery Start
2017-03-18T07:59:40.236823+08:00
Serial Media Recovery started
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup controlfile until cancel ...
2017-03-18T07:59:42.034160+08:00
ALTER DATABASE RECOVER CANCEL
2017-03-18T07:59:42.040794+08:00
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
2017-03-18T07:59:48.188727+08:00
alter database open resetlogs
2017-03-18T07:59:48.189864+08:00
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 4422836 time
Clearing online redo logfile 1 /oracle/app/oracle/oradata/htz/redo01.log
Clearing online log 1 of thread 1 sequence number 1
Clearing online redo logfile 1 complete
Resetting resetlogs activation ID 1993705771 (0x76d5892b)
Online log /oracle/app/oracle/oradata/htz/redo01.log: Thread 1 Group 1 was previously cleared
Online log /oracle/app/oracle/oradata/htz/redo02.log: Thread 1 Group 2 was previously cleared
Online log /oracle/app/oracle/oradata/htz/redo03.log: Thread 1 Group 3 was previously cleared
2017-03-18T07:59:48.690166+08:00
Setting recovery target incarnation to 4
2017-03-18T07:59:48.693973+08:00
Ping without log force is disabled:
instance mounted in exclusive mode. 会在这里 HANG 住,根据 LEVEL 的值不同,这个 HANG 的时候也不同。目前这里 LEVEL 值是多少,在这里消耗的时间就是多少。
2017-03-18T08:02:51.712840+08:00
Endian type of dictionary set to little
2017-03-18T08:02:51.716445+08:00
Assigning activation ID 1993737045 (0x76d60355)
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: /oracle/app/oracle/oradata/htz/redo01.log
Successful open of redo thread 1
2017-03-18T08:02:51.734620+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
2017-03-18T08:02:51.758900+08:00
TT00: Gap Manager starting (PID:9735)
2017-03-18T08:02:51.846445+08:00
Sleep 5 seconds and then try to clear SRLs in 2 time(s)
2017-03-18T08:02:51.887673+08:00
[9714] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:5652874 end:5652906 diff:32 ms (0.0 seconds)
Dictionary check beginning
Dictionary check complete
Verifying minimum file header compatibility for tablespace encryption..
Verifying file header compatibility for tablespace encryption completed for pdb 0
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process AQPC
2017-03-18T08:02:52.207519+08:00
AQPC started with pid=38, OS id=9743
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Starting background process CJQ0
Completed: alter database open resetlogs
2017-03-18T08:02:52.618086+08:00
CJQ0 started with pid=42, OS id=9751
2017-03-18T08:02:54.893495+08:00
Shared IO Pool defaulting to 64MB. Trying to get it from Buffer Cache for process 9698.
===========================================================
Dumping current patch information
===========================================================
No patches have been applied
===========================================================