从Oracle到PostgreSQL:最全控制文件(下)
PostgreSQL控制文件恢复测试
测试过程如下(基于PostgreSQL 11.2) :
1. 新建测试数据, 用到with oids的表, 因为OID无法确定, 看看是否会有异常
postgres=# create table lsl_oid1(id int primary key) with oids;
CREATE TABLE
postgres=# insert into lsl_oid1 select generate_series(1,100000);
INSERT 0 100000
postgres=# select min(oid),max(oid) from lsl_oid1 ;
min | max
-------+--------
16400 | 116399
(1 row)
2. 关闭数据,记录下控制文件信息
[postgres@lsl-test1 bin]$ /usr/pgsql-11/bin/pg_ctl stop -D /pg/pg11/data
waiting for server to shut down.... done
server stopped
## 记下pg_controldata信息, 方便修复后进行比对
[postgres@lsl-test1 bin]$ /usr/pgsql-11/bin/pg_controldata -D /pg/pg11/data
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6691945724594983959
Database cluster state: shut down
pg_control last modified: Thu 30 May 2019 05:26:41 PM CST
Latest checkpoint location: 0/79E9888
Latest checkpoint's REDO location: 0/79E9888
Latest checkpoint's REDO WAL file: 000000010000000000000007
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:1048585
Latest checkpoint's NextOID: 116400
Latest checkpoint's NextMultiXactId: 65536
Latest checkpoint's NextMultiOffset: 52352
Latest checkpoint's oldestXID: 561
Latest checkpoint's oldestXID's DB: 13878
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 13878
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Thu 30 May 2019 05:26:40 PM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 0000000000000000000000000000000000000000000000000000000000000000
3. 模拟控制文件故障,直接删除控制文件
[postgres@lsl-test1 global]$ pwd
/pg/pg11/data/global
[postgres@lsl-test1 global]$ rm -rf pg_control
4. 启动数据库,模拟控制文件丢失场景
[postgres@lsl-test1 global]$ /usr/pgsql-11/bin/pg_ctl start
waiting for server to start....postgres: could not find the database system
Expected to find it in the directory "/pg/pg11/data",
but could not open file "/pg/pg11/data/global/pg_control": No such file or directory
stopped waiting
pg_ctl: could not start server
Examine the log output.
下面开始正式重建控制文件,使得数据库可以正常启动。
5. 首先创建一个名为pg_control的空文件
[postgres@lsl-test1 global]$ touch $PGDATA/global/pg_control
6. 使用pg_resetwal修复pg_control
确认pg_resetwal参数值。
首先确认-c参数,上面参数详细分析发现目录下为空,因此暂时忽略此参数。
-e参数是设置下一个事务号的新纪元,我们测试环境没有其它复制系统因此也可以忽略。
-l参数,查看pg_wal下文件文件名,大于文件名最大值即可。
[postgres@lsl-test1 ~]$ cd /pg/pg11/data/pg_wal[postgres@lsl-test1 pg_wal]$ ls -ltotal 32768-rw-------. 1 postgres postgres 16777216 May 30 17:26 000000010000000000000007-rw-------. 1 postgres postgres 16777216 May 30 17:26 000000010000000000000008drwx------. 2 postgres postgres 6 May 17 17:22 archive_status之类要大于最大值,因此我们可以去-l=000000010000000000000009-m参数取pg_multixact/offsets目录下最大值加1然后乘以65536(0x10000)和最小值 乘以65536(0x10000)[postgres@lsl-test1 pg_commit_ts]$ cd ../pg_multixact/offsets/[postgres@lsl-test1 offsets]$ ls -ltotal 8-rwx------. 1 postgres postgres 8192 May 17 18:04 0000
因此-m可以取0x10000,0x00000。
-o参数不确定时,由于测试没有复制软件因此可以暂时忽略。
-O 查找pg_multixact/members目录下数值最大的文件名,+1乘以52352 (0xCC80)。
[postgres@lsl-test1 offsets]$ cd ../members/
[postgres@lsl-test1 members]$ ls -l
total 8
-rwx------. 1 postgres postgres 8192 May 17 17:22 0000
因此-O=0xCC80。
-x参数查找pg_xact目录下可以查看数值最大的文件名,+1乘以 1048576 (0x100000)。
[postgres@lsl-test1 members]$ cd ../../pg_xact/
[postgres@lsl-test1 pg_xact]$ ls -l
total 16
-rwx------. 1 postgres postgres 8192 May 17 18:04 0000
-rw-------. 1 postgres postgres 8192 May 30 17:26 0001
因此-x=0x200000。
不加-f参数时可以查看要写入控制文件中的参数内容。
[postgres@lsl-test1 pg_xact]$ /usr/pgsql-11/bin/pg_resetwal -l 000000010000000000000009 -m 0x10000,0x00000 -O 0xCC80 -x 0x200000 -D /pg/pg11/data
pg_resetwal: oldest multitransaction ID (-m) must not be 0
[postgres@lsl-test1 pg_xact]$ /usr/pgsql-11/bin/pg_resetwal -l 000000010000000000000009 -m 0x10000,0x00001 -O 0xCC80 -x 0x200000 -D /pg/pg11/data
pg_resetwal: pg_control exists but is broken or wrong version; ignoring it
Guessed pg_control values:
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6696828635748080009
Latest checkpoint's TimeLineID: 1
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID: 0:3
Latest checkpoint's NextOID: 10000
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 3
Latest checkpoint's oldestXID's DB: 0
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 0
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0Values to be changed:
First log segment after reset: 000000010000000000000009
NextMultiXactId: 65536
OldestMultiXid: 1
OldestMulti's DB: 0
NextMultiOffset: 52352
NextXID: 2097152
OldestXID: 2297064448
OldestXID's DB: 0
If these values seem acceptable, use -f to force reset.
[postgres@lsl-test1 pg_xact]$ cd ../global/
[postgres@lsl-test1 global]$ ls -l pg_control
-rw-r--r--. 1 postgres postgres 0 May 30 17:36 pg_control
7. 启动数据库
确认控制文件参数无误后加上-f会写入到控制文件里。
[postgres@lsl-test1 global]$ /usr/pgsql-11/bin/pg_resetwal -l 000000010000000000000009 -m 0x10000,0x00001 -O 0xCC80 -x 0x200000 -D /pg/pg11/data -f
pg_resetwal: pg_control exists but is broken or wrong version; ignoring it
Write-ahead log reset
启动数据库。
[postgres@lsl-test1 global]$ /usr/pgsql-11/bin/pg_ctl start -D /pg/pg11/data/
waiting for server to start....2019-05-30 22:42:50.946 CST [2471] LOG: listening on IPv6 address "::1", port 5432
2019-05-30 22:42:50.946 CST [2471] LOG: listening on IPv4 address "127.0.0.1", port 5432
2019-05-30 22:42:50.949 CST [2471] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2019-05-30 22:42:50.961 CST [2471] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-05-30 22:42:50.987 CST [2471] LOG: redirecting log output to logging collector process
2019-05-30 22:42:50.987 CST [2471] HINT: Future log output will appear in directory "log".
done
server started
8. 查看测试数据是否正常,然后插入新数据看数据库是否可用。
[postgres@lsl-test1 global]$ psql
psql (11.2)
Type "help" for help.
postgres=# select min(oid),max(oid),count(*) from lsl_oid1 ;
min | max | count
-------+--------+--------
16400 | 116399 | 100000
(1 row)
postgres=# insert into lsl_oid1 select generate_series(100001,200000);
INSERT 0 100000
postgres=# select min(oid),max(oid),count(*) from lsl_oid1 ;
min | max | count
-------+--------+--------
16384 | 116399 | 200000
(1 row)
数据库可以正常访问。
至此,本文关于Oracle和PostgreSQL的控制文件内容介绍到这里。