mercredi 30 décembre 2015

GoldenGate : ERROR OGG-01705 Input checkpoint position xxx is greater than then size of the file yyyy



A. Introduction :

Due to the crash of the server hosting the database, the Datapump extract was ABENDED.
In general, Datapump Extracts and Replicats read the current trail file data from the disk cache instead from the physical file when the read checkpoint is very
close to the current EOF of the trail file it reads. In other words, when it keeps up with the trail. There is therefore a chance that the Datapump or Replicat will
checkpoint an RBA which is still in cache. If there is a disk or system outage or for similar issues, the data in the cache may be lost. This occurs because the
system doesn't have a chance to flush the data to the disk.


B. Version & Status :



Version:


Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.4


Status:

EXTRACT RUNNING EXT_MGR 00:00:00 00:00:09
EXTRACT ABENDED EXT_PMP 66:31:22 02:12:53

in the ggserr.log, wa can find :
2015-12-29 11:33:27  ERROR   OGG-01705  Input checkpoint position 106570861 for input trail file '/apps/trail/DBPROD/ab00125' is greater than the size of the file (106570847).  Please consult Oracle Knowledge Management Doc ID 1138409.1. for instructions.


C. Solution:

I have used the Metalink note 1138409.1 and the option 2.
The goal is to compute the new RBA using the following:

New datapump / Replicat RBA =    (Reader's too-big checkpoint RBA) (Step 1)
        -  (Actual size of datapump / replicat trail file (seqno X)) (Step 2)
        +  First record in the new trail file (after the restart abend)  (Step 3)   
  
The size of the current trail and the next are :

-rw-rw-rw-    1 oracle   dba       106570847 Dec 27 01:14 /apps/trail/DBPROD/ab00125   << where the checkpoint is pointing
-rw-rw-rw-    1 oracle   dba       106570836 Dec 27 04:33 /apps/trail/DBPROD/ab00126  << the next available trail file.

Step 1 :Reader's too-big checkpoint RBA:

Step 1:
  We know the Reader's too-big checkpoint RBA (Step 1)  from the details of the datapump:
 
 GGSCI  >
   info EXT_PMP
   EXTRACT    EXT_PMP    Last Started 2015-12-27 13:52   Status ABENDED
   Checkpoint Lag       25:06:45 (updated 00:28:04 ago)
   Log Read Checkpoint  File /apps/trail/DBPROD/ab00125
        2015-12-26 12:46:07.000000  RBA 106570861


 Here the actual trail file size, End of File (EOF), is 106570847 where the replicat read checkpoint rba is at 106570861.
   
So  Reader's too-big checkpoint RBA = 106570861

Step 2:Actual size of datapump / replicat trail file (seqno X)

ls -l /apps/trail/DBPROD/ab00125:
-rw-rw-rw-    1 oracle   dba       106570847 Dec 27 01:14 /apps/trail/DBPROD/ab00125   << where the checkpoint is pointing
 So Actual size of datapump / replicat trail file = 106570847.

Step 3 : First record in the new trail file (after the restart abend)

Use the logdump utility and go to the next file :
./logdump

Logdump 61 >open /apps/trail/DBPROD/ab00126
Current LogTrail is /apps/trail/DBPROD/ab00126
Logdump 62 >ghdr on
Logdump 63 >ggstoken detail
Logdump 64 >detail data on
Logdump 65 >n



2015/12/27 01:15:56.628.828 FileHeader           Len  1075 RBA 0
Name: *FileHeader*
......
.....
Logdump 66 >n
___________________________________________________________________
Hdr-Ind : E (x45) Partition : . (x00)
UndoFlag : . (x00) BeforeAfter: A (x41)
...
..... RestartAbend         Len     0 RBA 1083
Name:
After  Image:   Partition 0   G  s
Logdump 67 >n
___________________________________________________________________
Hdr-Ind : E (x45) Partition : . (x04)
.....
....  FieldComp            Len   502 RBA 1144
Name: MY_SCHEMA.MY_TABLE
....
Logdump 68 >


So First record in the new trail file (after the restart abend)  = 1144



D : New RBA



New datapump / Replicat RBA = (Reader's too-big checkpoint RBA) - (Actual size of datapump / replicat trail file (seqno X)) + First record in the new trail file (after the restart abend)


1) Reader's too-big checkpoint RBA (A)                             = 106570861 --- step 1
2) Actual size of datapump / replicat trail file (seqno X)     = 106570847 --- step2
3) First record in the new trail file (after the restart abend)  = 1144 --- step3



New datapump / Replicat RBA = 106570861 - 106570847 + 1144 = 1158





E : Restart the Datapump


Ensure that you have a good record at sequence number 126 and RBA 1158 with TransInd x00 or x03.
If so we could have the datapump / replicat altered to trail file sequence number 126 and rba 1158 by using the following command :


Logdump 80 >open /apps/trail/DBPROD/ab00126
Logdump 81 >pos 1158
Reading forward from RBA 1158
Logdump 82 >n
2015/12/26 12:46:10.003.297 FieldComp            Len  1785 RBA 1158
Name: *FileHeader*
 .......
Logdump 82 >n
___________________________________________________________________
....
TransInd   :     .  (x03)     FormatType :     R  (x52)
...

ggsci>
alter extract EXT_PMP, extseqno 126, extrba 1158
Start EXT_PMP