Is Oracle GoldenGate really designed for batch processing or “large” transactions? I am not sure what the official Oracle position on this is, but I would hazard a guess and say probably not. That kind of workload may be better suited to an ETL type of product like Oracle Data Integrator.
GoldenGate considers a transaction to be “large” if it changes more than 15,100 rows (this threshold changed in version 12.2; it used to be 9,500 in earlier versions).
An important parameter, EAGER_SIZE, controls how GoldenGate applies these “large” transactions.
In essence, for Oracle GoldenGate it means: when I see a large number of LCRs in a transaction, do I start applying them straight away (that, I guess, is where the “eager” part of the parameter name comes from), or do I wait for the entire transaction to be committed and only then start applying the changes?
When a transaction crosses this threshold and is applied eagerly, the other apply servers spend much of their time waiting on dependencies behind it. This effectively serializes the apply process and adds to the apply lag on the target in a big way.
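As a side note, if you want to confirm the eager threshold actually in effect for your Replicat, the inbound (apply) server behind an Integrated Replicat exposes its parameters on the database side. A minimal sketch, assuming the parameters are visible through the standard DBA_APPLY_PARAMETERS view:

-- Check the EAGER_SIZE currently in effect for the inbound server(s)
select apply_name, parameter, value, set_by_user
from   dba_apply_parameters
where  parameter = 'EAGER_SIZE';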
We can see from test case 2 below that the apply lag more than doubled.
To illustrate this, let us run a series of tests using an Oracle GoldenGate 12.2 replication setup whose source and target environments are located over 3,000 km apart.
The test involves running a procedure which executes a series of INSERT and DELETE statements on a set of 10 tables. The load procedure generates 200 transactions which are executed in a 30 second period on the source database. These 200 transactions change in total over 2 million rows across the 10 tables.
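The original setup of the test tables is not shown in this post; a minimal sketch of what is assumed is ten identically structured copies of ALL_OBJECTS owned by SYSADM, pre-populated so that the first DELETE of each loop iteration also has rows to remove:

-- Assumed test table setup (illustrative only)
create table sysadm.myobjects1 as select * from all_objects where rownum < 10001;
create table sysadm.myobjects2 as select * from all_objects where rownum < 10001;
-- ... and so on up to sysadm.myobjects10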
Test 1) Maximum size of transaction is 10,000 rows
Test 2) Maximum size of transaction is 20,000 rows (EAGER_SIZE left at its default value)
Test 3) Maximum size of transaction is 20,000 rows (EAGER_SIZE increased to 25,000)
Apply Lag on the target database:
Test 1) ~ 20 seconds
Test 2) ~ 50 seconds
Test 3) ~ 20 seconds
Test 1
Note the maximum number of rows in a single transaction in this case is 10,000.
This is the code we are using in the procedure to generate the load test.
create or replace procedure sysadm.load_gen IS
BEGIN
  FOR i in 1 .. 10 LOOP
    delete sysadm.myobjects1;
    commit;
    delete sysadm.myobjects2;
    commit;
    ...
    delete sysadm.myobjects10;
    commit;
    insert into sysadm.myobjects1 select * from all_objects where rownum < 10001;
    commit;
    insert into sysadm.myobjects2 select * from all_objects where rownum < 10001;
    commit;
    ...
    insert into sysadm.myobjects10 select * from all_objects where rownum < 10001;
    commit;
  end loop;
END;
/
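To put the numbers together: each of the 10 loop iterations commits one DELETE and one INSERT per table, so 10 x 10 x 2 = 200 transactions, each touching roughly 10,000 rows, which is where the figure of over 2 million changed rows comes from. Kicking the load off on the source is a single call (illustrative):

SQL> exec sysadm.load_gen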
When we kick off the load procedure on the source in each of the three test cases, we see that all the Apply Servers remain idle for about 30 seconds.
So what is happening during this time?
• On the source database, the log mining server mines the redo log files and extracts changes in the form of Logical Change Records (LCRs), which are passed to the Extract process, which in turn writes them to the GoldenGate trail files.
• The trail files are sent by the Extract Pump over the network to the target.
• Once the trail files are received on the target server, the Replicat process reads them and constructs Logical Change Records.
• These LCRs are sent to the target database, where the inbound (apply) server starts various processes: the Receiver which receives the LCRs, the Preparer and Coordinator which sort the transactions and organize them in terms of primary and foreign key dependencies, and finally the Apply Server processes which apply the changes to the database.
Initially we see that the Apply Server has started 8 individual processes, because we set the PARALLELISM parameter to 8.
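For reference, here is a minimal sketch of the relevant Replicat setting and of the query used throughout this post to watch the apply servers (only the relevant line of the parameter file is shown):

DBOPTIONS INTEGRATEDPARAMS(PARALLELISM 8)

-- Monitor the apply servers of the Integrated Replicat on the target
select server_id, state, total_messages_applied
from   v$gg_apply_server
order  by server_id;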
 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         1 IDLE                                      0
         2 IDLE                                      0
         3 IDLE                                      0
         4 IDLE                                      0
         5 IDLE                                      0
         6 IDLE                                      0
         7 IDLE                                      0
         8 IDLE                                      0
Once the Apply Server detects additional load coming in, it will spawn additional processes on the fly. This is a big advantage of Integrated Replicat over Classic or Coordinated Replicat: it is load aware, so we do not have to manually allocate the number of Apply Servers or map an Apply Server to a table or set of target tables.
Note that after a few seconds the Apply Servers start applying the received changes, and a ninth Apply process has been added to the earlier eight.
SQL> /

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         9 INACTIVE                                  0
         1 IDLE                                  50005
         2 IDLE                                  20002
         3 IDLE                                  30003
         4 IDLE                                      0
         5 IDLE                                      0
         6 IDLE                                      0
         7 IDLE                                      0
         8 IDLE                                      0

9 rows selected.
From the view V$GG_APPLY_SERVER we can see the state ‘EXECUTE TRANSACTION’, which shows that the Apply Servers are applying transactions in parallel.
SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         9 INACTIVE                                  0
         1 IDLE                                 140014
         2 EXECUTE TRANSACTION                  302634
         3 IDLE                                 270027
         4 EXECUTE TRANSACTION                  182775
         5 IDLE                                  60006
         6 EXECUTE TRANSACTION                  130013
         7 IDLE                                      0
         8 IDLE                                      0

SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         9 INACTIVE                                  0
         0 IDLE                                      0
         1 EXECUTE TRANSACTION                  187834
         2 EXECUTE TRANSACTION                  487708
         3 IDLE                                 330033
         4 EXECUTE TRANSACTION                  537853
         5 EXECUTE TRANSACTION                  177838
         6 EXECUTE TRANSACTION                  267948
         7 IDLE                                      0
         8 IDLE                                      0
Finally we see that all the servers are idle, and TOTAL_MESSAGES_APPLIED adds up to about 2 million, which is roughly equal to the number of rows changed.
Also note that an additional (10th) Apply Server process was started while the changes were being applied to the target.
 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
        10 IDLE                                      0
         2 IDLE                                 360036
         3 IDLE                                 180018
         4 IDLE                                 280028
         9 INACTIVE                                  0
         5 IDLE                                 410041
         6 IDLE                                 200022
         1 IDLE                                 340034
         7 IDLE                                 220022
         8 IDLE                                  10001
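As a quick cross-check against the same view, the applied message counts can simply be summed and compared with the roughly 2 million rows changed by the load procedure:

select sum(total_messages_applied) as total_applied
from   v$gg_apply_server;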
Test 2
Now we run the same load test.
While the total volume of rows being changed remains the same, we have increased the number of rows in a single transaction to 20,000 (from the earlier 10,000).
So we change the procedure code as shown below, and we reduce the number of iterations in the loop from 10 to 5 to keep the volume of rows changed the same as before.
insert into sysadm.myobjects1 select * from all_objects where rownum < 10001; commit;
TO
insert into sysadm.myobjects1 select * from all_objects where rownum < 20001; commit;
Now we can see that at any given time only one Apply Server is in the EXECUTE TRANSACTION state; all the rest are idle, in the WAIT DEPENDENCY state, or sometimes in the WAIT FOR NEXT CHUNK state.
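A compact way to see this serialization at a glance, using the same view as before, is to count the apply servers in each state:

select state, count(*) as servers
from   v$gg_apply_server
group  by state;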
If we query the database performance views, or look at the Top Activity page or ASH Analytics in OEM as shown below, we will see the wait event REPL: Apply Dependency showing up.
The Apply Server processes of the Integrated Replicat RBSPRD1 are mainly responsible for that particular wait event.
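Outside of OEM, a similar picture can be pulled straight from ASH. A minimal sketch, assuming the Diagnostics Pack is licensed, and matching loosely on the event name rather than assuming its exact spelling:

-- Which replication-related wait events show up in recent ASH samples?
select event, count(*) as ash_samples
from   v$active_session_history
where  sample_time > systimestamp - interval '10' minute
and    event like 'REPL%'
group  by event
order  by ash_samples desc;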
SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         8 IDLE                                      0
         9 IDLE                                      0
        10 IDLE                                      0
         1 WAIT DEPENDENCY                      450026
         2 EXECUTE TRANSACTION                  229333
         3 WAIT DEPENDENCY                      460025
         4 IDLE                                 340017
         5 WAIT DEPENDENCY                      220012
         6 IDLE                                      0
         7 IDLE                                      0

SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         8 IDLE                                      0
         9 IDLE                                      0
        10 IDLE                                      0
         1 EXECUTE TRANSACTION                  455418
         2 WAIT DEPENDENCY                      230014
         3 WAIT DEPENDENCY                      460025
         4 IDLE                                 340017
         5 IDLE                                 240012
         6 IDLE                                      0
         7 IDLE                                      0

SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         8 IDLE                                      0
         9 IDLE                                      0
        10 IDLE                                      0
         1 WAIT DEPENDENCY                      470027
         2 WAIT DEPENDENCY                      230014
         3 EXECUTE TRANSACTION                  476575
         4 IDLE                                 340017
         5 WAIT DEPENDENCY                      240013
         6 IDLE                                      0
         7 IDLE                                      0
Test 3
We now run the same load procedure, but we add a new parameter, EAGER_SIZE, to the Replicat parameter file.
Since the size of the biggest transaction is now 20,000 rows, we need to set EAGER_SIZE to a value higher than that.
For example:
DBOPTIONS INTEGRATEDPARAMS(PARALLELISM 8, EAGER_SIZE 25000)
Note that increasing EAGER_SIZE places additional memory demands on the Streams pool, so STREAMS_POOL_SIZE may need to be sized accordingly.
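Before raising EAGER_SIZE it is therefore worth checking how the Streams pool is currently sized on the target. A minimal sketch:

-- Current size of the Streams pool on the target instance
select component, current_size/1024/1024 as current_mb
from   v$sga_dynamic_components
where  component = 'streams pool';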
Now we see that the Apply Servers are once again executing transactions in parallel, and there are no servers in the WAIT DEPENDENCY state.
SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         3 EXECUTE TRANSACTION                  207829
         9 IDLE                                      0
         8 IDLE                                      0
         4 EXECUTE TRANSACTION                       0
         5 IDLE                                      0
         6 IDLE                                      0
         1 EXECUTE TRANSACTION                  227498
         7 IDLE                                      0
         2 EXECUTE TRANSACTION                  160008

SQL> select server_id, state, total_messages_applied from v$gg_apply_server;

 SERVER_ID STATE                TOTAL_MESSAGES_APPLIED
---------- -------------------- ----------------------
         3 EXECUTE TRANSACTION                  227717
         9 IDLE                                      0
         8 IDLE                                      0
         4 EXECUTE TRANSACTION                   67601
         5 IDLE                                      0
         6 IDLE                                      0
         1 EXECUTE TRANSACTION                  268900
         7 IDLE                                      0
         2 EXECUTE TRANSACTION                  308590