Quantcast
Channel: GoldenGate – Oracle DBA – Tips and Techniques
Viewing all articles
Browse latest Browse all 80

Oracle Goldengate Tutorial 8 – Filtering and Mapping data

$
0
0

Oracle GoldenGate not only provides us a replication solution that is Oracle version independent as well as platform independent, but we can also use it to do data transformation and data manipulation between the source and the target.

So we can use GoldenGate when the source and database database differ in table structure as well as an ETL tool in a Datawarehouse type environment.

We will discuss below two examples to demonstrate this feature – column mapping and filtering of data.

In example 1, we will filter the records that are extracted on the source and applied on the target – only rows where the JOB column value equals ‘MANAGER” in the MYEMP table will be considered for extraction.

In example 2, we will deal with a case where the table structure is different between the source database and the target database and see how column mapping is performed in such cases.

Example 1

Initial load of all rows which match the filter from source to target. The target database MYEMP table will only be populated with rows from the EMP table where filter criteria of JOB=’MANAGER’ is met.

On Source

GGSCI (redhat346.localdomain) 4> add extract myload1, sourceistable
EXTRACT added.

GGSCI (redhat346.localdomain) 5> edit params myload1

EXTRACT myload1
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST devu007, MGRPORT 7809
RMTTASK replicat, GROUP myload1
TABLE scott.myemp, FILTER (@STRFIND (job, “MANAGER”) > 0);

On Target

GGSCI (devu007) 2> add replicat myload1, specialrun
REPLICAT added.

GGSCI (devu007) 3> edit params myload1

“/u01/oracle/software/goldengate/dirprm/myload1.prm” [New file]
REPLICAT myload1
USERID ggs_owner, PASSWORD ggs_owner
ASSUMETARGETDEFS
MAP scott.myemp, TARGET sh.myemp;

On Source – start the initial load extract

GGSCI (redhat346.localdomain) 6> start extract myload1

Sending START request to MANAGER …
EXTRACT MYLOAD1 starting

On SOURCE

SQL> select count(*) from myemp;

COUNT(*)
———-
14

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
9

On TARGET

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
9

Create an online change extract and replicat group using a Filter

GGSCI (redhat346.localdomain) 10> add extract myload2, tranlog, begin now
EXTRACT added.

GGSCI (redhat346.localdomain) 11> add rmttrail /u01/oracle/software/goldengate/dirdat/bb, extract myload2
RMTTRAIL added.

GGSCI (redhat346.localdomain) 11> edit params myload2

EXTRACT myload2
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST 10.53.200.225, MGRPORT 7809
RMTTRAIL /u01/oracle/software/goldengate/dirdat/bb
TABLE scott.myemp, FILTER (@STRFIND (job, “MANAGER”) > 0);

On Target

GGSCI (devu007) 2> add replicat myload2, exttrail /u01/oracle/software/goldengate/dirdat/bb
REPLICAT added.

GGSCI (devu007) 3> edit params myload2

“/u01/oracle/software/goldengate/dirprm/myload2.prm” [New file]
REPLICAT myload2
ASSUMETARGETDEFS
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp;

On Source – start the online extract group

GGSCI (redhat346.localdomain) 13> start extract myload2

Sending START request to MANAGER …
EXTRACT MYLOAD2 starting

GGSCI (redhat346.localdomain) 14> info extract myload2

EXTRACT MYLOAD2 Last Started 2010-02-23 11:04 Status RUNNING
Checkpoint Lag 00:27:39 (updated 00:00:08 ago)
Log Read Checkpoint Oracle Redo Logs
2010-02-23 10:36:51 Seqno 214, RBA 103988

On Target

GGSCI (devu007) 4> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 5> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:08 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
First Record RBA 989

On Source we now insert two rows into the MYEMP table – one which has the JOB value of ‘MANAGER’ and the other row which has the job value of ‘SALESMAN’


On SOURCE

SQL> INSERT INTO MYEMP
2 (empno,ename,job,sal)
3 VALUES
4 (1234,’GAVIN’,’MANAGER‘,10000);

1 row created.

SQL> commit;

Commit complete.

SQL> INSERT INTO MYEMP
2 (empno,ename,job,sal)
3 VALUES
4 (1235,’BOB’,’SALESMAN‘,1000);

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from myemp;
COUNT(*)
———-
16

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
10

On Target, we will see that even though two rows have been inserted into the source MYEMP table, on the target MYEMP table only one row is inserted because the filter has been applied which only includes the rows where the JOB value equals ‘MANAGER’.

SQL> select count(*) from myemp;

COUNT(*)
———-
10

Example 2 – source and target table differ in column structure

In the source MYEMP table we have a column named SAL whereas on the target, the same MYEMP table has the column defined as SALARY.

Create a definitions file on the source using DEFGEN utility and then copy that definitions file to the target system

GGSCI (redhat346.localdomain) > EDIT PARAMS defgen

DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
TABLE scott.myemp;

[oracle@redhat346 ggs]$ ./defgen paramfile /u01/oracle/ggs/dirprm/defgen.prm

***********************************************************************
Oracle GoldenGate Table Definition Generator for Oracle
Version 10.4.0.19 Build 002
Linux, x64, 64bit (optimized), Oracle 11 on Sep 18 2009 00:09:13

Copyright (C) 1995, 2009, Oracle and/or its affiliates. All rights reserved.

Starting at 2010-02-23 11:22:17
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Wed Dec 17 11:41:38 EST 2008, Release 2.6.18-128.el5
Node: redhat346.localdomain
Machine: x86_64
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : unlimited unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited

Process id: 14175

***********************************************************************
** Running with the following parameters **
***********************************************************************
DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD *********
TABLE scott.myemp;
Retrieving definition for SCOTT.MYEMP

Definitions generated for 1 tables in /u01/oracle/ggs/dirsql/myemp.sql

If we were to try and run the replicat process on the target without copying the definitions file, we will see an error as shown below which pertains to the fact that the columns in the source and target database are different and GoldenGate is not able to resolve that.

2010-02-23 11:31:07 GGS WARNING 218 Aborted grouped transaction on ‘SH.MYEMP’, Database error 904 (ORA-00904: “SAL”: invalid identifier).

2010-02-23 11:31:07 GGS WARNING 218 SQL error 904 mapping SCOTT.MYEMP to SH.MYEMP OCI Error ORA-00904: “SAL”: invalid identifier (status = 904), SQL .

We then ftp the definitions file from the source to the target system – in this case to the dirsql directory located in the top level GoldenGate installed software directory

We now go and make a change to the original replicat parameter file and change the parameter ASSUMEDEFS to SOURCEDEFS which provides GoldenGate with the location of the definitions file.

The other parameter which is included is the COLMAP parameter which tells us how the column mapping has been performed. The ‘USEDEFAULTS’ keyword denotes that all the other columns in both tables are identical except for the columns SAL and SALARY which differ in both tables and now we are mapping the SAL columsn in source to the SALARY column on the target.

REPLICAT myload2
SOURCEDEFS /u01/oracle/software/goldengate/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp,
COLMAP (usedefaults,
salary = sal);

We now go and start the originall replicat process myload2 which had abended because of the column mismatch (which has now been corrected via the parameter change) and we see that the process now is running without any error.

now go and start the process which had failed after table modification

GGSCI (devu007) 2> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status ABENDED
Checkpoint Lag 00:00:03 (updated 00:11:44 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1225

GGSCI (devu007) 3> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 4> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:43 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1461

Coming Next! – Monitoring the GoldenGate environment …..


Viewing all articles
Browse latest Browse all 80

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>