Setup Xstream into Oracle database multi-tenant

Prerequisites

Oracle database version 11.2.0.4 (11.2g) and above configured as multi-tenant

If you’re looking for support on a single-tenant installation please have a look here at this link.

Setting up for Xstream

alter system set db_recovery_file_dest_size = 5G;
alter system set db_recovery_file_dest = '/opt/oracle/oradata/recovery_area' scope=spfile;
alter system set enable_goldengate_replication=true;
shutdown immediate
startup mount
alter database archivelog;
alter database open;

Let’s debug if the given statements worked correctly by running the following:

archive log list

It should output Database log mode: Archive Mode. If everything goes well we can move forward by issuing the following:

CREATE TABLESPACE xstream_adm_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/xstream_adm_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;

Then, again as sys as sysdba into your database, we’re going to create an Xstream admin user and grant permissions, execute:

CREATE USER c##xstrmadmin IDENTIFIED BY xsa DEFAULT TABLESPACE xstream_adm_tbs QUOTA UNLIMITED ON xstream_adm_tbs CONTAINER=ALL

GRANT CREATE SESSION, SET CONTAINER TO c##xstrmadmin CONTAINER=ALL

BEGIN
    DBMS_XSTREAM_AUTH.GRANT_ADMIN_PRIVILEGE(
      grantee                 => 'c##xstrmadmin',
      privilege_type          => 'CAPTURE',
      grant_select_privileges => TRUE,
      container               => 'ALL'
    );
END;

again, this time we’re going to create the c##xstrm user and grant permissions, execute:

CREATE USER c##xstrm IDENTIFIED BY xs DEFAULT TABLESPACE xstream_adm_tbs QUOTA UNLIMITED ON xstream_adm_tbs CONTAINER=ALL;

GRANT CREATE SESSION TO c##xstrm CONTAINER=ALL;
GRANT CREATE TABLE TO c##xstrm CONTAINER=ALL;
GRANT SET CONTAINER TO c##xstrm CONTAINER=ALL;
GRANT SELECT ON V_$DATABASE to c##xstrm CONTAINER=ALL;
GRANT FLASHBACK ANY TABLE TO c##xstrm CONTAINER=ALL;
GRANT SELECT_CATALOG_ROLE TO c##xstrm CONTAINER=ALL;
GRANT EXECUTE_CATALOG_ROLE TO c##xstrm CONTAINER=ALL;

Add support to CDC to your specific tables

Now it’s the turn to add logging capabilities to the tables involved in your replication process. We’re going to alter all tables involved and grant permissions to c##xstrm user, again as sys as sysdba:

-- replace me with your tables list
ALTER TABLE SCHEMA1.MYFIRSTTABLE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA1.MYFIRSTTABLE TO c##xstrm;

ALTER TABLE SCHEMA1.MYSECONDTABLE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA1.MYSECONDTABLE TO c##xstrm;

ALTER TABLE SCHEMA2.MYTHIRDTABLE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA2.MYTHIRDTABLE TO c##xstrm;

ALTER TABLE SCHEMA2.MYFOURTHTABLE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA2.MYFOURTHTABLE TO c##xstrm;

Final steps

This final step covers the setup of the Xstream outbound stream. Please keep in mind that:

tables: are the list of tables that you are considering enabling the CDC into, keeping NULL as per the example means that the CDC will be enabled in all the tables present in the schema specified thereafter. You won’t keep that for any production workload/environment;
schemas: the schemas which the tables are belonging to;
server_name: the name of your Xstream server, for a Gluesync standard setup we are going to gsxout;
capture_name: the name of your data capture output, for a Gluesync standard setup we are going to gluesync_capture.

To proceed with the following query statements please make sure you connect as c##xstrmadmin user you created earlier.

As a c##xstrmadmin admin user run the following query making the changes according to your needs (tables and schemas per your requirements):

DECLARE
  tables  DBMS_UTILITY.UNCL_ARRAY;
  schemas DBMS_UTILITY.UNCL_ARRAY;
BEGIN

  -- replace me with your tables list
  tables(1)  := 'SCHEMA1.MYFIRSTTABLE';
  tables(2)  := 'SCHEMA1.MYSECONDTABLE';
  tables(3)  := 'SCHEMA2.MYTHIRDTABLE';
  tables(4)  := 'SCHEMA2.MYFOURTHTABLE';

  -- replace me with your schemas list, specifying schemas in that way will include every table within the schema, specify NULL in case you'd like not to include any specific schema wildcard
  schemas(1) := 'SCHEMA1';

  -- pass NULL to avoid every each table belonging to that schema to be subject to CDC
  schemas(2) := NULL;

  DBMS_XSTREAM_ADM.CREATE_OUTBOUND(
    server_name     => 'gsxout',
    capture_name    => 'gluesync_capture',
    table_names     => tables,
    schema_names    => schemas
  );
END;

BEGIN
  DBMS_CAPTURE_ADM.INCLUDE_EXTRA_ATTRIBUTE(
    capture_name   => 'gluesync_capture',
    attribute_name => 'username'
  );
END;

BEGIN
  DBMS_XSTREAM_ADM.ALTER_OUTBOUND(
    server_name  => 'gsxout',
    connect_user => 'c##xstrm'
  );
END;

That’s it! Happy CDC :-)

An XStream Outbound server belongs to one and only Gluesync Xstream source connector and cannot be shared by multiple Gluesync instances. You will required to configure different XStream Outbound connectors for load balancing attaching each to a different Gluesync instance.

By including every table within a schema you are required to plan enough Oracle resources to sustain the possible load. Otherwise, the capture process will fall into Paused for flow control really often.

Status checks

If you need to check if the setup has been done properly, Oracle gives us a snippet to see if Xstream is up & running. To perform the check you need to run this query:

SELECT CAPTURE_NAME, STATE FROM V_$XSTREAM_CAPTURE;

this query will output the list of outbound server(s) running with their current queried status.

To check if there is any relevant alert instead, you can run this query:

SELECT * FROM DBA_OUTSTANDING_ALERTS;

Basic configuration example

This video tutorial will guide you through the basic configuration steps for setting up Gluesync for Oracle database and Couchbase, using the Xstream APIs

The SQL to NoSQL module can be customized by using a configuration file, in JSON format. The file name to use must be specified as a parameter when launching the app, with the -f or --file tokens. The file should be composed of the union of the common configuration files (see here Installation steps) and source/destination specific configuration:

{
  ...
  "oracle": {
    "cdbName": "oracle-contaner-db-name",
    "pdbName": "oracle-pluggable-db-name",
    "xstreamServerName": "xstream-outbound-server-name",
    "connectionType": "SID",
    "tableKeys": {
      "articles": ["id1", "id2"],
      "drivers": ["driver_unique_id"],
      "customers": [
          "customer_name",
          "customer_surname",
          "customer_age",
          "customer_id"
        ]
    }
  }
}

Oracle-specific configurations are listed under the oracle property.

cdbName (optional): name of the container database of your scope, if running on single-tenant leave empty;
pdbName (optional): name of the pluggable database that contains the data you need to replicate, if running on single-tenant leave empty;
xstreamServerName: defaults to GSXOUT, the name of the Xstream outbound server used by Gluesync to detect table changes;
connectionType (optional): connection type used to connect to Oracle. Available types are: SID or SERVICE_NAME. Defaults to SID.
tableKeys: is the object that represents the key-value pair list of tables and its columns that compose the clustered (compound) index or the column Gluesync will use as a primary key. To read more about this feature we suggest you have a look at the following link.

Source core parameter sourceChangeRetention cannot be used here in this source connector. Everything has to be managed using the Xstream APIs, altering the capture process. See Checkpoint Retention Time for more information regarding this configuration.

Troubleshooting

Here is the list of common errors.

ORA-01031: Insufficient privileges: means that you’ve executed the DBMS_XSTREAM_ADM.CREATE_OUTBOUND steps with the wrong user (maybe when still logged in as sys as sysdba). Delete it at redo the steps following the instructions above.
ORACLE CAPTURE STATE PAUSED FOR FLOW CONTROL: LCR records are hitting the capture process way too much than what can be sustained by the current resource allocation and records are then spilled to disk instead of being directly processed. For this and other common Xstream-related errors please consider taking a look at this link: Oracle streams errors.

Useful Xstream administration commands

Altering the capture process

In case you need to add a new table/schema to the capture process you can do it lively just by editing the following snippet.

DECLARE
  tables  DBMS_UTILITY.UNCL_ARRAY;
  schemas DBMS_UTILITY.UNCL_ARRAY;
BEGIN
  tables(1)  := 'SCHEMA1.ARTICLES';
  schemas(1) := '';
  DBMS_XSTREAM_ADM.ALTER_OUTBOUND(
    server_name     => 'GSXOUT',
    table_names     => tables,
    schema_names    => schemas,
    add             => TRUE
  );
END;

Setting the add field to FALSE will remove the tables passed in the table_names field (in this case the SCHEMA1.ARTICLES table is added)

Consider setting the schema to an empty list (schemas(1) := '';) otherwise all the tables of the declared schema will be added.

Skip/add columns by applying rules

To limit the amount of resources consumed by the capture process, reducing latencies and avoiding records being spilled to disk, consider applying filtering rules to your capture process to discard unwanted table columns to become part of the replication process even if those are then discarded by Gluesync later in the pipeline.

DECLARE
   cols DBMS_UTILITY.LNAME_ARRAY;
BEGIN
     cols(1) := 'COLUMN1';
     cols(2) := 'COLUMN2';
   DBMS_XSTREAM_ADM.ADD_SUBSET_OUTBOUND_RULES(
     server_name => 'GSXOUT',
     table_name => 'SCHEMA1.ARTICLES',
     keep => FALSE,
     column_list => cols);
END;

Also in this case the keep field is used to define what to keep and what not.

For further details, I refer you to the official documentation which you can find here: Oracle DBMS_STREAM_ADM documentation.

Looking for data modeling features or other options?

For more detailed configuration options, including the ability to perform data modeling, please have a look at the dedicated Data modeling section when sourcing from an RDBMS.