Static YugabyteDB CDC: WAL-based Setup with Gluesync

Source data from YugabyteDB

Prerequisites

To run Gluesync against your YugabyteDB instance, you need:

  • Valid user credentials with permission to read tables, schemas, and replication slots on the source database.

To create a suitable user for Gluesync on your YugabyteDB database, you can run or adapt the following statements:
CREATE USER gluesync SUPERUSER;
ALTER USER gluesync WITH PASSWORD 'youdecide';
The username gluesync in the example above is not mandatory; you can choose any username you like.
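
To confirm that the account was created with the expected attributes, a query along these lines can help. It is only a sketch: it assumes the example role name gluesync and relies on the standard PostgreSQL pg_roles catalog view, so adjust the name if you chose a different one.

```sql
-- Assumes the example role name "gluesync" from the statements above.
SELECT rolname,
       rolsuper,     -- true when the role was created with SUPERUSER
       rolcanlogin   -- true for roles created via CREATE USER
FROM pg_roles
WHERE rolname = 'gluesync';
```

A single row with both flags set to true suggests the user is ready to be entered in the connection settings.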

Setup via Web UI

  • Hostname / IP Address: DNS record or IP Address of your server;

  • Port: Optional, defaults to 5433;

  • Database name: Name of the database to connect to;

  • Username: Username with read access to the source tables;

  • Password: Password belonging to the given username.

Custom properties

  • Load balance: (optional, defaults to false) Enables or disables the load-balancing feature of YugabyteDB’s smart driver.

  • Additional hosts: (optional, defaults to null) A list of available hosts passed to YugabyteDB’s smart driver at bootstrap time. Null means an empty list. Use the following format: host1:port1,host2:port2.

  • Max polling interval: (optional, defaults to 2) How long, in seconds, Gluesync waits for new messages on the replication slot when none are arriving.

Specific configuration

  • Topology keys: (optional, defaults to null) A list of additional clusters passed to YugabyteDB’s smart driver at bootstrap time. Null means an empty list. Use the following format: cloud1.datacenter1.rack1:1,cloud1.datacenter1.rack2:2.

Setup via REST APIs

The following example calls the CoreHub’s REST API via curl to set up the connection for this agent.

Connect the agent

curl --location --request PUT 'http://core-hub-ip-address:1717/pipelines/{pipelineId}/agents/{agentId}/config/credentials' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "hostCredentials": {
        "connectionName": "myAgentNickName",
        "host": "host-address",
        "port": 5433,
        "databaseName": "db_name",
        "username": "",
        "password": "",
        "maxConnectionsCount": 100,
        "enableTls": true,
        "certificatePath": "/myPath/cert.pem"
    },
    "customHostCredentials": {
        "loadBalance": "true"
    }
}'

Setup steps

Enabling CDC on the source database

Gluesync’s CDC agent for YugabyteDB uses the database’s built-in PostgreSQL logical replication, which is enabled by default in YugabyteDB.
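
If you want to see which replication slots exist on the source, for instance after the agent has connected, a query such as the one below may help. It is a sketch that assumes your YugabyteDB version exposes the PostgreSQL-compatible pg_replication_slots view. The slot name Gluesync uses is not stated in this guide, so no name filter is applied.

```sql
-- Lists all logical replication slots visible to the current user.
-- Assumption: the pg_replication_slots view is available in your YSQL version.
SELECT slot_name,
       plugin,      -- output plugin the slot was created with
       slot_type,   -- 'logical' for logical replication slots
       active       -- true while a consumer is attached
FROM pg_replication_slots;
```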

Working with Before & After images

Before & after images are a feature that lets Gluesync track the changes that have occurred in your database and compare them with the previous values. This enables Gluesync to apply only the values that actually changed, saving bandwidth and processing time.

In YugabyteDB this feature is called replica identity. It is set to CHANGE by default and can be changed to FULL, which provides both before and after images, by following the instructions in the YugabyteDB documentation.
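
As a rough illustration, switching a table to FULL typically looks like the statements below. The table name public.orders is hypothetical, and the statement uses the standard PostgreSQL ALTER TABLE syntax; treat it as a sketch and check it against the YugabyteDB documentation for your release.

```sql
-- "public.orders" is a placeholder; repeat for each table Gluesync should capture.
ALTER TABLE public.orders REPLICA IDENTITY FULL;

-- Optional check: PostgreSQL's catalog reports FULL as 'f' in relreplident.
SELECT relname, relreplident
FROM pg_class
WHERE oid = 'public.orders'::regclass;
```

Depending on the YugabyteDB version, a change of replica identity may only apply to replication slots created after it. It is therefore safest to set it before the agent first connects, and to verify this behavior for your release.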