YugabyteDB CDC: WAL-based Setup with Gluesync
Source data from YugabyteDB
Prerequisites
To run Gluesync against your YugabyteDB instance, you need:
- Valid user credentials with permission to read tables, schemas, and replication slots from the source database.

To create a valid user for Gluesync on your YugabyteDB database, you can run or adapt the following statements:

CREATE USER gluesync SUPERUSER;
ALTER USER gluesync WITH PASSWORD 'youdecide';

The username gluesync in the example above is not mandatory; you can choose any username you like.
Setup via Web UI
- Hostname / IP Address: DNS record or IP address of your server;
- Port: Optional, defaults to 5433;
- Database name: Name of the database to connect to;
- Username: Username with read access to the source tables;
- Password: Password belonging to the given username.
Custom properties
- Load balance: (optional, defaults to false) Enables or disables the load-balancing feature of YugabyteDB's smart driver.
- Additional hosts: (optional, defaults to null, meaning an empty list) Provides YugabyteDB's smart driver with a list of available hosts at bootstrap time. Use the following format: host1:port1,host2:port2.
- Max polling interval: (optional, defaults to 2 seconds) The maximum time, in seconds, that Gluesync waits for new messages on the replication slot when none are arriving.
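As a rough illustration of the Additional hosts format described above, the sketch below checks a value before it is entered in the agent configuration. The function name is_valid_additional_hosts and the sample host names are hypothetical and not part of Gluesync; the check only enforces the host:port,host:port shape and does not verify that the hosts are reachable.

```shell
# Hypothetical helper (not part of Gluesync): returns success when the value
# looks like host1:port1,host2:port2 with one or more comma-separated entries.
is_valid_additional_hosts() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9._-]+:[0-9]{1,5}(,[A-Za-z0-9._-]+:[0-9]{1,5})*$'
}

# Example with placeholder host names.
if is_valid_additional_hosts "yb-node-1:5433,yb-node-2:5433"; then
  echo "additional hosts value looks well-formed"
else
  echo "additional hosts value is malformed" >&2
fi
```

A value with a missing port or a trailing comma is rejected, which catches the most common copy-and-paste mistakes before the smart driver sees them.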
Setup via REST APIs
The following example calls the CoreHub's REST API via curl to set up the connection for this agent.
Connect the agent
curl --location --request PUT 'http://core-hub-ip-address:1717/pipelines/{pipelineId}/agents/{agentId}/config/credentials' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: ••••••' \
  --data '{
    "hostCredentials": {
      "connectionName": "myAgentNickName",
      "host": "host-address",
      "port": 5433,
      "databaseName": "db_name",
      "username": "",
      "password": "",
      "maxConnectionsCount": 100,
      "enableTls": true,
      "certificatePath": "/myPath/cert.pem"
    },
    "customHostCredentials": {
      "loadBalance": "true"
    }
  }'
Setup steps
Enabling CDC on the source database
Gluesync's CDC agent for YugabyteDB uses YugabyteDB's built-in PostgreSQL logical replication, which is enabled by default.
Working with Before & After images
Before & after images are a feature that allows Gluesync to track the changes that have occurred in your database and compare them with their previous values. This lets Gluesync apply only the values that actually changed, saving bandwidth and processing time.
In YugabyteDB this feature is called replica identity. It is set to CHANGE by default and can be changed to FULL, which enables before and after images, by following YugabyteDB's replica identity instructions.
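A minimal sketch of switching one table to FULL, assuming the standard PostgreSQL ALTER TABLE ... REPLICA IDENTITY syntax as accepted by YugabyteDB's YSQL layer, and the ysqlsh client that ships with YugabyteDB. The table name public.orders, the connection values, and the APPLY switch are placeholders chosen for illustration.

```shell
# Placeholder connection details; replace with your own values.
YB_HOST="${YB_HOST:-host-address}"
YB_PORT="${YB_PORT:-5433}"
YB_DB="${YB_DB:-db_name}"
YB_USER="${YB_USER:-gluesync}"

# Prints the statement that sets REPLICA IDENTITY FULL on schema.table.
replica_identity_sql() {
  printf 'ALTER TABLE %s.%s REPLICA IDENTITY FULL;\n' "$1" "$2"
}

# Dry run by default: only print the statement.
# Set APPLY=1 to execute it with ysqlsh.
if [ "${APPLY:-0}" = "1" ]; then
  ysqlsh -h "$YB_HOST" -p "$YB_PORT" -U "$YB_USER" -d "$YB_DB" \
    -c "$(replica_identity_sql public orders)"
else
  replica_identity_sql public orders
fi
```

Run the statement once for each table whose updates and deletes should carry the previous row values. Whether a change also applies to replication slots that already exist is worth confirming in the YugabyteDB documentation before relying on it.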