Getting started guide

Introduction

First of all, we’d like to thank you for your interest in Gluesync! We hope you will find it useful and easy to use.

On this page we guide you through the setup process, from verifying that the operating environment is ready to run the software up to seeing the first entity replicated from the source to the target database.

Gluesync in a nutshell

Gluesync works by first performing a bulk copy of the source data, unless this step has been explicitly disabled. It then keeps the destination data up to date by intercepting modifications on the source database and replicating them to the destination, a technique called Change Data Capture (CDC). This is accomplished with the best technology available on the source side, e.g. change streams, change data capture facilities, or triggers. Because the change data capture technology varies by source, the behavior and timing of replicated updates depend on the specific connection technology, and sizing must be planned accordingly.

Accessing the support portal

Should you have any further questions, don’t hesitate to contact our support team. Please take a look at our dedicated page to learn more about our support portal and policies.

Accessing the documentation

You can find detailed documentation on the configuration and behavior of Gluesync on the other pages of this documentation website.

Checking the environment

To ensure the installation of Gluesync is smooth and successful, the following checklists help you assess the readiness of the infrastructure at your site.

Preparation of your target/source database(s)

To get your database environment ready for Gluesync, please follow the tips below for each of the compatible databases.

Couchbase Server / Couchbase Capella

For Gluesync to work within your Couchbase deployment, please run the SDK Doctor tool (docs.couchbase.com/server/current/sdk/sdk-doctor.html). Its report helps us all identify any issue or warning that could become a showstopper and must be addressed before starting your journey with Gluesync.

MongoDB

Have a valid, proven working MongoDB connection string at hand and test it against your MongoDB cluster or Atlas deployment using mongosh or MongoDB Compass.
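As a minimal sketch of such a test, the helper below pings the deployment through mongosh. The function name mongo_ping is hypothetical and the connection string is whatever you pass in; the sketch only assumes that mongosh is installed on the machine where you run it.

```shell
# Hypothetical helper: ping a MongoDB deployment through mongosh.
# Usage: mongo_ping "<your connection string>"
mongo_ping() {
  if ! command -v mongosh >/dev/null 2>&1; then
    echo "mongosh not found in PATH" >&2
    return 127
  fi
  # The ping command replies with { ok: 1 } when the deployment is reachable;
  # mongosh exits with a non-zero status if it cannot connect.
  mongosh "$1" --quiet --eval 'db.runCommand({ ping: 1 })'
}
```

A non-zero exit status means the connection string, the credentials, or the network path needs attention before Gluesync is deployed.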

Aerospike

Using the asbench utility from within the VM / instance that will be dedicated to Gluesync is a good way to spot any potential issue that may affect your replication journey with us.

MS SQL Server / MS SQL on Azure (SQL Azure)

Make sure your instance is reachable on the MS SQL Server mapped port (usually 1433) and that the given user credentials have read/write permissions to manage the database you are targeting as a CDC source (Gluesync needs them to automatically manage the enablement of the change tracking feature).

Oracle database

For XStream users: make sure enough resources are provisioned in terms of disk IOPS and that the Streams pool is properly sized to process your transaction logs (this is relevant in environments where a high number of SCN values are generated). Also, make sure you’re running either a multi-tenant or a single-tenant setup before proceeding. A full guide to the setup steps can be found at the following link: molo17.com/gluesync/docs/gluesync/v1.4/sql-to-nosql-source-oracle-database/installation-steps.html

For users that are going to use our auditing structure via triggers: make sure your user has permission to create triggers and to create the table structure needed for auditing purposes.

In both cases, Oracle Instant Client should be able to access your database.

Other databases: best practices

For databases not listed here (and as a general practice for the listed ones too), make sure the proper mapped port(s) are open and reachable from the host where you plan to install Gluesync, and that you have valid user credentials for both reading from and writing to the source/target database. You can test this using a SQL client or any sample application that uses the vendor SDK / JDBC driver connection. Double-check the steps above before going further with deploying Gluesync.
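A quick way to check that a database port is reachable from the future Gluesync host is sketched below. The helper name check_port and the host name in the example are placeholders; the sketch assumes bash and the coreutils timeout command are available on that host.

```shell
# Hypothetical helper: succeed only if a TCP connection to host:port opens
# within 3 seconds. Relies on bash's /dev/tcp and on coreutils' timeout.
# Usage: check_port <host> <port>
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example (placeholder host name, default MS SQL Server port):
#   check_port db.example.internal 1433 && echo "reachable"
```

This only proves that the port accepts connections; credentials and permissions still need to be verified with a database client.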

Runtime options

Gluesync can be run either as a container on a simple host machine (e.g. with Docker installed, or another compatible hosting infrastructure) or in a Kubernetes cluster. Appropriate instructions and configuration templates are provided for both cases.

Single Container checklist

You have prepared a running environment following the resource allocation recommended by MOLO17;
You have received access tokens to the MOLO17 container registry;
Your running environment is able to reach the MOLO17 container registry to download the container image;
You have prepared access credentials for the source and destination databases - if access requires certificates, suitable files are available;
Your running environment can reach both source and destination databases.

Kubernetes cluster checklist

You have prepared a running environment following the resource allocation recommended by MOLO17 - at least one node must be able to support the relevant pod;
You have received access tokens to the MOLO17 container registry;
Your running environment is able to reach the MOLO17 container registry to download the container image;
You have prepared access credentials for the source and destination databases - if access requires certificates, suitable files are available;
Your running environment can reach both source and destination databases.

Preparing the configuration

To successfully start a Gluesync instance, a config.json file and a license key file are required. Both should be made available inside the container at a specified location, and that location must also be passed to the container through environment variables (CONFIG_FILE and LICENCE_KEY in the docker example in this guide).
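Before starting the container, it can help to confirm that both files are in place in the directory you intend to mount. The sketch below is one way to do it: the helper name preflight is hypothetical, and the file names config.json and gs-licence.dat are taken from the docker example in this guide, so adjust them if yours differ.

```shell
# Hypothetical pre-flight check: both files must exist and be non-empty
# in the directory that will be mounted into the container.
# Usage: preflight <config directory>
preflight() {
  for f in "$1/config.json" "$1/gs-licence.dat"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
  echo "configuration files present in $1"
}
```

The check says nothing about the validity of the content; it only catches the "file not found" class of startup errors early.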

The license key file

If you have not already received one, ask MOLO17 for a suitable license file: it is bound to your source and target database types and has an expiration date. Remember to ask for a new license file before that date! The license is supplied as a binary file: make sure no program modifies its content, or it may turn out to be invalid when loaded.
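One way to confirm that the license file was not altered while being copied (for instance by a transfer tool that converts line endings) is to compare checksums of the original and the copy. The helper name same_file and the paths in the example are placeholders; the sketch assumes sha256sum is available.

```shell
# Hypothetical helper: succeed only if two files have identical content.
# Usage: same_file <original> <copy>
same_file() {
  # Reading from stdin keeps the file name out of the sha256sum output,
  # so only the digests are compared.
  [ "$(sha256sum < "$1")" = "$(sha256sum < "$2")" ]
}

# Example (placeholder paths):
#   same_file ./gs-licence.dat /srv/gluesync/config/gs-licence.dat || echo "file was modified"
```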

The configuration file

The configuration file will be prepared together with the team from MOLO17, to ensure the best possible outcome. To prepare for this phase, the following information can be gathered in advance:

source/target database network address or connection string;
source/target database credentials or encryption certificates;
schema names for the relevant tables, bucket names, and naming conventions;
table names;
column names, if mapping or column exclusion is to be performed;
data types for columns that will be subject to data modeling;
size of the data currently in the source tables.

Accessing source/target databases

Ensure network reachability between the host where the Gluesync container will run and the systems where the source and target databases reside. This step also includes making sure that any DNS names can be resolved from the container host, or that appropriate IP addresses are available and network routing is configured accordingly.
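Name resolution can be checked from the container host with a small helper like the one sketched below. The name resolve_host is hypothetical, and the sketch assumes a Linux host where getent is available.

```shell
# Hypothetical helper: print the first address a host name resolves to,
# or fail with a non-zero status if it does not resolve.
# Usage: resolve_host <host name>
resolve_host() {
  getent hosts "$1" | awk '{ print $1; exit }' | grep .
}

# Example (placeholder host name):
#   resolve_host db.example.internal || echo "DNS name cannot be resolved"
```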

Data modeling and entity names

While Gluesync can "just copy" data from one source to the target, sometimes some data transformation (data modeling) is required. For this step to be smooth, please prepare adequate documentation about column types, names, and sizes for the tables that are going to be part of a data modeling operation.

Manifest/compose templates

MOLO17 can provide you with templates for Kubernetes manifest files or docker-compose configurations. Our support team is also available to assist you in customizing these templates to best fit your actual installation needs. We recommend considering the Kustomize toolset to easily combine several manifests/secrets in one single file. Usually, the credentials to access the container images for your Gluesync configuration should be stored in a configuration entry or file that is referenced by the Kubernetes manifest or the docker-compose file. Ensure these credentials are set and valid.

Checklist

At the end of this stage, you should:

have a license key file available and, if required by the execution environment, ready to be loaded as a named secret resource;
have a configuration file available and, if required by the execution environment, ready to be loaded as a named secret resource;
have a Kubernetes manifest, or a docker-compose file, with configuration entries that can bind the aforementioned files (license key and configuration file) to the running Gluesync instance.

Running Gluesync for the first time

You can now start the docker-compose tool or apply the Kubernetes manifest, and check the logs of the instantiated container.

Starting Gluesync

Starting Gluesync as a stand-alone container

To start a Gluesync container using docker, use the command line below. Make sure to update any relevant parameters, such as:

the volume mount for the config directory
the config file/license key file path inside the mounted directory
the correct path (PRODUCT_RELEASE_PATH), the image name (SOURCE-TO-TARGET) for the source-target pair you need, and the version tag (VERSION).

docker run -v $PWD/config:/opt/app/config \
-e CONFIG_FILE=/opt/app/config/config.json \
-e LICENCE_KEY=/opt/app/config/gs-licence.dat \
molo17com/PRODUCT_RELEASE_PATH:SOURCE-TO-TARGET-VERSION

As an example, if you’re looking to pull and launch an Oracle to MongoDB replication, the resulting command will look like this:

docker run -v $PWD/config:/opt/app/config \
-e CONFIG_FILE=/opt/app/config/config.json \
-e LICENCE_KEY=/opt/app/config/gs-licence.dat \
molo17com/gluesync-sql-to-nosql:oracle-to-mongodb-1.4.20
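To make the substitution explicit, the sketch below builds the image reference from the three placeholders used in the template command. The helper name image_ref is hypothetical; the values are the ones from the Oracle to MongoDB example.

```shell
# Hypothetical helper: compose the image reference from its three parts.
# Usage: image_ref <PRODUCT_RELEASE_PATH> <SOURCE-TO-TARGET> <VERSION>
image_ref() {
  printf 'molo17com/%s:%s-%s\n' "$1" "$2" "$3"
}

image_ref gluesync-sql-to-nosql oracle-to-mongodb 1.4.20
# prints: molo17com/gluesync-sql-to-nosql:oracle-to-mongodb-1.4.20
```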

For more information, please refer to the specific sections of each component you are looking to set up. If you’d like to learn about docker and docker-compose usage in Gluesync, please visit the dedicated link instead.
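For reference, the docker run example above can also be expressed as a docker-compose file. The following is only a sketch, not an official MOLO17 template: the service name gluesync is a placeholder, while the image, volume, and environment variables mirror the Oracle to MongoDB command.

```shell
# Write a minimal compose file equivalent to the docker run example.
# The service name "gluesync" is a placeholder chosen for this sketch.
cat > docker-compose.yml <<'EOF'
services:
  gluesync:
    image: molo17com/gluesync-sql-to-nosql:oracle-to-mongodb-1.4.20
    volumes:
      - ./config:/opt/app/config
    environment:
      CONFIG_FILE: /opt/app/config/config.json
      LICENCE_KEY: /opt/app/config/gs-licence.dat
EOF
```

With this file in place, running docker compose up -d from the same directory starts the container, and docker compose logs -f follows its logs.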

Starting Gluesync as a Kubernetes pod

To start Gluesync as a Kubernetes pod you should apply the configured manifest file.

For more information please refer to the official documentation at this link.

Common errors at the first startup

These are some of the errors that can be encountered at the first start.

Errors for a license not found or invalid.

The errors will look like this:

[Example]

The license expired on Wed Mar 01 11:37:31 CET 2023. Please contact sales@molo17.com.
The Token's Signature resulted as invalid when verified using the Algorithm: SHA256withRSA
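To spot these messages quickly in a long log, a small filter such as the one sketched below can help. The helper name license_errors is hypothetical, and the patterns are based only on the two example messages above, so other license-related messages may not be matched.

```shell
# Hypothetical filter: read log lines on stdin and keep only those that
# match one of the two license error messages shown above.
license_errors() {
  grep -E 'The license expired on|Signature resulted as invalid'
}

# Example (placeholder container name):
#   docker logs my-gluesync 2>&1 | license_errors
```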

Invalid configuration file errors.

In this case the errors vary, for example missing mandatory fields or wrong field values.

Connection or permission errors.

When you first start Gluesync, it configures the various components necessary for data synchronization. Two types of errors may appear at this stage. The first concerns connection problems: these can be timeouts or network errors and can depend on various factors, such as the network configuration of the Gluesync machine or of the source or target database. The second concerns user permissions on the source or target database.

Useful logs at each startup

At each start the logs will show the following useful information:

Gluesync release version

The Gluesync release version will be in the following format: SQL to NoSQL – version 1.X.X

List of component setups

Logs will show the list of component setups that will be executed. A component setup is a single isolated component that takes part in the replication suite. Some components run concurrently while others are interdependent, for instance depending on whether an initial copy of the source dataset has to be performed.

[Example]

GluesyncSqlToNoSql INFO - ComponentSetup to run [DataModelingValidationComponentSetup, CouchbaseComponentSetup, CheckStatePreservationTableComponentSetup, EnableChangeTrackingComponentSetup, CheckMigrationCheckpointTableSetup, TableMigrationComponentSetup]

Start transactions

Logs will list the starting transaction number for each table, as below. Gluesync persists per-transaction checkpoints, which gives it the ability to resume from the last committed and acknowledged transaction (strong consistency).

[Example]

 Resuming from transaction 124 for entity Articles
 Resuming from transaction 124 for entity Orders
 Resuming from transaction 124 for entity Customers
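If you want a compact view of the resume points, the lines above can be reduced to entity and transaction pairs. This is a sketch that assumes the log lines follow exactly the format of the example; the helper name resume_points is hypothetical.

```shell
# Hypothetical filter: read log lines on stdin and print
# "<entity> <transaction>" for every "Resuming from transaction" line.
resume_points() {
  sed -n 's/.*Resuming from transaction \([0-9][0-9]*\) for entity \(.*\)$/\2 \1/p'
}

# Example (placeholder container name):
#   docker logs my-gluesync 2>&1 | resume_points
```

For the example lines above, the filter prints Articles 124, Orders 124, and Customers 124, one pair per line.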

Getting help through the Support portal

Looking for help? Please check out this dedicated page about the support policies and the details regarding the MOLO17 Support Portal.