Docker compose

The application is shipped as a set of Docker images, listed as services under the docker-compose.yaml file. The deployment includes a reverse proxy to manage traffic and security for your Gluesync services.

Looking for a Gluesync docker-compose.yaml template? Get a Gluesync trial and receive a pre-filled docker-compose.yaml template straight to your mailbox.

Pre-flight checks

For a smooth experience running Gluesync Docker containers with Docker compose, please walk through the following checklist to make sure your environment is ready.

The environment can host the Gluesync platform as per the recommended specification;
The environment can reach the Internet, in particular the MOLO17 container registry at https://registry.hub.docker.com and its sub-paths;
The environment can reach both source and destination databases on the relevant ports/protocols;
You have already read the Gluesync architecture chapter to understand how Gluesync works and how it is composed.
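
The two reachability items in the checklist can be probed from the host with a couple of small helpers. This is only a sketch: it assumes curl and nc (netcat) are installed, the function names are illustrative, and the database host and port in the usage line are placeholders for your own values.

```shell
# Sketch: basic reachability checks for the pre-flight list.
# Assumptions: curl and nc (netcat) are available on the host.
can_reach_registry() {
  # succeeds if any HTTP response comes back within 10 seconds
  curl -sS -o /dev/null --max-time 10 https://registry.hub.docker.com/
}

can_reach_db() {
  # $1 = database host, $2 = TCP port; waits up to 5 seconds
  nc -z -w 5 "$1" "$2"
}
```

For example, `can_reach_registry && can_reach_db db.example.internal 3306` returns success only when both checks pass (db.example.internal and 3306 are hypothetical).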

Getting started

In the following sections you’ll learn how a standard docker-compose.yaml file for Gluesync is composed, how to build and customize your own, and how some Docker compose features help you achieve your goals.

A basic Gluesync deployment consists of at least three components:

CoreHub;
Source Agent;
Target Agent.

With this minimum set of components you’ll be able to set up one pipeline and replicate as many entities (tables, collections…) as you need from source to target.

In addition, for enhanced functionality, modern deployments typically include:

Traefik reverse proxy (automatically included in the default templates);
Chronos module (for scheduling and managing periodic tasks);
As many additional agents as you need for specific databases.

To run more pipelines you’ll need to add more source and target agents to your deployment.

Necessary files

Two files are required in your deployment to get Gluesync up and running via Docker compose:

A valid license file, in .dat format, to be mounted to every container;
A bootstrap file, named bootstrap-core-hub.json, which holds the secret exchanged via our consensus protocol to verify each agent’s signature and to provide end-to-end encryption. The secret is unique to each deployment.
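
A quick way to confirm that both files are in place before starting the stack is sketched below. It assumes the license is saved as gs-license.dat next to docker-compose.yaml, which is the name used in the volume mappings on this page; the function name is illustrative.

```shell
# Sketch: verify the two required files exist in the deployment folder.
# Assumption: the license file is named gs-license.dat; adjust if yours differs.
check_required_files() {
  dir="${1:-.}"   # deployment folder, defaults to the current directory
  missing=0
  for f in gs-license.dat bootstrap-core-hub.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as `check_required_files /path/to/deployment`; it prints each missing file to stderr and returns a non-zero status if any is absent.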

Don’t have a valid license file? You can always get a trial license by filling in our online web form, or contact our sales department directly by email: Contact sales.

Bootstrap file

As mentioned, this file serves as a secret to encrypt communications between your CoreHub deployment and the other components that you intend to plug in.

A bootstrap-core-hub.json file looks like the following:

{
  "apiTokenSecret": "gs-trial"
}

This is usually enough to start trying out Gluesync. Changing the secret value turns on encryption and all the Enterprise features available in the platform; this requires a valid EE license to be in place.
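
If you hold an EE license and want to replace the trial value, one possible way to create the file with a random secret is sketched below. The 64-character hex format is an assumption made for illustration, not a documented requirement, and the function name is hypothetical.

```shell
# Sketch: write bootstrap-core-hub.json with a randomly generated secret.
# Assumption: a 64-character hex string is acceptable as apiTokenSecret.
write_bootstrap_file() {
  out="${1:-bootstrap-core-hub.json}"
  # 32 random bytes rendered as 64 hex characters
  secret="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  printf '{\n  "apiTokenSecret": "%s"\n}\n' "$secret" > "$out"
}
```

Calling `write_bootstrap_file` with no argument writes bootstrap-core-hub.json in the current folder, overwriting any existing file, so keep a copy of the previous secret if other components already rely on it.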

Core Components

CoreHub

The CoreHub service is the central management system of your Gluesync deployment. In modern deployments with the reverse proxy, it looks like the following:

gluesync-core-hub:
  restart: "unless-stopped"
  image: molo17/gluesync-core-hub:beta
  labels:
  - "traefik.enable=true"
  - "traefik.http.routers.corehub.entrypoints=websecure"
  - "traefik.http.routers.corehub.rule=PathPrefix(`/`)"
  - "traefik.http.routers.corehub.tls=true"
  - "traefik.http.services.corehub.loadbalancer.server.port=1717"
  - "traefik.http.services.corehub.loadbalancer.server.scheme=https"
  - "traefik.http.routers.corehub.service=corehub"
  - "traefik.http.services.corehub.loadbalancer.passhostheader=true"
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "3.0"
  #       memory: 3.0G
  environment:
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./bootstrap-core-hub.json:/opt/gluesync/data/bootstrap-core-hub.json:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-core-hub:/opt/gluesync/data
  - ./gluesync-core-hub-logs:/opt/gluesync/logs

When using the Traefik reverse proxy, direct port forwarding (ports: - 1717:1717) is no longer needed, because Traefik handles the routing. CoreHub requires the environment variable ssl_enabled=true for secure communication, and the labels enable integration with the Traefik proxy.

Mounted files within the Gluesync installation folder:

License file: the Gluesync license file gs-license.dat (to learn more, click here);
Logback xml file: the logging framework config file (to learn more, click here);
SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt TLS communication between components and clients;
Security config: the security-config.json file, which tells Gluesync how to open the keystore containing the TLS certificate info.

Source Agents

Gluesync supports various source agents for different database systems. Here’s an example of a MariaDB CDC source agent:

gluesync-mariadb-cdc-source:
  restart: "unless-stopped"
  image: molo17/gluesync-mariadb-cdc:latest
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "2.0"
  #       memory: 2.0G
  environment:
  - type=source
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    # Time zone: defaults to UTC, you can change it to match yours (https://docs.diladele.com/docker/timezones.html)
    # - TZ=Etc/UTC
  # comment this out if you don't need the agent to connect to a locally hosted DB (via your host's localhost)
  extra_hosts:
  - "host.docker.internal:host-gateway"
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-mariadb-cdc-source:/opt/gluesync/data
  - ./gluesync-source-logs:/opt/gluesync/logs

Source Agents require the environment variable type to be set as source to indicate that this Agent should act as a Source within your deployment. The ssl_enabled=true environment variable enables secure communication with the CoreHub.

Mounted files within the Gluesync installation folder:

License file: the Gluesync license file gs-license.dat (to learn more, click here);
Logback xml file: the logging framework config file (to learn more, click here);
SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt TLS communication between components and clients;
Security config: the security-config.json file, which tells Gluesync how to open the keystore containing the TLS certificate info.

Target Agents

Gluesync supports various target agents. Here’s an example of a Couchbase target agent:

gluesync-couchbase-target:
  restart: "unless-stopped"
  image: molo17/gluesync-couchbase:latest
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "2.0"
  #       memory: 2.0G
  environment:
  - type=target
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    # Time zone: defaults to UTC, you can change it to match yours (https://docs.diladele.com/docker/timezones.html)
    # - TZ=Etc/UTC
  # comment this out if you don't need the agent to connect to a locally hosted DB (via your host's localhost)
  extra_hosts:
  - "host.docker.internal:host-gateway"
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-couchbase:/opt/gluesync/data
  - ./gluesync-couchbase-logs:/opt/gluesync/logs

Target Agents require the environment variable type to be set as target to indicate that this Agent should act as a Target within your deployment.

Mounted files within the Gluesync installation folder:

License file: the Gluesync license file gs-license.dat (to learn more, click here);
Logback xml file: the logging framework config file (to learn more, click here);
SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt TLS communication between components and clients;
Security config: the security-config.json file, which tells Gluesync how to open the keystore containing the TLS certificate info.

Specialized Modules

Chronos Module

The Chronos module is responsible for scheduling and executing time-based tasks within the Gluesync ecosystem. This module enables automation of repetitive tasks and scheduled operations.

gluesync-chronos:
  restart: "unless-stopped"
  image: molo17/gluesync-chronos:latest
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "2.0"
  #       memory: 2.0G
  environment:
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-chronos:/opt/gluesync/data
  - ./gluesync-chronos-logs:/opt/gluesync/logs

The Chronos module allows you to:

Schedule periodic database synchronization tasks
Set up data maintenance operations
Configure automated health checks and system maintenance
Define complex scheduling patterns for your data pipelines

Traefik Reverse Proxy

Gluesync deployments include a Traefik reverse proxy by default. This provides several benefits:

TLS termination and HTTPS support
Load balancing between multiple instances
Path-based routing to different services
Enhanced security through request filtering

The Traefik configuration is typically managed through labels on the services that need to be exposed, as seen in the CoreHub service configuration.

For more details about Traefik configuration options, visit the official Traefik documentation.

Adding Additional Agents to Your Deployment

For more complex data integration scenarios, you’ll often need to add multiple source and target agents to your Gluesync deployment. This section explains how to properly configure and add these agents.

Naming Convention for Multiple Agents

When adding multiple instances of the same agent type (e.g., multiple MSSQL source agents), use these best practices:

Unique service names: Each agent service in your docker-compose file should have a unique name that describes its purpose
Use separate volume paths: Each agent should have its own data directory to avoid conflicts

Here’s an example of adding multiple MSSQL source agents with different names:

gluesync-mssql-cdc-source-primary:
  restart: "unless-stopped"
  image: molo17/gluesync-mssql-cdc:latest
  environment:
  - type=source
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-mssql-primary:/opt/gluesync/data
  - ./gluesync-mssql-primary-logs:/opt/gluesync/logs

gluesync-mssql-cdc-source-secondary:
  restart: "unless-stopped"
  image: molo17/gluesync-mssql-cdc:latest
  environment:
  - type=source
  - ssl_enabled=true
  - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
  volumes:
  - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
  - ./logback.xml:/opt/gluesync/data/logback.xml:ro
  - ./security-config.json:/opt/gluesync/data/security-config.json:ro
  - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
  - ./gluesync-mssql-secondary:/opt/gluesync/data
  - ./gluesync-mssql-secondary-logs:/opt/gluesync/logs
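
Each agent above bind-mounts its own data and log folders. Docker creates missing bind-mount folders by itself, typically owned by root, so you may prefer to create them up front as your own user. The helper below is a sketch with an illustrative name; the folder names passed to it should mirror the volume mappings in your compose file.

```shell
# Sketch: create one data folder and one log folder per agent.
# Usage: create_agent_dirs <base-folder> <agent-folder>...
create_agent_dirs() {
  base="$1"
  shift
  for agent in "$@"; do
    mkdir -p "$base/$agent" "$base/${agent}-logs"
  done
}
```

For the two services above this would be `create_agent_dirs . gluesync-mssql-primary gluesync-mssql-secondary`.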

Complete Deployment Example

Here’s a complete example of a Gluesync deployment with multiple source and target agents, the Chronos module, and the Traefik reverse proxy. This configuration demonstrates a comprehensive setup that you can use as a starting point for your own deployments.

services:
  # Traefik reverse proxy configuration
  reverse-proxy:
    restart: "unless-stopped"
    image: traefik:v3.3
    command:
    - "--api.insecure=true"   # remove it in case of proper signed TLS certs
    - "--providers.docker"
    - "--providers.docker.exposedbydefault=false"
    - "--entrypoints.web.address=:80"
    - "--entrypoints.websecure.address=:443"
    ports:
    - "80:80"
    - "443:443"
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ./traefik.yml:/etc/traefik/traefik.yml:ro
    - ./certs.yml:/etc/traefik/certs.yml:ro
    - ./gluesync-cert.pem:/etc/traefik/gluesync-cert.pem:ro
    - ./gluesync-key.pem:/etc/traefik/gluesync-key.pem:ro

  # Core Hub - Central management component
  gluesync-core-hub:
    restart: "unless-stopped"
    image: molo17/gluesync-core-hub:latest
    labels:
    - "traefik.enable=true"
    - "traefik.http.routers.corehub.entrypoints=websecure"
    - "traefik.http.routers.corehub.rule=PathPrefix(`/`)"
    - "traefik.http.routers.corehub.tls=true"
    - "traefik.http.services.corehub.loadbalancer.server.port=1717"
    - "traefik.http.services.corehub.loadbalancer.server.scheme=https"
    - "traefik.http.routers.corehub.service=corehub"
    - "traefik.http.services.corehub.loadbalancer.passhostheader=true"
    environment:
    - ssl_enabled=true
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./bootstrap-core-hub.json:/opt/gluesync/data/bootstrap-core-hub.json:ro
    - ./logback.xml:/opt/gluesync/data/logback.xml:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./gluesync-core-hub:/opt/gluesync/data
    - ./gluesync-core-hub-logs:/opt/gluesync/logs

  # Chronos module - Task scheduling
  gluesync-chronos:
    restart: "unless-stopped"
    image: molo17/gluesync-chronos:latest
    labels:
    - "traefik.enable=true"
    - "traefik.http.routers.chronos.rule=PathPrefix(`/chronos/`) || PathPrefix(`/api/`) || PathPrefix(`/chronos/api/`)"
    - "traefik.http.routers.chronos.entrypoints=websecure"
    - "traefik.http.routers.chronos.tls=true"
    - "traefik.http.services.chronos.loadbalancer.server.port=1717"
    - "traefik.http.services.chronos.loadbalancer.server.scheme=https"
    - "traefik.http.routers.chronos.middlewares=chronos@docker"
    - "traefik.http.routers.chronos.service=chronos"
    - "traefik.http.middlewares.chronos.replacepathregex.regex=^/chronos/api/(.*)"
    - "traefik.http.middlewares.chronos.replacepathregex.replacement=/api/$$1"
    - "traefik.http.services.chronos.loadbalancer.passhostheader=true"
    environment:
    - SSL_ENABLED=True
    - SSL_SKIP_VERIFY=True
    - TIMEZONE=Europe/Rome   # optional: hardcode the time zone in IANA format (e.g. Europe/Rome)
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./chronos-data:/app/data
    depends_on:
      gluesync-core-hub:
        condition: service_started

  # Multiple source agents
  gluesync-mariadb-cdc-source:
    restart: "unless-stopped"
    image: molo17/gluesync-mariadb-cdc:latest
    environment:
    - type=source
    - ssl_enabled=true
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./logback.xml:/opt/gluesync/data/logback.xml:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./gluesync-mariadb-cdc-source:/opt/gluesync/data
    - ./gluesync-mariadb-cdc-source-logs:/opt/gluesync/logs

  gluesync-mssql-cdc-source:
    restart: "unless-stopped"
    image: molo17/gluesync-mssql-cdc:latest
    environment:
    - type=source
    - ssl_enabled=true
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./logback.xml:/opt/gluesync/data/logback.xml:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./gluesync-mssql-cdc-source:/opt/gluesync/data
    - ./gluesync-mssql-cdc-source-logs:/opt/gluesync/logs

  # Multiple target agents
  gluesync-couchbase-target:
    restart: "unless-stopped"
    image: molo17/gluesync-couchbase:latest
    environment:
    - type=target
    - ssl_enabled=true
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./logback.xml:/opt/gluesync/data/logback.xml:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./gluesync-couchbase-target:/opt/gluesync/data
    - ./gluesync-couchbase-target-logs:/opt/gluesync/logs

  gluesync-mongodb-target:
    restart: "unless-stopped"
    image: molo17/gluesync-mongodb:latest
    environment:
    - type=target
    - ssl_enabled=true
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat:ro
    - ./logback.xml:/opt/gluesync/data/logback.xml:ro
    - ./security-config.json:/opt/gluesync/data/security-config.json:ro
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks:ro
    - ./gluesync-mongodb-target:/opt/gluesync/data
    - ./gluesync-mongodb-target-logs:/opt/gluesync/logs

You can download the complete example from our repository and modify it to fit your specific needs.

Best Practices for Multi-Agent Deployments

When working with multiple agents, consider these best practices:

Resource allocation: Uncomment and configure the deploy: resources: section for each agent to ensure proper resource distribution
Logging configuration: Consider using separate log folders for each agent to make troubleshooting easier
Database connectivity: Ensure each agent can connect to its respective database (note the extra_hosts configuration for connecting to localhost)
Volume mounting: Use read-only (:ro) flags for configuration files that shouldn’t be modified by the containers

Image tags & versioning

You can browse our public Docker Hub repo to find the most recent image tag for each of our images.

Resources: provisions and limits

Resource limits and reservations let you manage how resources are shared between the running containers in your deployment.

By default, each of our templates ships with these values commented out, leaving resource sharing to the Docker engine itself. While this approach is fine for POCs and trials, it is not recommended for production use of the platform.

These two settings can be adjusted for each container, as in this example:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  deploy:
    resources:
      # set container limits (CPU cores & RAM)
      limits:
        cpus: "2.0"
        memory: 2.0G
      # set container reservations (CPU cores & RAM)
      reservations:
        cpus: "1.0"
        memory: 0.5G
  # ... other params ...

Limits: cap the maximum amount of resources (CPU cores & RAM) your container can use.
Reservations: reserve a certain amount of resources (CPU cores & RAM) for your container to use when needed; otherwise they are shared with the others.

You are free to combine both settings.

Do not over-provision limits. Keep in mind that if 4 cores are available at host level, limiting one container to 4 cores leaves none for the others during peak load.
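
The warning above can be turned into a quick check. The sketch below compares a planned cpus limit with the number of cores on the host and fails when a single container could claim all of them. The function name and the "strictly fewer than all cores" rule are assumptions made for illustration, not a Gluesync sizing rule.

```shell
# Sketch: fail when one container's CPU limit would cover every host core.
# Usage: cpu_limit_ok <cpus-limit> [host-cores]
cpu_limit_ok() {
  limit="$1"
  # default to the number of online cores reported by the host
  cores="${2:-$(getconf _NPROCESSORS_ONLN)}"
  # awk handles the decimal comparison (limits such as "2.0" are not integers)
  awk -v l="$limit" -v c="$cores" 'BEGIN { exit !(l + 0 < c + 0) }'
}
```

With 4 host cores, `cpu_limit_ok 2.0 4` succeeds while `cpu_limit_ok 4.0 4` fails, matching the example in the text.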

Sizing

Don’t know how much you should provision for your environment? Our professional services team is available to perform a proper sizing exercise: just fill in our sizing questionnaire.

We recommend sizing your environment properly to avoid unexpected behaviour, especially under peak loads.

Change Time zone

By default the container runs in the UTC time zone, used as the standard time reference. Console logs are therefore written in UTC, which you may want to change to ease troubleshooting and log ingestion.

To change the default time zone reference you can use this snippet, to be placed within each service:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  environment:
    - TZ=Etc/UTC
  # ... other params ...

For your convenience, you can pick the proper time zone from the list available at the following link: time zones list.

To change the time zone, specify a different value in place of the Etc/UTC used in the example, such as America/Detroit.
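
A misspelled zone name is easy to miss, so it can help to check the value before putting it in the compose file. The sketch below looks the name up in the host's tz database; it assumes the database is installed under /usr/share/zoneinfo (common on Linux and macOS), and the function name is illustrative.

```shell
# Sketch: check that a time zone name exists in the host's tz database.
# Usage: tz_exists <zone-name> [zoneinfo-folder]
# Assumption: the database lives under /usr/share/zoneinfo unless overridden.
tz_exists() {
  [ -f "${2:-/usr/share/zoneinfo}/$1" ]
}
```

For example, `tz_exists America/Detroit || echo "unknown time zone"`.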

Persistency

By default our template kits (like the Gluesync Trial Kit) come with persistency enabled, even though they are meant for trial and disposable environments. They rely on the physical storage provided by the Docker host; otherwise a docker compose down command would tear down the containers and clean up their state, removing ephemeral volumes together with the instances.

To disable persistency, comment out the relevant lines in our template docker-compose.yaml file; to enable it, add them as in the following example:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  volumes:
  # ... other params ...
    - ./gluesync-xyz:/opt/gluesync/data
  # ... other params ...

As you can see from the above example, a physical volume is mapped to the folder where the actual application data is located. If you remove this line, the volumes belonging to this container are managed by the Docker engine as ephemeral.

Networking

Networking in Docker requires some knowledge of computer networks. Here we cover a few basic settings that often come in handy when working with Docker containers, whether you need to reach databases running on the host or containers running on other hosts.

Port forwarding

To forward a specific container port, much as you would with a physical or software firewall, use this snippet for the container instances that require it:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  ports:
    - 1717:1717
  # ... other params ...

Ports can be mapped one by one or as a range. The first parameter is the externally exposed port, while the parameter on the right is the internal port exposed by the container, which you cannot change.

In this case we’re exposing port 1717 externally. Without this mapping the port is filtered by firewall rules and reachable only from containers that are part of the same Docker network.

Host networking

Sometimes you have to connect Gluesync to a database that runs locally on your host and is not attached to the same Docker network.

There are two ways to connect a Gluesync agent to a service (database, streaming service, …) running locally on the same host that runs the Docker engine:

A) Using host.docker.internal as a DNS name. This name stands in for localhost: requests you would have sent to localhost are instead resolved to the host running the Docker environment, where your services are actually running. Inside the container, localhost resolves, quite rightly, to the container itself.

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  extra_hosts:
      - "host.docker.internal:host-gateway"
  # ... other params ...

Once that is done, you can use host.docker.internal as a DNS name within the Gluesync agent’s setup wizard to point directly to the host running both your local DB and Gluesync, wherever you would otherwise have used localhost.

B) Using network_mode: host. The container shares the host’s network stack, so it behaves as if it were running directly on the same network as your host machine.

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  network_mode: host
  # ... other params ...

Which of the two should you choose? We suggest the first approach as the simplest and most effective way to test and experiment with the platform in your local lab environment. The second approach effectively forwards all traffic outside the Docker network boundaries, which lets you connect platform components deployed on other hosts.

Running

Make sure you have Docker up and running, then issue the following terminal command:

$ docker compose up -d
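
A slightly more defensive start-up sequence is sketched below, assuming you run it from the folder containing docker-compose.yaml: validate the file first, then start the services and list their state. The wrapper name is illustrative; the docker compose subcommands are standard.

```shell
# Sketch: validate the Compose file, start the stack, then show its state.
start_gluesync() {
  # "config --quiet" only validates the file and prints nothing on success
  docker compose config --quiet || return 1
  docker compose up -d || return 1
  docker compose ps
}
```

Validating first surfaces YAML indentation or syntax mistakes before any container is created.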

Logs: manage and export

Being able to collect the logs produced by the Docker engine comes in really handy when talking to support.

Docker logs can be easily exported by issuing the following command:

$ docker compose logs > logs.txt

This exports the full set of logs belonging to the Docker compose deployment into a single file named logs.txt, which can then be zipped and uploaded to our support channel.
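
To get a bundle ready for upload in one step, the export and the compression can be combined as sketched below. The function name is illustrative, and gzip is used here as one possible compressor; --no-color and --timestamps are standard docker compose logs options that keep the file free of colour codes and add a timestamp to each line.

```shell
# Sketch: export the deployment logs and compress them for support.
# Usage: export_gluesync_logs [output-file]   (defaults to logs.txt)
export_gluesync_logs() {
  out="${1:-logs.txt}"
  docker compose logs --no-color --timestamps > "$out" || return 1
  # replaces the text file with a compressed copy, e.g. logs.txt.gz
  gzip -f "$out"
}
```

Run it from the deployment folder; with no argument it leaves a logs.txt.gz file next to docker-compose.yaml.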

Custom registry

Should you need to use a custom registry, e.g. because your environment cannot reach Docker Hub’s container registry, please contact us to arrange a specific setup. You can use the provided Access Tokens to mirror the container images locally.