Docker compose

The application is shipped as a set of Docker images, listed as services under the docker-compose.yaml file.

Looking for a Gluesync docker-compose template YAML file? Request a Gluesync trial and you will receive a pre-filled docker-compose.yaml template straight to your mailbox.

Pre-flight checks

To ensure a smooth experience running the Gluesync Docker containers with docker-compose, please walk through the following checklist and verify that your environment is ready.

  • The environment can host the Gluesync platform as per the recommended specification;

  • The environment can reach the Internet, in particular the MOLO17 container registry at https://registry.hub.docker.com and its sub-paths;

  • The environment can reach both source and destination databases on the relevant ports/protocols;

  • You have already read the Gluesync architecture chapter to understand how Gluesync works and how it is composed.

Getting started

In the following sections you’ll learn how a standard docker-compose.yaml file for Gluesync is composed, how to build and customize your own, and how some Docker compose features help you reach your goals.

A basic Gluesync deployment consists of at least three components:

  • CoreHub;

  • Source Agent;

  • Target Agent.

With this minimum set of components you’ll be able to set up one pipeline and replicate as many entities (tables, collections, …) as you need from source to target.

If you want to run more pipelines, you’ll need to add more source and target agents to your deployment.
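
As a rough outline of the point above, a deployment with two pipelines could declare one CoreHub plus two source agents and two target agents. This is only a sketch: the service names with the -1 and -2 suffixes are hypothetical, the images are simply the ones used as examples later on this page, and reusing the same agent image for both pipelines is an assumption that depends on your databases. The volumes and the other per-service settings described in the following sections are omitted for brevity.

```yaml
services:
  # one CoreHub for the whole deployment
  gluesync-core-hub:
    image: molo17/gluesync-core-hub:2.0.0
    environment:
      - type=corehub

  # pipeline 1 (hypothetical service names)
  gluesync-source-1:
    image: molo17/gluesync-mysql-cdc:2.0.0
    environment:
      - type=source
  gluesync-target-1:
    image: molo17/gluesync-vertica:2.0.0
    environment:
      - type=target

  # pipeline 2 (hypothetical service names)
  gluesync-source-2:
    image: molo17/gluesync-mysql-cdc:2.0.0
    environment:
      - type=source
  gluesync-target-2:
    image: molo17/gluesync-vertica:2.0.0
    environment:
      - type=target
```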

Necessary files

Two files are required within your deployment to get Gluesync up and running via Docker compose:

  • A valid license file, in .dat format, to be mounted to every container;

  • A bootstrap file, named bootstrap-core-hub.json, which holds the unique secret exchanged via our consensus protocol to verify each agent’s signature and to provide end-to-end encryption. This file is unique to each deployment.

Don’t have a valid license file? You can always get a trial license by filling in our online web form, or contact our sales department directly by email: Contact sales.

Bootstrap file

As mentioned, this file serves as a secret to encrypt communications within your CoreHub deployment and with the other components that you intend to plug in.

A bootstrap-core-hub.json file looks like the following:

{
  "apiTokenSecret": "gs-trial"
}

This is usually enough to start trying out Gluesync. Changing the secret value turns on encryption and all the Enterprise features available within the platform; this requires a valid EE license to be in place.

CoreHub

The CoreHub service section in docker-compose.yaml looks like the following:

gluesync-core-hub:
  image: molo17/gluesync-core-hub:2.0.0
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "3.0"
  #       memory: 3.0G
  environment:
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    - type=corehub
    # Time zone: defaults to UTC, you can change it to match yours (https://docs.diladele.com/docker/timezones.html)
    # - TZ=Etc/UTC
  ports:
    - 1717:1717
  volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
    - ./bootstrap-core-hub.json:/opt/gluesync/data/bootstrap-core-hub.json
    - ./logback.xml:/opt/gluesync/data/logback.xml
    - ./security-config.json:/opt/gluesync/data/security-config.json
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
    - ./gluesync-core-hub:/opt/gluesync/data

Forwarding port 1717 is necessary to get external access to the Gluesync REST APIs and to the Web UI Control plane. The Core Hub also requires the environment variable type to be set to corehub, which indicates that this component takes the Core Hub role within your deployment.

Mounted files within the Gluesync installation folder:

  • License file: the Gluesync license file gs-license.dat; to learn more about it, click here;

  • Logback xml file: the logging framework config file; to learn more about it, click here;

  • SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt the TLS communication between components and clients;

  • Security config: the security-config.json file, used to instruct Gluesync on how to open the keystore containing the TLS certificate info.

Source Agent

gluesync-mysql-cdc:
  image: molo17/gluesync-mysql-cdc:2.0.0
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "2.0"
  #       memory: 2.0G
  environment:
    - type=source
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    # Time zone: defaults to UTC, you can change it to match yours (https://docs.diladele.com/docker/timezones.html)
    # - TZ=Etc/UTC
  # uncomment the following if you require the agent to connect to a locally hosted DB (via your host's localhost)
  # extra_hosts:
  #   - "host.docker.internal:host-gateway"
  volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
    - ./logback.xml:/opt/gluesync/data/logback.xml
    - ./security-config.json:/opt/gluesync/data/security-config.json
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
    - ./gluesync-mysql-cdc:/opt/gluesync/data

Source Agents require the environment variable type to be set to source, which indicates that this Agent acts as a Source within your deployment.

Mounted files within the Gluesync installation folder:

  • License file: the Gluesync license file gs-license.dat; to learn more about it, click here;

  • Logback xml file: the logging framework config file; to learn more about it, click here;

  • SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt the TLS communication between components and clients;

  • Security config: the security-config.json file, used to instruct Gluesync on how to open the keystore containing the TLS certificate info.

Target Agent

gluesync-vertica:
  image: molo17/gluesync-vertica:2.0.0
  # deploy:
  #   resources:
  #     limits:
  #       cpus: "2.0"
  #       memory: 2.0G
  environment:
    - type=target
    - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    # Time zone: defaults to UTC, you can change it to match yours (https://docs.diladele.com/docker/timezones.html)
    # - TZ=Etc/UTC
  # uncomment the following if you require the agent to connect to a locally hosted DB (via your host's localhost)
  # extra_hosts:
  #   - "host.docker.internal:host-gateway"
  volumes:
    - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
    - ./logback.xml:/opt/gluesync/data/logback.xml
    - ./security-config.json:/opt/gluesync/data/security-config.json
    - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
    - ./gluesync-vertica:/opt/gluesync/data

Target Agents require the environment variable type to be set to target, which indicates that this Agent acts as a Target within your deployment.

Mounted files within the Gluesync installation folder:

  • License file: the Gluesync license file gs-license.dat; to learn more about it, click here;

  • Logback xml file: the logging framework config file; to learn more about it, click here;

  • SSL Keystore file: the gluesync.com.jks file containing the keystore used to encrypt the TLS communication between components and clients;

  • Security config: the security-config.json file, used to instruct Gluesync on how to open the keystore containing the TLS certificate info.
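
Putting the three service definitions above together, a complete docker-compose.yaml might look like the following sketch. The top-level services key is standard Compose syntax; everything underneath is copied from the snippets above, with the optional commented-out settings (resource limits, time zone, extra_hosts) left out. It assumes that all the mounted files sit in the same folder as the docker-compose.yaml file.

```yaml
services:
  gluesync-core-hub:
    image: molo17/gluesync-core-hub:2.0.0
    environment:
      - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
      - type=corehub
    ports:
      # REST APIs and Web UI Control plane
      - "1717:1717"
    volumes:
      - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
      - ./bootstrap-core-hub.json:/opt/gluesync/data/bootstrap-core-hub.json
      - ./logback.xml:/opt/gluesync/data/logback.xml
      - ./security-config.json:/opt/gluesync/data/security-config.json
      - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
      - ./gluesync-core-hub:/opt/gluesync/data

  gluesync-mysql-cdc:
    image: molo17/gluesync-mysql-cdc:2.0.0
    environment:
      - type=source
      - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
      - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
      - ./logback.xml:/opt/gluesync/data/logback.xml
      - ./security-config.json:/opt/gluesync/data/security-config.json
      - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
      - ./gluesync-mysql-cdc:/opt/gluesync/data

  gluesync-vertica:
    image: molo17/gluesync-vertica:2.0.0
    environment:
      - type=target
      - LOG_CONFIG_FILE=/opt/gluesync/data/logback.xml
    volumes:
      - ./gs-license.dat:/opt/gluesync/data/gs-license.dat
      - ./logback.xml:/opt/gluesync/data/logback.xml
      - ./security-config.json:/opt/gluesync/data/security-config.json
      - ./gluesync.com.jks:/opt/gluesync/data/gluesync.com.jks
      - ./gluesync-vertica:/opt/gluesync/data
```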

Image tags & versioning

You can browse our public Docker Hub repo to find the most recent image tag for each of our images.

Resources: provisions and limits

Resource limits and reservations let you control how resources are shared among the containers running within your deployment.

By default, each of our templates comes with these values commented out, leaving resource sharing to the Docker engine itself. While this approach is fine for POCs and trials, it is not recommended for production use of the platform.

These two settings can be adjusted for each container by following this example:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  deploy:
    resources:
      # set container limits (CPU cores & RAM)
      limits:
        cpus: "2.0"
        memory: 2.0G
      # set container reservations (CPU cores & RAM)
      reservations:
        cpus: "1.0"
        memory: 0.5G
  # ... other params ...

Limits: use limits to cap the maximum amount of resources (CPU cores & RAM) your container can use.

Reservations: reserve a certain amount of resources (CPU cores & RAM) for your container to use when needed; when the container does not need them, they are shared with the others.

You are free to combine and tune both settings.
Do not over-provision limits. Keep in mind that if 4 cores are available at host level, letting one container use all 4 can leave the other containers without cores during peak load.
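
To make the warning above concrete, here is a sketch for a host with 4 cores in which the CPU limits of the three containers add up to 3.5 cores, so that no single container can take the whole host. The CPU figures are purely illustrative assumptions, not a sizing recommendation; the memory values are the ones that appear commented out in the templates above.

```yaml
services:
  gluesync-core-hub:
    image: molo17/gluesync-core-hub:2.0.0
    deploy:
      resources:
        limits:
          cpus: "1.5"   # illustrative value
          memory: 3.0G
  gluesync-mysql-cdc:
    image: molo17/gluesync-mysql-cdc:2.0.0
    deploy:
      resources:
        limits:
          cpus: "1.0"   # illustrative value
          memory: 2.0G
  gluesync-vertica:
    image: molo17/gluesync-vertica:2.0.0
    deploy:
      resources:
        limits:
          cpus: "1.0"   # illustrative value
          memory: 2.0G
```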

Sizing

Don’t know how much you should provision for your environment? Fill in our sizing questionnaire and our professional services team will perform a proper sizing exercise.

We recommend sizing your environment properly to avoid unexpected behaviour, especially under peak loads.

Change Time zone

By default the container runs in the UTC time zone, used as the standard time reference. Console logs are therefore timestamped in UTC by default; you may want to change this to ease troubleshooting and log ingestion.

To change the default time zone, add this snippet to each service:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  environment:
    - TZ=Etc/UTC
  # ... other params ...

For your convenience, you can pick the proper time zone from the list available at the following link: time zones list.

The time zone can then be changed by specifying a different value instead of the Etc/UTC used in the example, such as America/Detroit.
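
As a small illustration, Compose accepts environment variables either as a list of KEY=value strings or as a map of KEY: value pairs, but not a mix of the two within one service. The sketch below sets the America/Detroit zone in both forms; the service names gluesync-xyz and gluesync-abc and the image tags are placeholders, not real images.

```yaml
services:
  gluesync-xyz:
    image: molo17/gluesync-xyz:version.tag
    # list form: one KEY=value string per entry
    environment:
      - TZ=America/Detroit

  gluesync-abc:
    image: molo17/gluesync-abc:version.tag
    # map form: KEY: value pairs
    environment:
      TZ: "America/Detroit"
```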

Persistency

By default our template kits (like the Gluesync Trial Kit) come with persistency enabled. Even though they are meant for trial environments that are later disposed of, they rely on physical storage made available through the Docker engine; otherwise a docker compose down command would tear down the containers and clean up their state, removing the ephemeral volumes together with the instances.

To disable persistency, comment out the corresponding lines in our template docker-compose.yaml file; to enable it, add the line as in the following example:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  volumes:
  # ... other params ...
    - ./gluesync-xyz:/opt/gluesync/data
  # ... other params ...

As you can see from the example above, a physical volume is mapped to the folder where the actual application data is located. If you remove this entry, the volumes belonging to this container will be managed by the Docker engine as ephemeral.

Networking

Networking in Docker requires some knowledge of computer networks. This section covers a few basic settings that often come in handy when working with Docker containers, for example when an agent must reach a database hosted on the host machine or when containers must reach each other.

Port forwarding

To forward a specific container port, much like you would with a physical or software firewall, use this snippet for the container instances that require it:

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  ports:
    - 1717:1717
  # ... other params ...

Ports can be mapped one by one or as a range. The first parameter is the external port exposed on the host, while the parameter to the right is the internal port exposed by the container, which you cannot change.

In this case we’re exposing port 1717 externally; otherwise it would be filtered by firewall rules and reachable only from containers that are part of the same Docker network.
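
As a sketch of the HOST:CONTAINER mapping described above, the following publishes the CoreHub's internal port 1717 on a different host port. The host port 8080 is an arbitrary choice made for illustration; the commented alternative uses the standard Compose short syntax with a host IP to make the port reachable from the host itself only.

```yaml
services:
  gluesync-core-hub:
    image: molo17/gluesync-core-hub:2.0.0
    ports:
      # HOST:CONTAINER, container port 1717 published on host port 8080
      - "8080:1717"
      # alternative: bind to the loopback address so only the host can reach it
      # - "127.0.0.1:1717:1717"
```

With this mapping the Web UI and REST APIs would be reached on port 8080 of the host, while the other containers in the same Docker network keep using port 1717.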

Host networking

Sometimes you have to connect Gluesync to a database that runs locally in your host’s environment and is not directly attached to the same Docker network.

There are two ways to connect a Gluesync agent to a service (database, streaming service, …) running locally on the same host that runs the Docker engine:

A) Using host.docker.internal as a DNS name. This name takes the place of localhost: requests you would have pointed to localhost are instead resolved to the host that runs the Docker environment, where your services are actually running. Inside a container, localhost would instead resolve, quite rightly, to the container itself.

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  extra_hosts:
      - "host.docker.internal:host-gateway"
  # ... other params ...

Once this is done, you can use host.docker.internal as a DNS name within the Gluesync agent’s setup wizard to point directly to the host that runs both your local DB and Gluesync, which is what you would otherwise have called localhost.

B) Using network_mode: host. This makes the container share the host’s network stack, so the result is as if the container were running directly on the same network as your host machine.

gluesync-xyz:
  image: molo17/gluesync-xyz:version.tag
  # ... other params ...
  network_mode: host
  # ... other params ...

Which of the two should you choose? We suggest the first approach as the simplest and most effective way to test and experiment with the platform in a local lab environment. The second approach forwards all traffic outside the Docker network boundaries, which lets you connect platform components deployed on other hosts.

Running

Make sure you have Docker up and running, then issue the following terminal command:

$ docker compose up -d

Logs: manage and export

Being able to collect the logs produced by the Docker engine itself comes in really handy when talking with support.

Docker logs can be easily exported by issuing the following command:

$ docker compose logs > logs.txt

This exports the full set of logs belonging to the Docker compose deployment into a single file named logs.txt, which can then be zipped and uploaded to our support channel.

Custom registry

Should you need to use a custom registry, e.g. because your environment cannot reach Docker Hub’s container registry, please contact us to arrange a specific setup. You can use the provided Access Tokens to mirror the container images locally.