Gluesync Architecture

Gluesync’s architecture is built on a flexible, agent-based system centered around the Core Hub, with extensibility through SDKs and modules. This design enables real-time data integration across diverse platforms while ensuring scalability, reliability, and security.

Architecture Overview

The architecture consists of five main components:

Source Agents: Handle data extraction and change capture
Core Hub: Orchestrates data flow and system operations
Target Agents: Manage data loading and transformation
SDKs: Enable developer integration with Core Hub
Modules: Extend platform functionality through SDK-based components

Core Components

Core Hub

The Core Hub is the central orchestrator, featuring:

High Availability
- Two Core Hub instances (Core Hub #1 and #2) for redundancy
- Automatic failover support
- Load balancing capabilities
Data Processing
- Aggregation engine
- Denormalization support
- Transaction coordination
Management
- Administration interface
- Monitoring dashboard
- Security controls

Source Agents

Agent Type	Capabilities
NoSQL CDC	Natively integrated NoSQL CDC; real-time change capture; event-based replication
RDBMS	Transaction-log based CDC; SQL database support; real-time monitoring
Data Lakes	AWS S3 and S3-like support; Dell ECS integration; Azure Blob Storage support

Target Agents

Agent Type	Features
NoSQL	Native integration; load balancing; multi-target support
RDBMS	JDBC-oriented connections; transaction pool management; bulk loading capabilities
Data Lakes & Streaming	S3 and blob storage support; Kafka integration; Solace messaging support

Communication Flow

Data Flow Diagram

WebSocket connections enable real-time bidirectional communication
Core Hub processes and transforms data in-flight
Multi-target support allows for parallel data distribution
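
As an illustration of the flow above, the following Python sketch transforms one event in flight and then distributes it to several targets concurrently. The names `transform`, `send_to_target`, and `distribute`, the target labels, and the field rename are all hypothetical and are not part of any Gluesync API; a real deployment would deliver events to target agents over WebSocket connections rather than writing to an in-memory dictionary.

```python
import asyncio


def transform(event):
    # Hypothetical in-flight transformation: rename the "id" field.
    out = dict(event)
    out["customer_id"] = out.pop("id")
    return out


async def send_to_target(name, event, delivered):
    # Stand-in for a write performed by a target agent (assumption).
    await asyncio.sleep(0)  # yield control, as real network I/O would
    delivered.setdefault(name, []).append(event)


async def distribute(event, targets):
    # Transform once, then send to every target concurrently.
    delivered = {}
    transformed = transform(event)
    await asyncio.gather(
        *(send_to_target(t, transformed, delivered) for t in targets)
    )
    return delivered


result = asyncio.run(distribute({"id": 7, "name": "Ada"}, ["nosql", "kafka"]))
```

Transforming once before the fan-out keeps every target consistent with the same payload, which is one plausible reading of parallel data distribution.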

Source agent caching layer

Overview

Starting from Gluesync 2.1.9, supported source agents include an embedded, persistent caching layer. Instead of streaming changes directly from the source database to the Core Hub, agents first write inbound changes to a local, append-only queue built on top of Chronicle Queue.

At a high level, this caching layer:

Decouples the source database from the Core Hub consumer
Reduces load on the source system by minimizing long-lived cursors and connections
Stores changes durably on disk, allowing for safe pause/resume of pipelines
Provides a replayable history window (configurable retention) for recovery scenarios

Chronicle Queue is a high-performance, append-only log implementation designed for low-latency systems. It stores data in memory-mapped files on disk, organizing entries into sequential segments. This makes it a good fit for Gluesync’s caching layer, where agents continuously append new change events and readers consume them in order without incurring heavy garbage collection or complex locking.

How it fits in the architecture

From an architectural perspective, the Chronicle Queue–based cache sits inside the source agent process:

The source agent ingests changes from the database (for example, from a journal, transaction log, or CDC API) and appends them to the local queue.
A separate internal worker within the agent reads from the queue and forwards events to the Core Hub over the usual WebSocket connection.
If the Core Hub or network is temporarily unavailable, the agent continues to cache new changes locally until the connection is restored, then drains the backlog.

This design keeps the Core Hub stateless with respect to source-side buffering while giving operators a predictable and tunable buffer at the edge of each source system.
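
The behavior described above can be sketched in a few lines of Python. This is a simplified, in-memory stand-in and not the agent's actual implementation: `LocalChangeQueue` and `FakeHub` are hypothetical names, the real cache is persisted on disk by Chronicle Queue, and the real forwarder talks to the Core Hub over a WebSocket connection.

```python
class LocalChangeQueue:
    """Append-only log with a single read cursor (in-memory stand-in)."""

    def __init__(self):
        self._entries = []
        self._cursor = 0  # index of the next entry to forward

    def append(self, event):
        # Ingestion side: changes are only ever appended.
        self._entries.append(event)

    def backlog(self):
        # Number of entries not yet forwarded.
        return len(self._entries) - self._cursor

    def drain(self, send):
        # Forwarder side: send pending entries in order and stop at the
        # first connection failure, leaving the cursor on the failed entry.
        forwarded = 0
        while self._cursor < len(self._entries):
            try:
                send(self._entries[self._cursor])
            except ConnectionError:
                break
            self._cursor += 1
            forwarded += 1
        return forwarded


class FakeHub:
    """Hypothetical Core Hub endpoint that can be switched offline."""

    def __init__(self):
        self.online = True
        self.received = []

    def send(self, event):
        if not self.online:
            raise ConnectionError("Core Hub unreachable")
        self.received.append(event)


queue = LocalChangeQueue()
hub = FakeHub()

queue.append({"op": "insert", "id": 1})
queue.drain(hub.send)  # hub online: forwarded right away

hub.online = False  # simulate an outage
queue.append({"op": "update", "id": 1})
queue.append({"op": "delete", "id": 1})
queue.drain(hub.send)  # nothing is forwarded, changes stay cached
pending_during_outage = queue.backlog()

hub.online = True  # connection restored
queue.drain(hub.send)  # backlog is drained in the original order
```

Because the cursor only advances after a successful send, an outage never drops or reorders changes in this sketch; they simply wait in the queue.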

Benefits

Key benefits of the source agent caching layer include:

Lower source footprint – fewer active connections and reduced dependency on database-side caching artifacts.
Operational resilience – short outages or maintenance windows on the Core Hub side do not force resynchronizations, as changes remain in the agent cache.
Configurable retention – cache retention can be tuned to balance disk usage and recovery window length.
Simplified scaling – agents handle local buffering, so Core Hub instances can scale independently for processing and routing (independent scaling is coming soon).
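
To make the retention trade-off concrete, the rough calculation below estimates how long a recovery window a given disk budget can hold. The function name, its parameters, and the sample figures are assumptions chosen for illustration; they are not Gluesync configuration settings, and the estimate ignores any per-entry overhead added by the queue's on-disk format.

```python
def retention_window_hours(disk_budget_gb, events_per_second, avg_event_bytes):
    # Bytes written to the cache per hour at a steady change rate.
    bytes_per_hour = events_per_second * avg_event_bytes * 3600
    # Hours of history that fit within the disk budget (1 GB = 1024**3 bytes).
    return disk_budget_gb * 1024**3 / bytes_per_hour


# Example: 50 GB of disk, 2,000 changes per second, 512 bytes per change.
hours = retention_window_hours(
    disk_budget_gb=50, events_per_second=2000, avg_event_bytes=512
)
```

With these sample figures the window comes out at roughly 14.5 hours, so an outage longer than that would exceed what the cache could replay under these assumptions.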

Advanced Features

Data Processing

Aggregation Engine
- Real-time data aggregation
- Custom aggregation rules
- Performance optimization

Denormalization

Smart Denormalization
- Configurable strategies
- Automatic schema mapping
- Performance tuning
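
As a generic illustration of denormalization, the Python sketch below embeds child rows from a relational source into their parent record, producing document-shaped output suitable for a NoSQL target. The table names, field names, and the `denormalize` function are hypothetical; this shows the general technique and not how Gluesync configures or executes its strategies.

```python
from collections import defaultdict


def denormalize(orders, order_lines):
    # Group child rows by their foreign key.
    lines_by_order = defaultdict(list)
    for line in order_lines:
        lines_by_order[line["order_id"]].append(
            {"sku": line["sku"], "qty": line["qty"]}
        )
    # Embed each group in its parent row; orders without lines get [].
    return [{**order, "lines": lines_by_order[order["id"]]} for order in orders]


orders = [{"id": 1, "customer": "Ada"}, {"id": 2, "customer": "Linus"}]
order_lines = [
    {"order_id": 1, "sku": "A-100", "qty": 2},
    {"order_id": 1, "sku": "B-200", "qty": 1},
]
documents = denormalize(orders, order_lines)
```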

Multi-Source Support

Event-based Integration
- Multiple source connections
- Parallel processing
- Consistent ordering
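
One common way to obtain a consistent order across several sources is to merge their already-ordered event streams on a shared key. The sketch below does this with Python's standard `heapq.merge`; the `ts` field and the sample events are assumptions made for illustration, and the source does not state which ordering mechanism Gluesync uses internally.

```python
import heapq

# Each source stream is assumed to be ordered by its own "ts" value.
source_a = [{"ts": 1, "src": "a"}, {"ts": 4, "src": "a"}]
source_b = [{"ts": 2, "src": "b"}, {"ts": 3, "src": "b"}]

# heapq.merge lazily interleaves sorted inputs into one sorted stream.
merged = list(heapq.merge(source_a, source_b, key=lambda e: e["ts"]))
```

Because the merge is lazy, it also works with generators, so streams do not need to be fully buffered before they are combined.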

Deployment Options

Flexible Deployment Models

Model	Description	Best For
On-Premises	Full deployment within your infrastructure	High security requirements
Cloud	Deployment on AWS, Azure, or GCP	Scalability and flexibility
Hybrid	Mix of on-premises and cloud components	Balanced approach

Security Architecture

Security Measures

Network Security
- TLS encryption
- Secure WebSocket connections
- Network isolation options
Authentication
- API key authentication
- Role-based access control
- Session management
Data Protection
- In-transit encryption
- Secure credential storage
- Audit logging

SDKs and Developer Integration

Open Source SDKs

Gluesync provides open source SDKs that enable developers to build custom integrations and extensions:

SDK	Features
Python SDK	Core Hub handshake protocol implementation; authentication and authorization; event handling and processing; available at GitLab
Node.js SDK	JavaScript-based Core Hub integration; real-time event processing; Promise-based API design; available at GitLab
Coming Soon	Java SDK; Kotlin SDK

Developer Benefits

Open Platform
- Build custom agents and modules
- Extend Core Hub functionality
- Access Gluesync APIs securely
Integration Support
- Comprehensive documentation
- Reference implementations
- Community-supported examples

Modules

Platform Extensions

Modules are platform extensions built on top of Gluesync SDKs that enhance system capabilities:

Module	Functionality
Chronos	Advanced scheduling capabilities; time-based job orchestration; recurring task management; available at GitLab
Bootstrapper	System initialization; configuration management; deployment automation; available at GitLab
Conductor	Automated agent deployment; resource allocation policies; container lifecycle orchestration; see Conductor documentation
Whisperer	Automated database operations; multi-database connectivity; load testing and PillowFight tooling; available at GitLab
Automator	Web-based Bootstrapper execution; graphical configuration management; cross-platform packaged executable; see Automator documentation
Convert DBMoto Metadata XML	Syniti metadata.xml conversion; Bootstrapper template generation; migration workflow guidance; see Conversion guide

Module Architecture

Modules connect to Core Hub through SDK interfaces
Custom modules can extend platform capabilities
Open architecture enables third-party development

Monitoring & Administration

Built-in Monitoring

Real-time metrics collection
Performance monitoring
Resource utilization tracking
Alert management

Administration Tools

Web-based admin interface
REST API access
Configuration management
System health monitoring

For detailed deployment instructions, see our Deployment Guide for Docker Compose and Kubernetes.