SQL queries JSON modelling

In this chapter you will understand the principles of data modelling throught SQL query stataments that we have introduced in this release of Gluesync. All of the operations involved in that features are all performed on-the-fly without involving local caches or any form of persistence inside Gluesync, this makes the overall process faster, secure and consistent.

Core principles

You learned about Entities handled by Gluesync throught its configuration file, where up until now each entity declaration you made was intended to point the replication engine to a real database table present into your relational database. With the introduction of this new feature with version 1.3 we removed this constraint and it is now possible to introduce a concept of virtual entity.

Virtual entities

What is a virtual entity? A virtual entity in Gluesync is a concept we introduced with this release, it’s a feature that enable our replication engine to understand and map a SQL query statement to a JSON model that will be the result of the execution of this query against your source relational database.

As per the snipped here following, you can now be able to declare a new key inside the entitiy object using the reserved key called query. This query object takes as a parameter a minified SQL query statement string and uses this against the datasource: this means that Gluesync engine is compatible with the SQL query engine version or language that your relational databases is actually able to support. You don’t have to learn a new SQL query language nor limiting you on specific platform functionalities.

Following a SQL query used for example against a MS SQL relational database:

-- orders list + customer info + order status + addresses for orders only in status == opened
select
    oh.id,
    oh.order_number,
    oh.order_date,
    c.name + ' ' + c.surname as customer,
    c.phone,
    os.status,
    d.first_name,
    ad.street + ' ' + cast(ad.street_number as varchar(max)) + ' ' + ad.city + ' ' + ad.postal_code as address,
    oh.notes as order_notes
from orders_headers oh
         inner join customers c on c.id = oh.customer_id
         inner join orders_status os on os.id = oh.status
         inner join addresses ad on ad.id = oh.address_id
         inner join drivers d on oh.driver_id = d.id
where oh.status = 1;

will result in a virtual entity map that looks like this:

"sourceEntities": {
    ...
    "orders": {
      "query" : "select oh.id, oh.order_number, oh.order_date, c.name + ' ' + c.surname as customer, c.phone, os.status, ad.street + ' ' + cast(ad.street_number as varchar(max)) + ' ' + ad.city + ' ' + ad.postal_code as address, oh.notes as order_notes from orders_headers oh inner join customers c on c.id = oh.customer_id inner join orders_status os on os.id = oh.status inner join addresses ad on ad.id = oh.address_id where oh.status = 1"
    }
  },

As a JSON document result in your NoSQL database you’ll have

{
  "id": 100,
  "order_number": "SO-71828397",
  "order_date": "2021-07-29T23:01:55Z",
  "customer": "Curtis Streets",
  "phone": "+966 443 229 2199",
  "status": "Opened",
  "address": "Rowland 107 Komsomolsk-on-Amur 681008",
  "order_notes": "Networked fault-tolerant solution",
  "type": "orders"
}

Supported SQL commands

SQL query statements offer to DBAs and software developers a wide variaety of possibilities when it comes to query and represent data sourced from tables, colomuns and rows.

Since this initial release of SQL queries JSON modelling feature we focuces on providing a robust and flexible way to represent SQL queries output into the JSON format that you expect to serve as a content to your users that are consuming the APIs you’ve attached to your NoSQL database. Giving this flexibility means that there could be corner cases where the query statement you are using in Gluesync could make the replication process slower than expected or might not support all the aggregation or business logics you are used to apply while quering your data in a relational database.

Here following we have collected all the current supported query statements that are under the suite of integration tests. We’d love to hear from you your feedback for any other not-yet mentioned SQL command or statement or function you manage to test, this will help us to improve the product for the next releases.

SQL statement Gluesync compatibility

SQL statement	Gluesync compatibility
`INNER JOIN` aggregations	✅
`OUTER JOIN` aggregations	✅
`LEFT JOIN` aggregations	✅
`WHERE` clauses	✅
string concat like `foo + ' ' + bar`	✅
`cast(foo as xxx(yyy))` operators	✅
math operators like `*` `+` `-` `/`	✅

INNER JOIN aggregations

✅

OUTER JOIN aggregations

✅

LEFT JOIN aggregations

✅

WHERE clauses

✅

string concat like foo + ' ' + bar

✅

cast(foo as xxx(yyy)) operators

✅

math operators like * + - /

✅

Subqueries are not yet supported in this version of Gluesync, we plan to add the support in the upcoming releases.

Compatibiliy matrix

RDBMS	Gluesync compatibility
Microsoft SQL Server	✅
Oracle Database	✅
PostgreSQL	✅
MariaDB	✅
MySQL	✅
Sybase	⏱ launching soon
DB2	⏱ launching soon

RDBMS

Gluesync compatibility

Microsoft SQL Server

✅

Oracle Database

✅

PostgreSQL

✅

MariaDB

✅

MySQL

✅

Sybase

⏱ launching soon

DB2

⏱ launching soon

As per our product roadmap relational databases are being added on monthly basis so expect to have SQL queries JSON modelling support on more databases soon after the initial launch.

Nested JSON objects

The current support of SQL queries JSON modelling enables you to output your query statement into a flat JSON file that represent the actual result of a normal query output that is represented in a 2 dimension table-columns representation.

An extended support of SQL queries JSON modelling is planned for the upcoming releases in order to enable more complex objects structure, resulting in the ability to use nested JSON objects, JSON arrays… you can follow our blog in order to keep track of each release announcement made available by the product team.