Google Cloud Storage Agent for Gluesync: Overview
Core principles
Object storage in Google Cloud Storage provides a flexible, scalable, and cost-effective way to store large amounts of data as files.
Gluesync can write data from supported data sources into Google Cloud Storage buckets in Parquet file format, using the native Google Cloud SDK.
Files written to the destination bucket follow best practices, including keyspace support: documents are organized in a folder path structure based on the transaction type (snapshot or changes), table name, year, month, and timestamp.
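To illustrate the folder path structure described above, here is a minimal Python sketch that builds an object key from the same components. The function name `parquet_object_key`, the `/` separator, the component order, and the epoch-millisecond file name are assumptions made for illustration only; the exact layout produced by the agent may differ.

```python
from datetime import datetime, timezone


def parquet_object_key(transaction_type: str, table: str, written_at: datetime) -> str:
    """Build a hypothetical object key: <type>/<table>/<year>/<month>/<timestamp>.parquet.

    The layout is an assumption based on the components listed in the text,
    not the agent's confirmed naming scheme.
    """
    if transaction_type not in ("snapshot", "changes"):
        raise ValueError(f"unknown transaction type: {transaction_type!r}")

    # Normalize to UTC so the year/month folders do not depend on the local zone.
    ts = written_at.astimezone(timezone.utc)
    epoch_ms = int(ts.timestamp() * 1000)

    return f"{transaction_type}/{table}/{ts:%Y}/{ts:%m}/{epoch_ms}.parquet"


if __name__ == "__main__":
    when = datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc)
    print(parquet_object_key("changes", "orders", when))
```

Partitioning keys by type, table, year, and month in this way lets downstream tools list or query a narrow prefix instead of scanning the whole bucket.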
JSON remains available as an optional format for users who prefer it. In this case, documents are grouped by source schema and table name, and each file is named after the document's primary key.
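The JSON grouping can be sketched in the same way. The helper below, `json_object_key`, is hypothetical: the `/` separator, the `.json` extension, and the percent-encoding of the primary key are illustrative assumptions, not the agent's documented behavior.

```python
from urllib.parse import quote


def json_object_key(schema: str, table: str, primary_key: object) -> str:
    """Build a hypothetical object key: <schema>/<table>/<primary key>.json.

    Percent-encoding the key is an assumption made here so that a primary key
    containing "/" does not create an extra folder level.
    """
    safe_pk = quote(str(primary_key), safe="")
    return f"{schema}/{table}/{safe_pk}.json"


if __name__ == "__main__":
    print(json_object_key("sales", "orders", 1042))
```

With one file per primary key, a later write for the same row would land on the same object name, which is one reason a key-based naming scheme is convenient for document-style storage.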
Change data capture
This agent does not currently support reading incremental changes from Google Cloud Storage.
Q&A
I am looking to store other file formats, such as CSV or XML. Is that supported?
We are open to supporting a wider range of use cases, including additional file formats for your object storage needs. Please reach out and let us know what you need: we are happy to consider your feature request.