Hi PH! I'm one of the makers of Dataform, and we're excited to show you what we've built.
With Dataform you can:
🛠 Build complex SQL workflows to transform raw data into prepared datasets
💻 Ensure data quality with tests and follow software engineering best practices for data management
🔔 Receive notifications and alerts if things go wrong
🔎 Document your data and discover datasets in a data catalog
Dataform is really two products in one:
First, it's an open source SDK that sits on top of SQL and makes it easy to define new data tables, re-use code across scripts, test data quality and document your data.
On top of that SDK, we also built an online platform that lets you collaborate as a team, develop quickly and avoid maintaining infrastructure. In 5 minutes you can write a query, test it, add it to a schedule, push the changes to GitHub (or submit a PR) and get alerted if one of your jobs fails.
We would love to hear your feedback and answer any questions you might have!
Thanks!
Dataform has changed the way that we process and transform data stored in BigQuery and Redshift warehouses - can't recommend this enough for data teams building out complex SQL-based workflows. I like to think of Dataform as an "IDE for the data warehouse", where teams can leverage best practices from software development (CI/CD, QA checks and assertions), ensuring that downstream stakeholders are always consuming data of high quality.
I couldn't recommend Dataform more highly to any data team. Dataform has enabled us to fundamentally change the way we work, giving us the ability to easily schedule and perform quality checks on our data. We can then be sure that our data consumers throughout the business are using the single source of truth we have defined. It is an indispensable part of our stack, sitting between our data warehouse and our self-service data tool (we use Looker).