Neosync

Open source data anonymization platform for Developers

5.0
1 review

102 followers

Neosync is an open source ETL platform with built-in data anonymization and synthetic data tools that developers use to:
- anonymize production data so they can safely use it locally
- detect & redact sensitive data in LLM prompts
- generate synthetic data

Evis Drenova
Hey Product Hunt! I'm Evis and I'm excited to introduce Neosync. Neosync is an open source data platform that helps engineering teams anonymize production data and sync it across their environments so they can safely use it locally.
Developers use Neosync to reproduce production bugs locally and have a better developer experience while building and testing their code. Data engineers use Neosync to build and test their data pipelines against anonymized production data and synthetic data without having to wait for access to production data. ML/AI engineers use Neosync to detect and anonymize sensitive data in LLM prompts and to anonymize data before training or fine-tuning models.
We started working on Neosync after spending years hand-writing test data that was never good enough and always needed updating as our schema changed. One day, we thought, “What if, instead of manually writing and managing this test data, we just anonymized our production data, took a slice of it and used that locally? We can reproduce any data issues we see in prod locally and we won’t have to manually update it every time we add a column.” And Neosync was born!
At its core, Neosync does three things:
(1) It streams data from a source to one or more destination databases or object stores. We never store your sensitive data.
(2) While that data is being streamed, we transform it. You define which schemas and tables you want to sync and, at the column level, select one of 50+ pre-built transformers or write your own in code to define how the data is anonymized or how synthetic data is generated.
(3) Optionally, it subsets your data based on pretty much any SQL query.
We do all of this while handling referential integrity. Whether you have primary keys, foreign keys, unique constraints, circular dependencies (within a table and across tables), sequences and more, Neosync preserves those references without you having to do anything.
Today, Neosync can connect to Postgres, MySQL, SQL Server, MongoDB, DynamoDB, S3, GCS and more. We also ship with APIs, a Terraform provider, and a CLI that you can use locally. We’d love your feedback on what we're building and whether it could be useful for you. We’re happy to talk through use cases and answer any questions that you have!
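To make the three steps above a bit more concrete, here is a minimal Python sketch of the general idea: subset a source, apply column-level transformers while rows stream through, and keep foreign keys consistent by mapping IDs deterministically. This is not Neosync's API or implementation. The function names (`pseudonymize_id`, `mask_email`, `transform_rows`), the `TRANSFORMERS` mapping, the salt, and the sample tables are all hypothetical and exist only for illustration.

```python
import hashlib

# Hypothetical column-level transformers; NOT Neosync's actual API.
def pseudonymize_id(value, salt="demo-salt"):
    """Deterministically map an ID: the same input always yields the same output.

    Truncating the hash to 48 bits keeps the sketch short; collisions are
    unlikely but possible, so a real system would need to handle them.
    """
    digest = hashlib.sha256(f"{salt}:{value}".encode()).hexdigest()
    return int(digest[:12], 16)

def mask_email(value):
    """Replace the address with a hash-derived token on a reserved example domain."""
    token = hashlib.sha256(value.encode()).hexdigest()[:10]
    return f"user_{token}@example.com"

# Illustrative config: table name -> {column name -> transformer}.
TRANSFORMERS = {
    "users": {"id": pseudonymize_id, "email": mask_email},
    "orders": {"user_id": pseudonymize_id},
}

def transform_rows(table, rows):
    """Yield transformed rows one at a time, the way a stream would."""
    rules = TRANSFORMERS.get(table, {})
    for row in rows:
        yield {col: rules[col](val) if col in rules else val
               for col, val in row.items()}

# Tiny in-memory stand-ins for source tables (assumed sample data).
users = [{"id": 1, "email": "ada@corp.com"}, {"id": 2, "email": "bob@corp.com"}]
orders = [{"order_id": 10, "user_id": 1}, {"order_id": 11, "user_id": 2}]

# Subset first: a simple predicate standing in for a SQL WHERE clause,
# then keep only the child rows that reference the surviving parents.
subset_users = [u for u in users if u["id"] == 1]
kept_ids = {u["id"] for u in subset_users}
subset_orders = [o for o in orders if o["user_id"] in kept_ids]

anon_users = list(transform_rows("users", subset_users))
anon_orders = list(transform_rows("orders", subset_orders))
```

Because `users.id` and `orders.user_id` go through the same deterministic function, the anonymized order still points at the anonymized user. That is one simple way to preserve a foreign key; unique constraints, sequences and circular dependencies need more machinery than this sketch shows.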
Alex Osmonov
If you're a developer, you know how annoying it is to thoroughly test your changes ("well it works in dev"), deploy your code, and then get pinged about why the latest changes have a bunch of bugs. For our team, this typically happens when our staging & dev databases have poor data that is not representative of the production environment, leading to poor handling of edge cases. Neosync is pretty awesome for improving this & also allows us to avoid building/maintaining a bunch of annoying ETL jobs. Good luck with the launch @edrev !
Evis Drenova
@alexosmonov thanks, Alex! Glad that you're seeing the value in the platform!
Daniel Farrell
Woah such a cool product! Loved walking through that demo! Who did your guys' branding too? Super crisp product... can't wait to share this with my colleagues!