Unimodaly Ingest

Auto-convert multimodal data into ML-ready datasets

5.0
1 review

6 followers

Unimodaly Ingest is the world’s first truly unified data-ingestion CLI for machine learning. It automatically detects text, image, audio and tabular files, then validates, samples and augments them into a single, schema-validated dataset ready for training.
Free
Launch Team

Le'Andre Nash
Maker
Unimodaly Ingest is the world’s first truly unified data-ingestion CLI for machine learning. It automatically detects text, image, audio and tabular files, then validates, samples and augments them into a single, schema-validated dataset ready for training. With built-in metadata extraction and support for JSON/JSONL/CSV exports, you’ll cut your dataset-prep time from hours to minutes. Open-source, cross-platform and extensible, it’s ideal for data engineers, researchers and AI startups everywhere.

Key Features

- Multi-modal Data Detection: Auto-detects text, image, audio and tabular formats.
- Schema Validation: Plug in custom or default JSON schemas to enforce data quality.
- Data Augmentation: Synonym swaps in text, flips/rotations in images, noise in audio, sampling in tables.
- Flexible Sampling: Control dataset size with simple ratios (e.g. 0.8 keeps 80% of the records).
- Multiple Outputs: Export as JSON, JSONL or CSV with rich metadata and feature fields.
- Batch Processing: Scales to large corpora with configurable batch sizes.
- Config Management: A single --init flag scaffolds a .unimodaly.config.json for pipelines.
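
To make the detection, sampling and export features above more concrete, here is a minimal Python sketch of the general idea. It is not Unimodaly Ingest's actual code or API: the extension map, the function names (detect_modality, sample_records, to_jsonl) and the record layout are all assumptions made up for illustration, and the real tool may detect formats quite differently (for example by inspecting file contents).

```python
import json
import random
from pathlib import Path

# Hypothetical extension map, assumed for illustration only.
MODALITY_BY_EXT = {
    ".txt": "text", ".md": "text",
    ".png": "image", ".jpg": "image", ".jpeg": "image",
    ".wav": "audio", ".mp3": "audio", ".flac": "audio",
    ".csv": "tabular", ".tsv": "tabular",
}


def detect_modality(path):
    """Guess a file's modality from its extension (case-insensitive)."""
    return MODALITY_BY_EXT.get(Path(path).suffix.lower(), "unknown")


def sample_records(records, ratio, seed=0):
    """Keep roughly `ratio` of the records, e.g. 0.8 for 80% sampling."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    k = round(len(records) * ratio)
    # A seeded generator makes the sample reproducible across runs.
    return random.Random(seed).sample(records, k)


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


# Hypothetical input files; nothing is read from disk in this sketch.
files = ["notes.txt", "cat.PNG", "clip.wav", "table.csv", "blob.bin"]
records = [{"path": f, "modality": detect_modality(f)} for f in files]
subset = sample_records(records, 0.8)
print(to_jsonl(subset))
```

With five input files and a ratio of 0.8, four records are kept. Fixing the seed is a design choice that keeps a sampled dataset reproducible between runs, which matters when comparing training experiments.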