Create judge-labeled evaluation datasets for fine-tuning and benchmarking conversational AI models
Snowglobe generates high-signal training data from simulated conversations: judge labels, preference pairs for Direct Preference Optimization (DPO), and critique-and-revise triples for supervised fine-tuning (SFT). Users export clean JSONL that is ready for training, without manual dataset curation.
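To make the export formats concrete, here is a minimal sketch of what a preference pair and a critique-and-revise triple might look like as JSONL records. Every field name here (`type`, `prompt`, `chosen`, `rejected`, `judge_label`, `draft`, `critique`, `revision`) and both helper functions are assumptions made for illustration; this is not Snowglobe's documented export schema.

```python
import json

# Hypothetical preference pair for DPO: one prompt, a preferred response,
# a dispreferred response, and the judge's label. Field names are assumed.
preference_pair = {
    "type": "preference_pair",
    "prompt": "How do I reset my password?",
    "chosen": "Open Settings, choose Security, then select Reset password.",
    "rejected": "I can't help with that.",
    "judge_label": "chosen_is_more_helpful",
}

# Hypothetical critique-and-revise triple for SFT: a draft response,
# a critique of it, and the revised response. Field names are assumed.
critique_revise = {
    "type": "critique_revise",
    "prompt": "How do I reset my password?",
    "draft": "Go to settings.",
    "critique": "Too vague: it does not say which settings page or option.",
    "revision": "Open Settings, choose Security, then select Reset password.",
}


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


jsonl_text = to_jsonl([preference_pair, critique_revise])
records = from_jsonl(jsonl_text)
```

Keeping each record on its own line lets a training pipeline stream the file and filter by a field such as the hypothetical `type` without loading the whole dataset into memory.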

