Generate Better Synthetic Datasets with a "User" LLM
User LLM + Qwen3 to generate fully synthetic dialogues
Most guides on synthetic data start from the assistant’s point of view. You prompt an instruct model with a “persona,” ask for user messages, and let it role-play both sides of a conversation. That’s convenient, but there’s a hidden mismatch: instruct models are fine-tuned and aligned to be helpful assistants, not to behave like users. When you ask them to generate “user” turns, they tend to speak like assistants in disguise: too cooperative, too formal, too on-task. This skews the distribution of user intents, errors, and edge cases, and that skew flows straight into your synthetic dataset.
A better approach is to split responsibilities. Keep a standard instruct model for the assistant side, and introduce a second model that is fine-tuned specifically to act like a user. Think of this “User LLM” as a generator of realistic user goals, constraints, hesitations, and mistakes. It can ask incomplete questions, follow odd preferences, change its mind, and produce the kind of messy inputs assistants see in the wild. Pairing the two models produces richer dialogues and, in turn, more faithful training data for downstream tasks like intent classification, tool-use planning, and multi-turn guidance.
In this article, we’ll run an assistant-tuned instruct model alongside the User LLM recently introduced by Microsoft, and have the two converse to produce dialogue-style synthetic datasets. If you have two GPUs, vLLM makes this straightforward: load both models as separate engines and stream turns between them. Many of us don’t have that headroom, though, so we’ll plan for a single consumer-GPU setup.
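To make the two-GPU version concrete, here is a minimal sketch of the turn-streaming loop (this is not the notebook below). It assumes two vLLM OpenAI-compatible servers, one per GPU; the model names, ports, task prompt, and the role mapping for the User LLM are all illustrative assumptions, so check each model card for its exact chat template before reusing this.

```python
# A minimal sketch of streaming turns between two served models, assuming
# two vLLM OpenAI-compatible servers are already running, one per GPU, e.g.:
#   CUDA_VISIBLE_DEVICES=0 vllm serve microsoft/UserLM-8b --port 8000
#   CUDA_VISIBLE_DEVICES=1 vllm serve Qwen/Qwen3-8B --port 8001
from openai import OpenAI

USER_MODEL = "microsoft/UserLM-8b"  # plays the user (assumed model name)
ASST_MODEL = "Qwen/Qwen3-8B"        # plays the assistant (assumed model name)

user_client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
asst_client = OpenAI(base_url="http://localhost:8001/v1", api_key="EMPTY")

def next_turn(client, model, messages):
    """Request one chat completion from a served model."""
    resp = client.chat.completions.create(
        model=model, messages=messages, temperature=0.8, max_tokens=512
    )
    return resp.choices[0].message.content.strip()

# The User LLM is steered by a task intent in its system prompt (hypothetical
# wording); the assistant sees the same conversation from the opposite side.
task_intent = "You want help planning a cheap weekend trip, but your dates keep shifting."
user_view = [{"role": "system", "content": task_intent}]
asst_view = []
dialogue = []

for _ in range(4):  # number of user/assistant exchange pairs
    # 1) The User LLM emits the next user turn. From its own point of view,
    #    its outputs are "assistant" messages, so we swap roles when logging.
    user_msg = next_turn(user_client, USER_MODEL, user_view)
    user_view.append({"role": "assistant", "content": user_msg})
    asst_view.append({"role": "user", "content": user_msg})
    dialogue.append({"role": "user", "content": user_msg})

    # 2) The assistant replies; its answer arrives as a "user" message
    #    from the User LLM's perspective.
    asst_msg = next_turn(asst_client, ASST_MODEL, asst_view)
    asst_view.append({"role": "user", "content": asst_msg})
    asst_view_reply = {"role": "assistant", "content": asst_msg}
    asst_view[-1:] = [asst_view_reply]  # record the assistant's own turn
    user_view.append({"role": "user", "content": asst_msg})
    dialogue.append({"role": "assistant", "content": asst_msg})

print(dialogue)  # one synthetic dialogue, ready to serialize as JSONL
```

The role swap is the key design choice: each engine sees its own outputs as "assistant" messages and its interlocutor's messages as "user" messages, which keeps both chat templates happy. On a single consumer GPU, the same loop applies with smaller or quantized variants, which is the constraint the notebook below works within.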
Here is my notebook showing how to generate synthetic dialogues with two LLMs:
Each part of the code is explained below.


