A recent interview question from OpenAI:
If an LLM keeps producing excessively verbose answers, how would you correct it?
If an LLM keeps giving overly long answers, I first treat it as a debugging problem: figure out why the model thinks it should be verbose, then fix the easiest causes before touching the training pipeline.
The first thing I check is whether the problem comes from inference settings. High temperature, no stop sequences, a large max-tokens limit, or a system prompt that asks for “thorough explanations” will all push the model to ramble. I also check if the examples in the system prompt or few-shot prompts are long; the model will imitate whatever it sees.
If inference looks fine, I look at the training data. Sometimes the supervised fine-tuning dataset contains mostly long, detailed answers, so the model learns that “good answers = long answers.” Or the reward data used to train the reward model might accidentally prefer longer replies, so the reinforcement learning step pushes the model toward being exhaustive.
To diagnose this, I try simple prompts like “answer briefly in one sentence.” If the model still refuses to be concise, it usually means the alignment process over-rewarded long responses. That’s a sign the issue is inside the training pipeline, not just the prompting.
The easiest fixes start at inference: lower temperature, set a shorter max token limit, add a direct instruction like “respond concisely,” and include one or two short examples in the system prompt. Those alone often solve the issue.
For more on this questions and 24+ questions from Top AI companies on LLMs, check out the full post here