Data mixture optimization for LLM training breaks when the training pools shift, because every prior proxy experiment becomes stale.
Menell] have shown that AI Large Language Models (LLMs) can fail to correctly distinguish between different instruction ...