FastMix: How Tencent Finds Optimal Data Mixture via Gradient Descent
FastMix automates data mixture discovery for LLM training by running gradient descent on a single proxy model, achieving a 550x speedup over RegMix.








