This document introduces Dynamic Fine-Tuning (DFT), a method designed to enhance the generalization of Large Language Models (LLMs) during Supervised Fine-Tuning (SFT). The authors present a mathematical analysis showing that standard SFT gradients implicitly encode a problematic reward structure akin to reinforcement learning (RL), which limits SFT's ability to generalize. DFT addresses this by dynamically re-weighting each token's objective with the probability the model assigns to that token, neutralizing the problematic implicit reward at the cost of only a single-line code change. Extensive experiments on mathematical reasoning benchmarks demonstrate that DFT significantly outperforms standard SFT and competes favorably with more complex RL methods in offline settings, offering a more robust and efficient fine-tuning alternative.
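
To make the "single-line" re-weighting concrete, the sketch below shows one plausible PyTorch implementation of a DFT-style loss: the per-token cross-entropy is scaled by the (detached) probability the model assigns to each target token. This is a minimal illustration under the stated assumptions, not the authors' released code; the function name `dft_loss`, its arguments, and the masking convention are illustrative.

```python
import torch
import torch.nn.functional as F

def dft_loss(logits: torch.Tensor, labels: torch.Tensor, ignore_index: int = -100) -> torch.Tensor:
    """DFT-style objective: token-level cross-entropy re-weighted by the
    model's own probability of each target token (stop-gradient on the weight).

    logits: (batch, seq_len, vocab_size), labels: (batch, seq_len).
    """
    # Per-token negative log-likelihood, no reduction so each token keeps its own loss.
    nll = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=ignore_index,
        reduction="none",
    )
    # Probability of each target token, detached so it acts as a constant
    # weight in the backward pass (the re-weighting step that distinguishes
    # DFT from plain SFT).
    token_prob = torch.exp(-nll).detach()
    # Mask out padded / ignored positions before averaging.
    mask = (labels.view(-1) != ignore_index).float()
    return (token_prob * nll * mask).sum() / mask.sum().clamp(min=1.0)
```

In a standard SFT training loop, swapping the usual mean cross-entropy for a weighted loss of this form is the "single-line change" the summary refers to: everything else (data, optimizer, schedule) stays the same.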