Fine-tuning adapts a pre-trained AI model to a specific task or domain by continuing its training on a smaller, targeted dataset. Instead of training from scratch, which requires massive compute, fine-tuning reuses the model's general capabilities and specializes them, often reaching strong task performance with thousands of examples rather than billions.

Modern AI models are typically trained in two phases. Pre-training is the expensive phase: a foundation model is trained on hundreds of billions to trillions of tokens, largely from the internet, learning general language patterns, world knowledge, and reasoning capabilities. This can cost millions of dollars in compute. Fine-tuning is the affordable phase: the pre-trained model's weights are updated using a small, curated, task-specific dataset, typically thousands to hundreds of thousands of examples. Because the model already has rich representations from pre-training, fine-tuning can achieve strong task performance with far less data and compute than pre-training required.

Fine-tuning is used when you need one of three things: a consistent output format or style (e.g., always responding as a customer service agent in a specific brand voice), domain-specific accuracy (e.g., a medical model trained on clinical notes can outperform a general model on diagnosis coding), or behavior the base model doesn't exhibit (e.g., reliably following a specific structured output schema).

Common fine-tuning techniques include full fine-tuning (updating all weights, which is expensive), LoRA (Low-Rank Adaptation, which trains only small adapter matrices while the original weights stay frozen, and is often cited as 10–100x cheaper), and RLHF (Reinforcement Learning from Human Feedback, used to align model behavior with human preferences).

Fine-tuning is distinct from prompt engineering and RAG (retrieval-augmented generation): it changes the model's weights, so the new behavior persists across every request, while the other techniques shape outputs at inference time without changing the underlying model.
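To make the LoRA idea mentioned above more concrete, here is a minimal sketch in plain Python. It compares how many parameters are trained for a single weight matrix under full fine-tuning versus a low-rank adapter, and shows the adapted forward pass, in which the frozen weight W is combined with a scaled low-rank update B·A. The function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, not values from any particular model or library. Trainable-parameter count is also only one driver of cost, so the ratio it prints should not be read as a measured cost saving.

```python
"""Illustrative sketch: trainable parameters in full fine-tuning vs. LoRA.

All names and sizes here are hypothetical, chosen only for illustration.
"""


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters trained when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters trained by a LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B (A x).

    W stays frozen; only A and B would be updated during training.
    """
    rank = len(A)                      # A has one row per rank dimension
    base = matvec(W, x)                # output of the frozen pre-trained weight
    update = matvec(B, matvec(A, x))   # low-rank correction
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    d, r = 4096, 8  # assumed layer width and adapter rank
    full = full_param_count(d, d)
    lora = lora_param_count(d, d, r)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank {r}):   {lora:,} trainable parameters")
    print(f"reduction factor: {full / lora:.0f}x")

    # Tiny forward-pass demo. With B set to zero, the adapter has no effect,
    # which is how LoRA is commonly initialized so training starts from the
    # unchanged pre-trained behavior.
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]
    B_zero = [[0.0], [0.0]]
    print(lora_forward(W, A, B_zero, [2.0, 3.0], alpha=1.0))
```

Because the adapter is kept separate from W, the same base model can in principle carry several task-specific adapters, which is part of why this approach is attractive when storage or compute is limited.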

What is Fine-Tuning in AI Model Training?