Fine-Tuning for Business Leaders

Getting Started

Dec 17, 2024

Fine-tuning. I used to associate the term with tuning an instrument or even a radio. But I digress; we're here to discuss LLMs. As general use of LLMs has increased, so has curiosity about advanced techniques. So in this month's newsletter, we'll demystify fine-tuning and explain how it can save time and money.

1. What is fine-tuning?

In the simplest terms, fine-tuning adjusts a language model to be great at a specific task. When you use ChatGPT/Copilot/Claude/Gemini, you're using a pre-trained, general-purpose model. Think of these as talented athletes. If the athlete tries any sport, they'll play well owing to their general coordination and fitness. But if the athlete wants to become exceptional at one sport, they'll need dedicated training. The same is true for LLMs.

2. When would I need fine-tuning?

Wharton Professor Ethan Mollick likens LLMs to interns. They are talented but can get things wrong, so you review their work and correct any mistakes. But what if you want a task completed repeatedly without error, with no supervision needed? In that case, you would train the intern to become an expert who doesn't make mistakes. This is the situation where fine-tuning is incredibly valuable.

3. What are some example tasks where fine-tuning is needed?
a) Product Style and Material Classification
  • Task: Tag items in a product catalog with detailed style labels (e.g., “mid-century modern,” “industrial”) so that customers can search and filter with these tags. For large or seasonal product catalogs, this is an intensive undertaking.

  • Skill: Recognizing text and/or visual cues that determine a product's style.

  • Opportunity: Dramatically reduce time required to tag product catalogs from weeks to hours; improve the number of product tags available to customers.

b) Warranty Claim Processing
  • Task: Help process warranty claims by identifying the type of issue (e.g., repair, replacement, refund) and extracting key details like purchase date and product type. This process often requires specialized interpretation of detailed customer inputs and supporting documentation.

  • Skill: Understanding industry-specific language related to product issues.

  • Opportunity: Reduce the effort required to process warranty claims and improve response times for higher customer satisfaction.

c) Classifying Leads
  • Task: Classify leads based on likelihood to convert, using factors like task descriptions, budget, and past interactions.

  • Skill: Analyzing leads to assess key indicators of potential.

  • Opportunity: Streamline the sales motion by identifying valuable leads, improve conversion rates, and optimize resource allocation.

4. How does it work?

Fine-tuning follows 3 basic steps:

  • Prepare example inputs paired with the correct outputs (the training data).

  • Run a fine-tuning job that trains a pre-trained base model on those examples.

  • Evaluate the fine-tuned model and iterate with more examples if accuracy falls short.

It's quite similar to training an intern in that (a) you're providing context and examples to guide the learning and (b) the volume and quality of the examples you provide have a direct impact on performance.
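
To make "examples" concrete, here is a minimal sketch of what training data could look like for the lead-classification task from section 3c, written in the JSON-lines chat format OpenAI's fine-tuning expects. The lead details, labels, and file name are purely hypothetical.

import json

# Hypothetical lead-classification training examples in OpenAI's chat fine-tuning
# format: each line of the .jsonl file is one conversation ending with the label
# we want the model to learn to produce.
examples = [
    {
        "messages": [
            {"role": "system", "content": "Classify this lead as HIGH, MEDIUM, or LOW likelihood to convert."},
            {"role": "user", "content": "Budget: $50k. Requested a demo twice this month."},
            {"role": "assistant", "content": "HIGH"},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Classify this lead as HIGH, MEDIUM, or LOW likelihood to convert."},
            {"role": "user", "content": "No stated budget. Downloaded one whitepaper six months ago."},
            {"role": "assistant", "content": "LOW"},
        ]
    },
]

# Write one JSON object per line; this is the file you upload for fine-tuning.
with open("leads_training.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

In practice you would prepare anywhere from a few dozen to a few hundred such examples and upload the file before training.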

And as in training a human, expertise isn’t necessarily attained after one round of training. Based on performance, you may continue the training with additional examples to further boost accuracy.
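
As a concrete illustration, here is a minimal sketch of launching a fine-tuning job with OpenAI's Python SDK. It assumes the training file has already been uploaded (which is what produces the file ID) and that an API key is set in your environment; the file ID and model name are the example values referenced below.

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Start a fine-tuning job: point OpenAI at the uploaded training file and the
# pre-trained base model you want to specialize.
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="gpt-4o-mini-2024-07-18",
)
print(job.id, job.status)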

In this example, the training data is in “file-abc123” and the pre-trained model is “gpt-4o-mini-2024-07-18”. 

A similar process exists for Anthropic’s Claude models, which are accessible through Amazon's Bedrock service. 

5. Fine-tuning is very expensive, right?

Fine-tuning doesn’t have to break the bank. For just $1.36, we achieved 97% accuracy—up from 60%—by fine-tuning a lead classifier. Here’s how:

  • OpenAI's pricing for fine-tuning GPT-4o mini is $3 per 1M training tokens

  • Our training, over 3 iterations, used 451,920 tokens

  • $3 × 451,920 / 1,000,000 ≈ $1.36

Even if we had needed ten or even a hundred times more tokens, the direct cost would still have been low. A more significant cost is dedicating a skilled AI engineer to the fine-tuning work; that said, this particular project took one of our team members only a few days.

6. How does fine-tuning differ from RAG?

RAG, short for "Retrieval-Augmented Generation," gives your LLM access to a knowledge base it can draw on when answering. So while fine-tuning teaches your LLM a specific skill, RAG gives your LLM up-to-date information to retrieve. Both techniques help your model attain higher accuracy, and they can be used together; a brief sketch of the retrieval-then-generate pattern follows the examples below.

Examples of tasks that can be done with RAG:

  • Inventory check: Retrieving real-time inventory data to support sales queries

  • Customer support: Answering customer queries using up-to-date policy documentation
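
Here is a minimal sketch of that pattern, assuming OpenAI's Python SDK. The retrieve_policy_passages function and the policy text are hypothetical stand-ins; in a real system the retrieval step would query your document store or vector database rather than return hard-coded snippets.

from openai import OpenAI

client = OpenAI()

def retrieve_policy_passages(question: str) -> list[str]:
    # Hypothetical retrieval step: a real system would search a document store
    # or vector database for the passages most relevant to the question.
    return [
        "Returns are accepted within 30 days of purchase with a valid receipt.",
        "Refunds are issued to the original payment method within 5 business days.",
    ]

def answer_with_rag(question: str) -> str:
    # Retrieve supporting text, then ask the model to answer using only that text.
    context = "\n".join(retrieve_policy_passages(question))
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided policy excerpts."},
            {"role": "user", "content": f"Policy excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer_with_rag("How long do I have to return an item?"))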

7. Could my business benefit from fine-tuning?

If your business relies on repetitive, manual processes handled by staff, a fine-tuned LLM can dramatically enhance efficiency by executing these tasks reliably and quickly. Consider these questions as a starting point:

  • Are employees spending significant time on classification or sorting tasks?

  • Are you delaying database tagging due to the manual effort involved?

Start by identifying one repetitive, manual task in your workflow. Fine-tuning could save hours of effort and let your team focus on high-value work.

Get started

Bring our AI Expertise to Your Team

Contact Us
