FRAMEWORK FOR FOCUSED TRAINING OF LANGUAGE MODELS AND TECHNIQUES FOR END-TO-END HYPERTUNING OF THE FRAMEWORK

Inventiv.org

August 18, 2025

Invented by Poorya Zaremoodi, Cong Duy Vu Hoang, Duy Vu, Dai Hoang Tran, Budhaditya Saha, Nagaraj N. Bhat, Thanh Tien Vu, Tuyen Quang Pham, Adam Craig Pocock, Katherine Silverstein, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong (Oracle International Corporation)

Modern chatbots and digital assistants need to understand people in a real way. But teaching computers to understand language is hard. This article explains a new patent application that brings a smart way to train language models for chatbots and other tools. Let’s break down the story, the science behind it, and what makes this invention special.

Background and Market Context

Every day, more people use instant messaging and chat platforms for help and answers. Companies want to help customers quickly, but hiring lots of support people is expensive. That’s why chatbots—computer programs that can chat with people—are popular. The best chatbots feel natural, like talking to a real person. They need to understand what people ask, even if the words are new or tricky.

But making great chatbots is hard. Developers have to pick the right data, clean it, and choose the right computer models. They often have to try out many models, train them, test them, and see what works best. Training these models from scratch takes lots of time and money. Even when using models other people have already trained, it’s tough to get them to work well for new jobs, like answering health questions or helping with banking.

This is where pre-trained language models come in. These are big computer programs, like BERT or GPT, that have learned a lot about language by reading books or websites. But even these smart models can struggle with special topics, like legal terms or medical words. That’s why many companies want to “fine-tune” these models—teaching them more with new data so they work better for their own needs.

But fine-tuning isn’t simple. If you aren’t careful, the model can get confused or forget things it once knew. Sometimes, two models trained the same way will give very different answers. This makes it hard for companies to trust their chatbot’s answers. The patent application we’re exploring aims to solve this by introducing a smarter way to fine-tune language models, making the process stable, repeatable, and easier for everyone.

Scientific Rationale and Prior Art

To understand the new patent application, it helps to know how training language models usually works. Most language models start by learning from huge collections of text. For example, BERT is trained to guess missing words in sentences taken from sources such as Wikipedia and a large collection of books. This is called pre-training. After pre-training, developers teach the model to do something specific, like find names in sentences or decide if a review is positive or negative. This is called fine-tuning.

Fine-tuning has problems. Sometimes, changing the model for a new job makes it forget what it knew before, a problem often called “catastrophic forgetting.” Or two models trained the same way, differing only in their random starting point, will perform very differently. This is called “instability.” Another problem: general language models often don’t do well on special tasks unless they see data from that special area.
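One simple way to see instability is to repeat the same fine-tuning run several times, changing only the random seed, and look at how far the scores spread. The sketch below is only an illustration: the function name stability_report and the example accuracies are made up for this article and do not come from the patent application.

```python
from statistics import mean, pstdev


def stability_report(scores):
    """Summarize how much a metric varies across repeated fine-tuning runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to measure spread")
    return {
        "mean": mean(scores),
        "std": pstdev(scores),               # population standard deviation
        "range": max(scores) - min(scores),  # gap between best and worst run
    }


# Hypothetical accuracies from five runs that differ only in random seed.
runs = [0.90, 0.80, 0.85, 0.75, 0.95]
report = stability_report(runs)
```

A large "range" or "std" compared with the improvement you are hoping for is a sign that a single run cannot be trusted on its own.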

To help with this, scientists have tried a few things:

Transfer learning: Take a model trained for one job and teach it a related job. This saves time but doesn’t always give great results for special topics.
Adapters: Add small pieces to the model for each new job, so you don’t have to change the whole thing. This saves memory but can be tricky to manage.
Multi-task learning: Train the model to do several jobs at once. This helps it learn more general knowledge but can make training slow and hard to set up.
Hyperparameter tuning: Adjust “knobs” that control how the model learns, like how fast it changes with new data. But finding the best settings is time-consuming and often done by trial and error.

Before this patent, developers had to mix and match these ideas on their own. There wasn’t a clear, repeatable, end-to-end method to make sure the model learned the right things and worked well for new tasks. Companies needed a way to make their chatbots smarter, faster, and more reliable—without hiring teams of experts to guess which settings to use or which extra training jobs would help.

Invention Description and Key Innovations

This patent application introduces a new method for training language models so they’re focused and optimized for a company’s needs. The heart of the idea is a two-step process: focused training and end-to-end hypertuning.

Focused Training means taking a general language model and teaching it to do better in a special area or job. This is done in two ways:

Adaptive focusing: The model is trained further using text from the target area (like medical articles if you want a medical chatbot). Even if this text isn’t labeled, the model learns to “speak the language” of that area.
Behavioral focusing: The model is trained on extra tasks related to the main job (like teaching it to spot important words or moods). These extra jobs are called “auxiliary tasks.”
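To make adaptive focusing more concrete, here is a minimal sketch of how unlabeled text can become training examples. It assumes the "guess the missing word" objective described earlier for BERT; the article does not say which objective the application actually uses. The function mask_tokens, the [MASK] symbol, and the sample sentence are illustrative, and real BERT-style masking is more involved than this.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and return (masked_tokens, labels).

    labels[i] holds the original token wherever a mask was applied,
    and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # the model must guess this word
            labels.append(tok)    # the right answer comes from the text itself
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Unlabeled in-domain text (a made-up medical sentence).
sentence = "the patient was prescribed a beta blocker for hypertension".split()
masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=1)
```

No human labels are needed: the hidden words are the answers, which is why plain text from the target area is enough for this step.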

What’s unique is how these steps are combined. You can do them one after the other (sequential) or at the same time (parallel). In parallel, the model learns from both the new area and the extra tasks at once. The aim is to help it learn better and faster.
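The difference between the two orderings can be sketched as two training schedules. This is an assumption-laden illustration: it treats "parallel" as alternating between domain batches and auxiliary-task batches, which is only one possible reading (another would be adding the two losses together in each step). The function names and batch labels are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # marks the shorter list running out of batches


def sequential_schedule(domain_batches, aux_batches):
    """Adaptive focusing first, then behavioral focusing."""
    return ([("domain", b) for b in domain_batches]
            + [("aux", b) for b in aux_batches])


def parallel_schedule(domain_batches, aux_batches):
    """Alternate the two kinds of batches so both are learned together."""
    schedule = []
    for d, a in zip_longest(domain_batches, aux_batches, fillvalue=_MISSING):
        if d is not _MISSING:
            schedule.append(("domain", d))
        if a is not _MISSING:
            schedule.append(("aux", a))
    return schedule


seq = sequential_schedule(["d1", "d2"], ["a1"])
par = parallel_schedule(["d1", "d2"], ["a1"])
```

A training loop would walk through the schedule and apply the matching objective to each batch.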

End-to-End Hypertuning is the next key innovation. Hypertuning means picking the best “settings” (hyperparameters) and the best extra tasks for training. Instead of guessing, the system tries different combinations in an organized way. It checks how well the model does on the main job after each try, then uses this feedback to pick better settings. This process repeats until further tries stop bringing improvement. It’s a bit like baking cookies and tweaking the recipe each time until you get the tastiest batch.
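The feedback loop can be sketched in a few lines. This is not the search method from the patent application, which the article does not spell out; it is a simple "keep the change only if the score improves" search used as a stand-in. The task names, the learning-rate list, and toy_score (which replaces real training and evaluation with a made-up formula) are all assumptions for illustration.

```python
import random

AUX_TASKS = ["keyword_tagging", "sentiment", "topic"]  # hypothetical names
LEARNING_RATES = [1e-5, 3e-5, 5e-5, 1e-4]


def toy_score(config):
    """Stand-in for 'train, then evaluate on the main job'. Not a real metric."""
    score = 0.70
    score += 0.05 * ("keyword_tagging" in config["aux_tasks"])
    score += 0.03 * ("sentiment" in config["aux_tasks"])
    score -= 0.02 * ("topic" in config["aux_tasks"])
    score -= abs(config["lr"] - 3e-5) * 1000
    return score


def mutate(config, rng):
    """Propose a nearby recipe: toggle one extra task or change the learning rate."""
    new = {"aux_tasks": set(config["aux_tasks"]), "lr": config["lr"]}
    if rng.random() < 0.5:
        new["aux_tasks"] ^= {rng.choice(AUX_TASKS)}  # add or remove one task
    else:
        new["lr"] = rng.choice(LEARNING_RATES)
    return new


def hypertune(evaluate, rounds=50, seed=0):
    """Try candidates one at a time and keep only the ones that score higher."""
    rng = random.Random(seed)
    best = {"aux_tasks": set(), "lr": rng.choice(LEARNING_RATES)}
    best_score = evaluate(best)
    for _ in range(rounds):
        candidate = mutate(best, rng)
        score = evaluate(candidate)
        if score > best_score:  # feedback step
            best, best_score = candidate, score
    return best, best_score


best_config, best_score = hypertune(toy_score, rounds=100, seed=0)
```

In a real system, evaluate would fine-tune the model with the candidate recipe and measure it on held-out data for the main job, which is far more expensive than the toy formula here.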

The method lets you:

Pick the best extra training tasks (auxiliary tasks) for your main goal.
Set the best learning “knobs” (hyperparameters) for your data and tasks.
Do all this automatically, with feedback guiding each step.

The patent application also includes smart techniques for adapters. These are small add-ons that let the main model stay the same while learning new jobs. The system can create, train, and combine adapters for different tasks. When it’s time to focus on the main job, the system can freeze the original model’s weights and only adjust the new adapter weights. This makes training faster and saves memory.
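Here is a toy picture of an adapter sitting on top of a frozen layer. It follows the common "bottleneck" adapter design (squeeze the signal down, expand it back, and add it to the original), which may differ from the design in the patent application. The class names, sizes, and starting values are assumptions for illustration.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class FrozenLayer:
    """A layer of the pre-trained model; its weights are never updated."""
    trainable = False

    def __init__(self, weights):
        self.weights = weights

    def __call__(self, x):
        return matvec(self.weights, x)

    def num_params(self):
        return sum(len(row) for row in self.weights)


class Adapter:
    """Small add-on: output = h + up(relu(down(h)))."""
    trainable = True

    def __init__(self, hidden, bottleneck):
        self.down = [[0.01] * hidden for _ in range(bottleneck)]
        # Zeros here mean the adapter changes nothing until it is trained.
        self.up = [[0.0] * bottleneck for _ in range(hidden)]

    def __call__(self, h):
        squeezed = [max(0.0, v) for v in matvec(self.down, h)]  # ReLU
        delta = matvec(self.up, squeezed)
        return [x + d for x, d in zip(h, delta)]

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


def trainable_params(modules):
    """Count only the weights that training is allowed to change."""
    return sum(m.num_params() for m in modules if m.trainable)


hidden = 8
base = FrozenLayer([[1.0 if i == j else 0.0 for j in range(hidden)]
                    for i in range(hidden)])
adapter = Adapter(hidden, bottleneck=2)
x = [0.5] * hidden
y = adapter(base(x))
```

Only the adapter’s weights are counted as trainable, so each new job adds a small set of weights instead of a full copy of the model.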

In practice, the process looks like this:

Start with a pre-trained language model (like BERT).
Gather unlabeled text from your special area (like legal emails).
Optionally, collect labeled data for related extra tasks (like identifying keywords or detecting positive/negative mood).
Set up the training system to try different combinations of extra tasks and hyperparameter settings.
Train the model, check how well it does on your main job, and use that feedback to improve the choices for the next round.
Repeat until you find the best combination. The result is a focused, optimized model ready for real-world use.

This approach is formalized in the patent’s claims. The method can be used for chatbots, digital assistants, or any system that needs to understand language well. It works with different types of models and data, and can be run on cloud servers or local machines.

The biggest benefits are:

Better accuracy, because the model learns exactly what’s needed for your business.
Less guesswork, since the system finds the best settings automatically.
Faster development, because you don’t need to start from scratch or waste time on trial and error.
More reliable chatbots and assistants, able to handle special topics and tricky language.

Conclusion

This patent application offers a new, smart way to create language models that really work for your needs. By combining focused training and automatic hypertuning, it aims to take the guesswork out of building chatbots and digital assistants. Companies can spend less time and money and get better results. This method could change how we train and use language models, making them smarter and more helpful.

To read the full application, visit https://ppubs.uspto.gov/pubwebapp/ and search for publication number 20250218428.
