Public vs private LLMs: The pros and cons
Senior Manager of Automatic Speech Recognition.
Today, most of us are used to the idea of public consumer-grade large language models (LLMs) roaming the vast reaches of the internet, training their algorithms on massive troves of stories, books, Wikipedia pages, and other information.
It’s an incredibly valuable resource—but here’s the problem: Not all of that information is free for the taking.
If you can’t walk out of a bookstore with Sarah Silverman’s book without paying for it, is ChatGPT allowed to flick through its pages online and take what it wants without asking?
And if the answer is no, then maybe that also applies to a pretty large swathe of original content across the internet.
Almost 8,000 writers, including Nora Roberts, Viet Thanh Nguyen, Michael Chabon and Margaret Atwood, recently petitioned OpenAI, Meta and other AI companies to stop using their work without permission.
Their agents and publishers are getting involved as well, as they look into amending writers' contracts to include language that will prohibit unauthorized use of works by LLMs.
Things are also heating up in Hollywood. A key sticking point among the actors and artists who have gone on strike is the concern that AI is going to copy and then replace their work. Union leader Fran Drescher said that AI poses an existential threat to the creative profession.
And it’s not just the authors and actors who are getting upset with the big, data-hungry LLMs.
All of this noise, along with growing concerns about privacy, is waking up regulators, too.
A growing number of governments around the world are starting to ask questions, with Europe taking a particularly hard look at reining in public LLMs.
The honeymoon seems to be over for public LLMs. The days of running free and largely unnoticed across the internet may be coming to an end as content creators and governments alike take notice and push back.
But not all LLMs are coming under fire—because not all LLMs are the same.
The potential of private LLMs
While public LLMs like ChatGPT draw their immense power from scouring the internet for information, there are so-called private LLMs that operate very differently.
Designed to work on specific data within clearly defined boundaries, these AI models offer a host of advantages over their public cousins—starting with privacy.
Financial information firm Bloomberg provided a perfect example of how this works when it recently built an in-house LLM from scratch.
Called BloombergGPT, the model was trained on a specific set of financial data. This avoided any friction with outside content creators and, so far, hasn’t raised any alarms with regulators.
Finance isn’t the only industry where a private LLM can provide the benefits of generative AI without many of the messy side effects.
Consider healthcare and hospitals. Or even more specifically, consider what a private LLM could do with notes from your doctor.
Every year, a doctor can write thousands of notes about symptoms, drugs, and dosages. Now, imagine having an LLM transform all of that scribbled medical jargon into clear, accessible data that could be mined for insights.
A public LLM couldn’t be unleashed on this patient information any more than it could be allowed to run riot through data from banks, insurance companies, and other organizations that zealously protect customer privacy.
But a private LLM could. Limited to a very specific dataset and designed to respect the strictest privacy regulations, it could harness the enormous power of generative AI to sift through millions of doctors’ notes, extracting valuable information that can then be immediately analyzed for patterns and problems.
Whether we’re talking about finance, healthcare, or any other risk-averse industry or organization, a private LLM can deliver the power of AI while soothing the rising number of concerns about privacy.
And the advantages of a private LLM don’t stop at privacy. There are other benefits too:
Higher accuracy
Private LLMs trained on very specific, carefully vetted datasets are more likely to produce very specific, carefully vetted solutions.
This means a doctor using a private LLM to research bedwetting probably isn’t going to receive Sarah Silverman’s take on the subject.
Private LLMs trained on a defined set of data aren't sullied by dubious information scraped from the dark corners of the internet—meaning they can be designed to provide relevant, factual data that adds clear value.
Fewer made-up “facts”
One of the more interesting developments around the accuracy of public LLMs is that they occasionally make stuff up.
Instead of simply saying, “I don’t know,” LLMs have been known to invent facts out of thin air.
For example, when an early version of ChatGPT was asked, “What’s heavier, a pencil or a toaster?” it said a pencil. Elsewhere, ChatGPT has been caught fabricating an impressive list of citations and footnotes for research papers that don’t actually exist.
A private or optimized LLM could drastically reduce these occurrences.
More personalization
Another advantage of private LLMs is their ability to tailor replies to the unique needs of their customers.
In Bloomberg’s case, for example, users of their in-house LLM have access to an AI that is fluent in the complex language of the finance industry. So a trader who asks about their P/E is probably going to get the reply they want about price-to-earnings ratios and not about gym class.
This ability to personalize a user’s experience goes beyond word choice. A customized LLM trained on domain-specific data can also be designed to communicate in any style or tone that the user wishes.
For doctors, this could mean an LLM that provides objective text that is stripped of any superfluous adjectives.
For a contact center interacting with thousands of consumers, this could mean an LLM that can understand the nuances of slang and profanity.
The range of possibilities for private LLMs is as wide as the needs of the people using them.
Greater speed and reliability
If you’ve used a consumer-grade LLM before, there’s a good chance you’ve experienced an outage or seen it slow to a crawl.
This isn’t the end of the world if you’re just having fun turning dialogue from Harry Potter into something that sounds like it was written by Hemingway.
But if you’re running a business, outages are going to be a problem.
Outsourcing your generative AI to a public LLM can create a fog of uncertainty when it comes to bugs and bandwidth, whereas a dedicated private LLM can give you dependable uptime and ample peace of mind.
More updates
There’s another element of speed that comes into play when discussing public and private LLMs.
Many months can pass before a large public LLM is updated to a newer version. And this delay isn’t all due to training time.
Providers of public LLMs need to ensure that an update hasn't been polluted with bias or other unsavory influences during a model’s foray across the internet. With customers and regulators increasingly sensitive to bias, companies like OpenAI and Google can spend a lot of time kicking the tires before releasing a new version to the public.
Not so with a private LLM. A clean dataset can accelerate the pathway to a clean bill of health.
Less harmful for the environment
Quicker searches and faster updates trigger other benefits as well. Because less time means less energy—and that’s good for both the bottom line and the environment.
One study suggested ChatGPT uses the equivalent of a 16.9-ounce (roughly 500 ml) bottle of water for every 20 to 50 questions it answers.
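To put that figure in per-question terms, here is a quick back-of-the-envelope sketch. It assumes "ounce" means a US fluid ounce (about 29.57 ml), and the function name is just an illustrative helper, not something taken from the study.

```python
# Back-of-the-envelope arithmetic for the water-use estimate quoted above.
# Assumption: "ounce" means US fluid ounce (about 29.5735 ml).
ML_PER_US_FL_OZ = 29.5735


def water_per_question_ml(bottle_oz: float, questions: int) -> float:
    """Return the millilitres of water attributed to a single question."""
    if questions <= 0:
        raise ValueError("questions must be positive")
    return bottle_oz * ML_PER_US_FL_OZ / questions


bottle_oz = 16.9  # bottle size quoted in the article

# Fewer questions per bottle means more water per question, and vice versa.
high = water_per_question_ml(bottle_oz, 20)
low = water_per_question_ml(bottle_oz, 50)

print(f"{low:.1f} to {high:.1f} ml per question")
```

In other words, the estimate works out to somewhere between about 10 and 25 ml of water per question, which adds up quickly at the scale of millions of queries.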
Will your business use AI powered by public or private LLMs?
Like any transformative technology, generative AI is going to kick up a lot of dust as people sort through its many opportunities and challenges.
For businesses that value security and explainability, there’s good news in the shape of private LLMs that can deliver many of the benefits with potentially fewer problems.
See how Dialpad Ai works
From transcribing calls in real time, to providing real-time assists to agents and even helping with QA scoring, Dialpad Ai is powering a variety of features that are designed to help businesses have better conversations, both with customers and internal team members. Book a demo with our team, or take a self-guided interactive tour of the app first!