For artificial intelligence (AI) models to return meaningful answers, they have to be trained on vast amounts of data. Training GPT-3, the predecessor to ChatGPT, took nearly 45 terabytes of text data, around two-thirds of it drawn from a crawl of the internet.1 2
Some companies are reportedly working on models with more than 1.6trn parameters—the internal variables that a model uses to make predictions, akin to ingredients in a recipe.
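To make “parameters” concrete, the toy sketch below (our own illustration, using PyTorch) builds a two-layer network and counts its trainable weights. A frontier LLM is the same idea scaled up by roughly eight orders of magnitude.

```python
# Illustrative only: parameters are the learned weights a model adjusts
# during training. Even this tiny two-layer network has thousands.
import torch.nn as nn

tiny_model = nn.Sequential(
    nn.Linear(128, 64),  # 128*64 weights + 64 biases = 8,256 parameters
    nn.ReLU(),           # activation layers add no parameters
    nn.Linear(64, 10),   # 64*10 weights + 10 biases = 650 parameters
)

n_params = sum(p.numel() for p in tiny_model.parameters())
print(f"{n_params:,} trainable parameters")  # 8,906
```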
That comes at a cost. The size of large language models (LLMs)—AI systems trained on large amounts of data to understand, process and generate human language—means they require vast computational power, storage and energy to operate.
They are also expensive to develop and deploy, limiting how many organisations can afford to use them. Information technology already accounts for 9% of global energy costs, equivalent to around US$300bn in 2023. This cost has increased by up to 60% in the past decade.3
Is bigger necessarily more powerful? The story of human intelligence suggests otherwise. Children are exposed to around 100m words by the age of 13 and outperform chatbots at language, despite seeing only about 0.01% of the data used to train an LLM.4
Some AI experts are optimistic that small language models (SLMs), trained on smaller amounts of high-quality data, will be cheaper, more accessible and of even higher quality.
Tech companies have been racing to build ever larger and more expensive LLMs whose text-generating prowess allows them to follow instructions, hold conversations, write software code and more.
That gives them very broad applications: as chatbots and virtual assistants to help customers, for example, or as fraud-detection agents. In healthcare, LLMs are being used to improve patient diagnosis and create more personalised treatment options.
In education, they are creating virtual learning environments, providing detailed answers to queries, and generating practice problems for revision purposes.
In entertainment, LLMs are used to generate personalised recommendations for films, news articles and songs. They can identify user trends and behaviours, helping media organisations to create targeted advertising.
But SLMs are opening up a new frontier in the AI race. Like LLMs, they can tackle a wide range of complex problems, but because they are trained on smaller amounts of specialised data they can be fine-tuned for specific tasks and are far less resource-intensive.
Experts believe this could make them cheaper to run, more accurate, and more appropriate for tasks that require less computing power and a lower level of specialised knowledge or expertise.
SLMs also boast an efficiency advantage. LLMs use trillions of parameters, requiring immense computing power, and run on dedicated servers and hardware that demand hundreds of gigabytes of storage.
SLMs use significantly fewer parameters and need fewer resources. A typical SLM needs only around 1-2 GB of storage space. Startups and tech giants alike are now betting on these small but powerful models.
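The 1-2 GB figure follows from simple arithmetic: a model’s storage footprint is roughly its parameter count multiplied by the bytes used to store each parameter. The sketch below works through the numbers cited in this article; the precision choices are our own illustrative assumptions, not vendor specifications.

```python
# Back-of-the-envelope storage estimates: parameters x bytes per parameter.
def model_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate on-disk size in gigabytes."""
    return n_params * bytes_per_param / 1e9

# A 3.8bn-parameter SLM quantised to 4 bits (0.5 bytes) per parameter:
print(f"SLM, 4-bit:  {model_size_gb(3.8e9, 0.5):.1f} GB")    # ~1.9 GB
# The same SLM at 16-bit (2-byte) precision:
print(f"SLM, 16-bit: {model_size_gb(3.8e9, 2.0):.1f} GB")    # ~7.6 GB
# A 1.76trn-parameter LLM, even aggressively quantised to 4 bits:
print(f"LLM, 4-bit:  {model_size_gb(1.76e12, 0.5):,.0f} GB") # ~880 GB
```

Even with aggressive compression, the largest models still demand hundreds of gigabytes, while a small model fits comfortably on a phone.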
Microsoft’s family of Phi SLMs aims to replicate the performance of LLMs by relying on high-quality data, rather than on huge amounts of data mined from the internet. The result is a compact model that packs a punch, executing complex calculations and tasks.
In April, the company debuted Phi-3-mini, the first and smallest model in its Phi-3 family of SLMs, to allow companies on smaller budgets to experiment with the technology.
The model is small enough to run on smartphones and laptops, using only 3.8bn parameters compared with an estimated 1.76trn for GPT-4, yet it performs better than rivals twice its size, according to Microsoft.
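As an indication of how accessible such a model is, the sketch below loads it with the Hugging Face transformers library and generates a short completion. The checkpoint name and settings are our assumptions based on Microsoft’s public release; consult the published model card before relying on them.

```python
# Minimal sketch: running a small language model locally with the
# Hugging Face transformers library (assumes torch is installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Explain in one sentence why small language models are cheaper to run."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```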
Given their size, SLMs can be used on devices such as smartphones and sensors that are not connected to the cloud. Tech companies hope this could democratise access to AI.
Whether offering agricultural advice, decoding medical data for speedier diagnoses or performing translation and educational transcriptions, such models are poised to connect global communities—from farms to hospitals to classrooms.
They may also help large organisations such as financial institutions to optimise operations and manage risks more effectively. With banks facing a host of new compliance requirements, SLMs can support data analysis, interpret complex regulatory language and automate compliance processes.
LLMs are growing bigger and more powerful. Increasingly sophisticated models are pushing boundaries by performing tasks they have not been trained to do, a phenomenon that could pose security and ethical risks. Nonetheless, they are better suited than their smaller counterparts to difficult jobs that require significant computing power, complex reasoning and data analysis.
SLMs meet a different need, helping organisations to automate simpler jobs for a fraction of the price. OpenAI’s GPT-4 and Google’s Gemini Ultra are estimated to have cost around US$78m and US$191m, respectively, to train, but SLMs can be up to 300 times cheaper.
Moreover, if AI chips become cheap enough in the near future, SLMs’ ability to bypass the cloud could enable them to carry out specific tasks at the “edge”, directly on a user’s smartphone or laptop.
This could greatly reduce the risk of security breaches, tackling one of the main obstacles for generative AI adoption in sensitive sectors, which face tight regulations on handling and transferring data.
The shift to SLMs represents a significant development in the evolution of AI, placing the sophistication and convenience of AI-driven applications into our pockets.