
From Minutes to Tokens: How Operators Can Win in the Age of AI

Prof. Merouane Debbah
January 16, 2026
9 min read

For most of the past decades, telecom operators had a simple story: you paid for minutes and messages, and they carried your voice across oceans and your SMS across town. The business model was clear, the margins were healthy, and the network quietly did its job in the background.

Then the icons on the home screen changed everything. WhatsApp, Skype, Messenger, Telegram, Zoom: none of these companies built antennas or laid fiber. They simply rode on top of the networks operators had spent decades and billions building. Yet they ate the very services that once paid for those networks: voice calls, SMS, even long-distance traffic. Studies over the last decade all tell the same story: a sharp decline in voice, SMS and long-distance revenues, and a huge shift of communication to “over-the-top” apps that don’t pay operators directly for the networks they ride on.

Operators responded by selling data. We moved from “minutes and messages” to “2 GB, 10 GB, unlimited.” Data revenues grew, but mainly to compensate for what had been lost on traditional services. The investment curve, especially with 4G and 5G, kept going up, while average revenue per user stubbornly refused to follow. In many markets, operators began to feel more like electricity or water companies: essential utilities that carry everyone else’s value while capturing less and less of it themselves.

Now another wave is arriving with the same force as the smartphone revolution: generative AI. Once again, there is a risk that operators simply provide the connectivity while new players, cloud providers and AI platforms, capture most of the value. But this time, things can be different. AI gives operators a chance to turn their networks from passive pipes into active engines of intelligence, and in the process, to reinvent their business model.

At the heart of that opportunity are two ideas that sound technical but are actually very human: latency and tokens.

Latency: the invisible product

Most people don’t talk about “latency.” They only feel it. You feel it when a video call lags just enough that people talk over each other. You feel it when a cloud game stutters at the worst possible moment. You feel it when you ask an AI assistant a question and it pauses just long enough to break the illusion of a natural conversation. Latency is simply the delay between asking for something and getting a response. For years, operators have focused on their part of that delay: the communication link. They improved radio technologies, deployed fiber, and optimized protocols like WebRTC, the engine behind many real-time voice and video applications in the browser, to squeeze delays down to a fraction of a second.

In the age of AI, however, latency is no longer just about the network. Every time you speak to an AI model, two things happen. First, your voice or text must travel over the network. Second, the model must “think”: a computation called inference that typically runs on GPUs in a data center. Even if the radio link is incredibly fast, if your request travels thousands of kilometers to a busy cloud and back, the end-to-end experience will still feel slow.

This is where operators have a unique advantage. They already own a distributed infrastructure that sits physically close to users: base stations, local exchanges, regional data centers, central offices. These are ideal locations to host edge computing: small clusters of servers equipped with GPUs or AI accelerators, just a few milliseconds away from the device. Industry analyses show that many new 5G and edge computing use cases, from industrial control to immersive media, require latencies below about 10 milliseconds to feel truly “real time”. If operators can bring AI inference into their networks and combine it with optimized real-time paths, the same technology that made video calls fluid, they can start to control both sides of the latency equation: the link and the compute. Instead of simply carrying traffic to distant clouds, the network itself becomes a place where AI runs.
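A back-of-the-envelope sketch makes the point concrete. All figures below are illustrative assumptions, not measurements: the perceived delay is simply the round trip on the link plus the model’s inference time, so moving the compute to the edge shrinks the only term the operator fully controls.

```python
# Illustrative end-to-end latency budget for an AI voice request.
# All numbers are hypothetical assumptions chosen for the sketch.

def end_to_end_ms(network_rtt_ms: float, inference_ms: float) -> float:
    """Total perceived delay: round trip on the link plus model 'thinking' time."""
    return network_rtt_ms + inference_ms

# Distant hyperscale cloud: the long haul adds tens of milliseconds each way.
cloud = end_to_end_ms(network_rtt_ms=120.0, inference_ms=80.0)

# Edge site a few milliseconds from the device, running the same model.
edge = end_to_end_ms(network_rtt_ms=8.0, inference_ms=80.0)

print(f"cloud: {cloud:.0f} ms, edge: {edge:.0f} ms")
```

With identical inference time, the edge path lands well under half the cloud figure, close to the budget where a conversation still feels natural.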

When the network starts to think

This shift is not happening in isolation. Around the world, the mobile industry is working on ways to make networks programmable and easier for developers to use. The GSMA’s Open Gateway initiative, for example, defines a common set of network APIs so that applications can request capabilities like device location, communication quality, fraud checks or age verification in a standardized way, independent of the operator. More than 70 operator groups, representing roughly 80% of mobile connections globally, have already committed to this framework.

Combine this with edge computing and a pattern appears. An application could request a low-latency path for audio, video or sensor data, and at the same time ask the network to run an AI model nearby to analyze that data in real time. That model might detect anomalies in a factory, analyze a live video feed for safety risks, optimize traffic flows in a city, or power an interactive assistant in a retail store. Recent examples from private 5G deployments in manufacturing already show this trend: operators and partners are using edge AI to process data on the factory floor, enabling real-time analytics, automation and predictive maintenance that would be impossible with a distant cloud alone.

The network, in other words, starts to think. And as soon as it thinks, it can be priced very differently.

From gigabytes to “Intelligence Bundles”

For decades, the commercial language of telecom was minutes and messages. Then it became gigabytes. In the AI era, a new unit quietly becomes central: the token. In simple terms, a token is a small chunk of text or audio that an AI model reads or writes. When you interact with a large language model, your usage is often measured in tokens, not in minutes or megabytes. Each token processed by a model consumes a small amount of compute, energy and time.

This makes tokens very interesting from a business point of view. They are directly linked to the cost of intelligence. Operators are already experts at metering usage: they know how to count seconds of talk time and megabytes of data. Extending this to tokens is a natural step if they run AI inference in their networks. Imagine a consumer offer that doesn’t just say “30 GB of data per month”, but “30 GB of data and two million AI tokens per month.” The data allows you to browse, stream and chat as usual. The tokens allow you to talk to AI assistants, summarize documents, translate video calls in real time, generate images or get help with coding, all without having to think about a separate AI bill.
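Such a bundle could be metered with the same machinery operators already use for minutes and megabytes. A minimal sketch, with an invented plan shape and hypothetical numbers, might look like this:

```python
# Sketch of a consumer "intelligence bundle": a data cap in GB plus an AI
# token allowance, metered the way operators already meter usage today.
# The class name, fields and figures are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class IntelligenceBundle:
    data_gb: float
    token_allowance: int
    tokens_used: int = 0

    def consume_tokens(self, n: int) -> int:
        """Record n tokens of AI usage; return how many fell outside the allowance."""
        overage_before = max(0, self.tokens_used - self.token_allowance)
        self.tokens_used += n
        overage_after = max(0, self.tokens_used - self.token_allowance)
        return overage_after - overage_before

plan = IntelligenceBundle(data_gb=30.0, token_allowance=2_000_000)
plan.consume_tokens(1_500_000)        # well within the monthly allowance
extra = plan.consume_tokens(700_000)  # this call crosses the cap
print(extra)                          # tokens billable as overage
```

The overage logic mirrors how out-of-bundle SMS or roaming has long been charged: everything inside the allowance is prepaid, and only the excess generates a new line on the bill.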

For companies, the same idea scales up. A factory might pay for a private 5G network, guaranteed bandwidth, and a monthly allowance of AI tokens consumed by machine-vision systems, anomaly detection, predictive maintenance and digital twins, all running at the edge of the operator’s network. When the allowance is exceeded, it can buy more tokens, just as companies once bought more SMS or roaming packages. Voice, data and intelligence become part of a single bundle. The operator no longer sells just connectivity, but a combined service.

Tokens as the new oil

In the broader AI debate, some people now say that compute is the new oil. The idea, associated with Sam Altman and others, is that the real bottleneck for AI progress is no longer algorithms or data but the availability of computational power: data centers, GPUs, energy. Whoever controls large-scale compute has enormous influence over the future digital economy.

For operators, this metaphor can be taken quite literally. They already operate critical infrastructure at scale, in highly regulated environments, with strong capabilities in billing, metering and quality of service. If they extend that infrastructure from pure connectivity to AI compute at the edge, tokens become their new oil.

Each token processed in their network is a tiny unit of value: a short burst of intelligence executed close to the user. Tokens can be priced differently depending on the required latency, reliability, privacy guarantees or geographic location. They can be sold directly to consumers and enterprises, or wholesale to app providers and cloud platforms. Just as operators once sold capacity for long-distance calls or SMS termination, they could sell AI tokens to third-party services that want guaranteed performance in certain regions or for certain applications.
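Differentiated pricing of this kind is easy to sketch. The tier names and prices below are invented purely for illustration: the point is that the same token can carry a different price when it must be served with tighter latency or stronger locality guarantees.

```python
# Sketch of differentiated token pricing. Tiers and prices (per million
# tokens) are hypothetical assumptions, not real operator tariffs.

PRICE_PER_MILLION = {
    "best_effort":      1.00,  # batched inference, served from any region
    "low_latency_edge": 4.00,  # served from an edge site near the user
    "sovereign_edge":   6.00,  # low latency AND data stays in-country
}

def token_charge(tokens: int, tier: str) -> float:
    """Price a batch of tokens under the chosen service tier."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[tier]

print(token_charge(5_000_000, "best_effort"))     # cheapest wholesale tier
print(token_charge(5_000_000, "sovereign_edge"))  # premium for guarantees
```

A wholesale customer, say an app provider wanting guaranteed performance in one country, would simply buy volume in the tier matching its requirements, exactly as long-distance capacity and SMS termination were once sold.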

In this model, operators are not trying to become yet another AI app. They are becoming the platform on which many AI apps run, in the same way cloud providers became the underlying platform for the web. The difference is that operators can offer something clouds alone cannot easily replicate: ultra-low latency, local regulation, and deep integration with the connectivity layer.

A choice, once again

None of this is guaranteed. It requires investment in edge computing, experimentation with new commercial models, and a cultural shift from being a “pipe” to being a “platform.” It also requires collaboration: with cloud providers, with AI companies, with industrial partners, and with regulators who will want to understand how these new services fit into existing rules.

If operators do nothing, AI workloads will simply run in distant hyperscale data centers, and history will repeat itself. The network will be essential but invisible, valuable but under-rewarded, while most of the profits and innovation sit at the application layer.

If, however, operators embrace edge inference, expose their networks through open APIs, and start selling bundles that combine voice, data and tokens, the story can be different. The infrastructures they built for the voice era and upgraded for the smartphone era can become the foundations of an intelligence era, where every cell site and local exchange is not just a relay, but a small brain.

We began with minutes and messages. We migrated to gigabytes. In the age of AI, the operators who thrive will be those who learn to trade in a new unit: tokens, small, invisible packets of intelligence flowing through a network that finally does more than carry bits. It thinks.


Prof. Merouane Debbah

Researcher, Educator and Technology Entrepreneur.

Tags

#AI #Telecommunications #Edge Computing