There’s a lot of discussion about how this can work, and even some pioneering work, but no firm conclusions.
Developers are spending increasing amounts of time and effort in creating machine-learning (ML) models for use in a wide variety of applications. While this will continue as the market matures, at some point some of these efforts might be seen as reinventing models over and over.
Will developers of successful models ever have a marketplace in which they can sell those models as IP to other developers? Are there developers that would use the models? And are there any models mature enough to go up for sale?
“From an end-customer perspective, there are definitely a lot of companies, especially in the IoT domain, that do not have enough time and energy and resources to spend time training networks, because training networks is a heavy exercise,” said Suhas Mitra, product marketing director for Tensilica AI products at Cadence.
As to potential areas that might be ready to go, “There are definitely applications in image and natural-language understanding where pre-trained models can work well out of the box,” said Sree Harsha Angara, product marketing manager for IoT, compute, and security at Infineon.
Many industry observers can envision a future marketplace for AI models. Some say that, informally, they’re already in place. “I’m fairly certain such concepts exist already,” said Angara. “It may or may not be like a marketplace, but people absolutely do sell models today.”
Plans for a more formal market are already underway, involving not only fully trained models, but also pieces of models that can be assembled — or ensembled. But it makes an enormous difference whether the models are hosted in the cloud or at the edge in a device. In the latter case, license restrictions may get in the way of a meaningful market.
Eliminating duplication
Early days of a new technology tend to be marked by a significant amount of duplicative development. With silicon circuits we went for decades with an attitude of, “We do all of our own circuits.” At some point, however, people started noticing that they kept redoing circuits that weren’t fundamental to the value of the chip.
For instance, if a chip needed a PCIe port simply to get data in and out, it would have meant studying the massive PCI spec to figure out how to do it. This would be a significant burden and time sink for a chip feature that had a supporting role, not a starring role.
It took a while for the industry to catch on, but today there’s a robust market in circuit IP. There’s a general expectation that if it’s not a differentiating circuit, and if good IP is available for a reasonable price, then you use the IP rather than wasting internal resources on that circuit.
We’re now in the early days of machine learning, and everyone is inventing their own thing. But certain areas, like vision, may start to even out as new entrants prove unable to demonstrate significant benefits over those that came before.
And that reinventing isn’t cheap. “If you are starting from scratch, you might spend months or a year and tens or hundreds of thousands of dollars,” noted Ashutosh Pandey, lead principal ML/audio systems engineer at Infineon. “Whereas with a pre-trained model, somebody has done this for you.”
Are we at a point where an IP market for trained AI models might start to make sense? At least one company thinks so. But there are a number of opinions out there as to whether this can happen and what a market should look like.
What has the most value in the AI chain?
If you ask people today what the most valuable thing is for the best AI development, you typically get one word: data. “At the edge, I’d say training data is even more important than the model itself,” said Angara.
Training data is the big un-equalizer, because a few enormous companies have outsized troves of data, while others make do with what feel like tiny amounts by comparison. That said, open-source data is available from a number of organizations. The dominance of big tech when it comes to data is, to some extent, being answered by a willingness to share data by others.
The other question is whether AI models are, by definition, differentiating. These days they seem to be. Applications like classification, while common, haven’t yet settled into the status of “fully solved problem.” Companies are still differentiating themselves by the quality of their classification.
It’s not clear how long that will last. Applications like autonomous driving need classification, but it’s what’s done with the classes that will become more important. One can imagine a not-too-distant time when the attitude is, “Yeah, anyone can do classification, but not everyone can do what I’m doing with the results.”
In that case, it doesn’t make a lot of sense to reinvent a classification model over and over. This is, historically, the kind of situation where an IP market can gain a foothold.
What would a market need?
The obvious deliverable from a market would be the model itself. Even better would be the model and the data that trained it, although that could prove to be a big ask in many cases. But it can be useful in edge applications for optimization. “Training data is important if you want to create something that’s best in class, lowest power,” said Infineon’s Pandey.
Different companies might be good at different things, making it beneficial to be able to pick and choose best-in-class models. “Let’s say the user wants to do speech translation,” said Archil Cheishvili, founder and CEO of GenesisAI. “For that, you need speech recognition and translation. Google might have amazing speech recognition, but its translation might be pretty bad, and Facebook’s might be worse.”
So prior to the deliverables, every market needs a way of evaluating the quality of the product. Evaluation might first mean looking at published benchmarks, provided those benchmarks correlate well with a wide variety of applications. If not, or if trust in the benchmarks isn’t yet well established, potential purchasers probably will want some way of demonstrating that a candidate model will work for them.
Another requirement is the need to be able to modify the models for a specific application. Transfer learning makes possible the modification of a pre-trained model with new data. But some transparency on the structure of the model is necessary so the purchaser knows which layers to freeze and which to retrain. Would such transparency reveal competitive secrets in a model that worked particularly well?
There is also some precedent for this in circuit IP, although it’s a little different. For something like a PCIe transceiver circuit, there are many possible customizations and personalizations that may be needed for a given instance. Such models typically are highly parameterized so that customers can tweak the circuit to do exactly what they need.
With an AI model, it’s not necessarily a matter of parameterization (understanding that “parameter” here is different from the other AI usage of the word, which refers to the model weights). But it may be possible, for instance, to create a way of exposing abstracted internal information, like some way of identifying layers to freeze when retraining. Tools would need a way to respect that setting when retraining a model.
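As a rough illustration, a transfer-learning flow with a seller-supplied “freeze list” might look like the following minimal sketch. It assumes a PyTorch model; the specific network, the layer names, and the freeze-list format are assumptions for illustration, not a description of any existing marketplace convention.

```python
# Minimal transfer-learning sketch (PyTorch). The model, the layer prefixes, and
# the idea of a seller-supplied "freeze list" are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)   # stand-in for a purchased model

# Hypothetical metadata a seller might ship with the model:
# which layers to freeze and which to retrain for a new task.
freeze_prefixes = ("conv1", "bn1", "layer1", "layer2", "layer3")

for name, param in model.named_parameters():
    # Freeze everything the seller marked as fixed feature-extraction layers.
    param.requires_grad = not name.startswith(freeze_prefixes)

# Replace the classification head for the buyer's own classes.
model.fc = nn.Linear(model.fc.in_features, 5)   # e.g., 5 new classes

# Only the unfrozen parameters are handed to the optimizer for retraining.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```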
Purchasers also might want some test data for use in evaluating the model. Typically, a portion of the training data is set aside for testing because that’s how accuracy is determined. One could suggest that the IP seller might need to make testing data available.
A possible concern with that approach would be that the seller might have cherry-picked that data, so an alternative would be the expectation that a purchaser comes with their own test data. That might give a better-trusted result. It also would relieve the seller of the need to provide that data.
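In practice, buyer-side evaluation can be quite simple. The sketch below assumes a PyTorch model and a DataLoader over the buyer’s own labeled test examples; both are placeholders.

```python
# Sketch of scoring a candidate model on the buyer's own held-out test set,
# rather than on seller-supplied data.
import torch

def accuracy(model, test_loader, device="cpu"):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():                      # no gradients needed for evaluation
        for inputs, labels in test_loader:
            outputs = model(inputs.to(device))
            preds = outputs.argmax(dim=1)      # predicted class per sample
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total
```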
The elephant in the room: free stuff
There is one enormous potential barrier to a thriving model marketplace — the fact that so many models are available for free. So why would anyone pay?
“Models are given away for free many times,” said Ron Lowman, strategic marketing manager, IP at Synopsys. “Their specific proprietary piece is the data that trains the model. Some models are considered proprietary, but have little to no market acceptance today because free models thus far have been what customers are using.”
This places some conditions on why a marketplace might have value. If the models being sold are pre-trained and ready to use off the shelf, the buyer has no need to acquire training data. If such a model is free, there’s no incentive to buy one.
If a model isn’t pre-trained, the buyer effectively is purchasing a service in which the seller trains a model for them. This could have value for smaller companies that lack the resources to do their own training, even if they have their own data.
“Training is so critical to get the accuracy you want that I think you’ll get service companies that say, ‘We’ll help you with the training. Give us your datasets,’” said Dana McCarty, vice president of sales and marketing for inference products at Flex Logix.
“It could be a lot like most software now,” said Sam Fuller, senior director of marketing at Flex Logix. “So much of it is free. But if you really want to make a solution, you’ve got to pay somebody to figure it out and put it together, because it’s really hard.”
If a paid model needed some level of retraining, then it’s competing with all of the other free models that need retraining and, again, it becomes more of a service sale than a sale of the model itself.
Fuller cited one company that had found some success, but its AI business involves modification services rather than a set model. “They got started by counting sheep in a field,” he said. “And they realized that nobody really had a system to do this. So they built the system and adapted it. If you want to count cows or cars or motorcycles, you click, ‘I need a model that does these things.’ And then they would presumably build that model for you.”
Hugging Face also has a marketplace that relies on wrap-around services for its revenue stream.
This issue of competing with freeware is one that splits opinions on whether a market could work or not. “I don’t think people are going to pay for models because there are so many models out there that are free,” said McCarty.
A cloud-based marketplace
Such a marketplace in the cloud would operate fundamentally differently from one targeting edge devices, and in general it’s simpler for buyers and sellers. “To be a model provider or a solution provider for the cloud or for enterprise is much easier,” said Nick Ni, director of product marketing, AI and software at Xilinx. “You don’t have to worry about any of the complexity of edge-based design. You just have to conform to the hosting specs.”
Hyperconnectivity in the cloud means that, rather than outright buying a model and instantiating it on one’s own website, a single instantiation by the seller can serve the buyer through an API. Such an API-based approach is typical even today for cloud-based apps, and latency in the millisecond range is workable for such applications.
The buyer’s application would send an inference query to the seller’s model without the buyer ever taking formal ownership of the model. Business models could vary, but they likely would be some form of subscription, including the possibility of charging according to the number of API requests per time period.
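As a sketch of that pattern, an inference query might look something like the following. The endpoint, payload format, and API key are hypothetical placeholders used only to illustrate the shape of the interaction.

```python
# Sketch of a cloud-hosted inference call. The endpoint, payload, and key are
# hypothetical; they illustrate the pattern, not a real marketplace API.
import requests

API_URL = "https://models.example.com/v1/vision/classify"   # hypothetical endpoint
API_KEY = "buyer-subscription-key"                           # hypothetical key

def classify(image_bytes):
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"image": image_bytes},       # the query payload
        timeout=2.0,                         # latency budget, in seconds
    )
    response.raise_for_status()
    # Assumed response format, e.g. [{"class": "cat", "score": 0.97}]
    return response.json()["labels"]
```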
Exactly how a buyer would evaluate models in order to select one could be as simple as allowing some number of trial inferences to get a sense of accuracy. Arrangements might be necessary to allow enough samples for a statistically meaningful result. The results of such an evaluation would be most meaningful when the candidate model was to be used as is, with no retraining.
Retraining would likely be done by the seller using the buyer’s data. This would make evaluation a little more difficult, since, by definition, the candidate model wouldn’t be good at the final application prior to retraining.
In addition, sellers could make their models available as teacher models, allowing other models to be trained as students by querying the teacher. This, too, takes the form of a series of inference queries.
“Instead of me needing to have data to be able to teach this model how to recognize colors, frames, and so on, I might work with a student model, which might be really good just at recognizing frames or colors,” said Cheishvili.
Multiple such student models could work together, providing not a single best-in-class model, but best-in-class components of one. “If you want to build the best image recognition tool, look for solutions that are really good at recognizing colors, recognizing frames — and then try to combine them,” Cheishvili continued. “You produce the best image recognition a lot cheaper.”
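The teacher-student pattern described here resembles knowledge distillation: the student is trained against the teacher’s responses, obtained purely through inference queries. Below is a minimal sketch, assuming a PyTorch student and a hypothetical query_teacher() wrapper around a marketplace API.

```python
# Sketch of training a local student against a remotely hosted teacher.
# query_teacher() is a stand-in for a marketplace API call that returns logits.
import torch
import torch.nn.functional as F

def distillation_step(student, optimizer, inputs, query_teacher, temperature=2.0):
    # Teacher outputs come back from the remote model; no access to its
    # weights or training data is needed.
    with torch.no_grad():
        teacher_logits = query_teacher(inputs)          # hypothetical API wrapper

    student_logits = student(inputs)
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)                              # standard distillation scaling

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```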
Multiple models could be chained together through a series of API calls, with the results of one model being used in the API call for a subsequent model.
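In code, such a chain is simply one API response feeding the next request — sketched here with the speech-translation example mentioned earlier, using hypothetical endpoints and response fields.

```python
# Sketch of chaining two hosted models: the transcript returned by a speech
# model becomes the input to a translation model. Endpoints and response
# formats are hypothetical.
import requests

def transcribe_then_translate(audio_bytes, target_lang="en"):
    transcript = requests.post(
        "https://models.example.com/v1/speech/transcribe",   # hypothetical
        files={"audio": audio_bytes},
        timeout=5.0,
    ).json()["text"]

    translation = requests.post(
        "https://models.example.com/v1/text/translate",      # hypothetical
        json={"text": transcript, "target": target_lang},
        timeout=5.0,
    ).json()["text"]

    return translation
```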
A cloud-based example
One early attempt at a cloud marketplace by GenesisAI implements a standard API for inference requests. But, in their case, these aren’t necessarily pre-trained models. In a process that amounts to something of a gamble, buyers post training data, a minimum success metric, and a reward. Multiple sellers can then compete by using the buyer’s data to train their model. Critically, each competing vendor must post a stake.
“We want to create an incentive structure for people to exchange data and trade services,” said GenesisAI’s Cheishvili. “They do this in a way such that they don’t give up anything crazily proprietary.”
Ideally, one of them has the best result, and that winner collects all of the stakes posted for the competition. In the event that no seller meets the minimum level of success, there is no winner, and the buyer collects the stakes.
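A minimal sketch of that settlement logic follows, with illustrative data structures; the buyer’s posted reward is left out for simplicity, and the actual marketplace’s rules may differ.

```python
# Sketch of the winner-take-stakes mechanism described above.
def settle_competition(entries, minimum_accuracy):
    """entries: list of (seller, achieved_accuracy, stake)."""
    total_stakes = sum(stake for _, _, stake in entries)
    qualified = [e for e in entries if e[1] >= minimum_accuracy]

    if not qualified:
        # No seller met the buyer's bar: the buyer collects all stakes.
        return {"winner": None, "payout_to": "buyer", "amount": total_stakes}

    winner = max(qualified, key=lambda e: e[1])   # best result collects everything
    return {"winner": winner[0], "payout_to": winner[0], "amount": total_stakes}

# Example: two sellers each stake 1,000 against a 0.92 accuracy target.
print(settle_competition([("A", 0.95, 1000), ("B", 0.90, 1000)], 0.92))
```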
This raises the question of how eager sellers would be to compete, since losing could be worse than simply not winning the sale: It would mean losing the stake. Too many losses would become burdensome.
That said, it’s important to separate the functioning of the marketplace once purchases have been made from the process of winning a deal. It could be that the means of competing for deals changes without affecting the functioning of models already in operation.
After a deal is struck, there’s the ongoing business model. “The API suppliers will be able to monetize their APIs as soon as users start to subscribe and to process requests,” explained Cheishvili. “The subscription mechanism can be integrated in a number of ways — either tied to the number of requests or just monthly or yearly subscriptions with unlimited requests.”
One risk of not being able to take ownership of a model is the prospect of the seller going out of business or, for some other reason, removing the model from the marketplace.
“If an API provider goes out of business, there are two options,” said Cheishvili. “The API subscribers have to find an alternative API on our marketplace that does the same job (and there are rarely APIs that are totally unique, where no other API performs the same tasks). Alternatively, I would not rule out the possibility of the departing API provider giving the option to buy the source code.”
Models for the edge
A marketplace for models that would be embedded in edge devices would need to work very differently. It would not be possible to require a connection to a cloud-based model, querying via API. The latency would be unacceptable — and the device would be unable to function without a connection.
Here, it’s likely that a buyer would need to take actual ownership of the model. A chain of models would need to work not by chaining API calls, but rather by models placing outputs in locations known to the system, from which they can be used as inputs for a next model.
Fig. 1: On the left, an edge device will need models resident within the device. Outright ownership is more likely in that scenario. On the right, cloud models may interact via API, with no transfer of ownership on the models. Source: Bryon Moyer/Semiconductor Engineering
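A sketch of that on-device pattern: each stage writes its result to a location the system knows about, and the next stage reads its input from there, with no network round trip. The pipeline class and model objects below are placeholders for whatever on-device runtime is actually in use.

```python
# Sketch of on-device chaining via a shared "blackboard" of intermediate results.
class EdgePipeline:
    def __init__(self, stages):
        # stages: list of (name, model, input_key, output_key)
        self.stages = stages
        self.blackboard = {}        # the "known location" for intermediate outputs

    def run(self, input_key, value):
        self.blackboard[input_key] = value
        for name, model, in_key, out_key in self.stages:
            # Each model reads its input from the blackboard and writes its
            # output back for the next stage to consume.
            self.blackboard[out_key] = model(self.blackboard[in_key])
        return self.blackboard

# Example wiring (models are placeholders):
# pipeline = EdgePipeline([
#     ("detect",   detector_model,   "frame", "boxes"),
#     ("classify", classifier_model, "boxes", "labels"),
# ])
# results = pipeline.run("frame", camera_frame)
```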
In many regards, such a marketplace could function in a manner very similar to circuit IP. Digital rights management (DRM) capabilities could protect the IP in advance of the purchase, allowing evaluation without giving away the store. Xilinx’s app store uses this approach.
But there’s one major catch for such a market: the model is being sold for money, with ownership transferred. That may violate license restrictions on any model derived from an open-source network.
Most familiar networks — like ResNet and YOLO and Inception — are made available for free, but under restrictions that may vary according to the specific license. In general, those licenses forbid the outright selling of the model or its derivatives. In fact, some licenses require that any derived versions be made available via open source under similar licensing.
“If you take open-source networks and modify them, you cannot ask for money for that thing,” said Mitra. “Technically it is derivative and must be considered as open source.”
Because so many of the models on the internet these days are derived from some of these fundamental models, it would not be possible to train them and then sell them outright without finding some clever loophole in the restrictions. It’s possible to sell a system making use of the model, but not the model in isolation.
We’re talking about very popular networks here. “When we talk about YOLO, ResNet, and stuff like that, we’re generally dealing with models that are trained on the standard image databases.” Even the training data is open-source.
In reality, no model is useful by itself. It is usually part of an overall pipeline in a larger application. And that application can be sold. “People will never take a model directly to production, because you have to add your logic to build up your pipeline to be useful,” said Ni.
This is where the services angle for training or any other development work becomes important. Companies like Red Hat became successful not by monetizing specific instances of Linux, which the licensing terms didn’t allow, but by providing development and management services that companies found valuable.
“They’re trying to sell IP, but with a full expectation that if they get a lead customer, they’ll probably sign an NRE contract on the design services and really build up the whole customization,” added Ni.
So there is precedent for making money around the proliferation of open-source models, but not from the models themselves.
In the specific example of the offerings on Xilinx’s app store, any pure models originate as custom models, not from open source. Any open-source models come as a larger package, within a more complete application that includes the model and other circuitry around it. In that case, it’s not the bare model being monetized. It’s the complete application.
Conclusion
Asking technologists about the viability of selling models makes it clear that there is no consensus yet. Some will say such markets already exist. Others will say they don’t exist, but ultimately will. And still others doubt they can ever exist alongside free models.
Some simply aren’t seeing the signs of a future market. “I don’t see anybody saying, ‘Hey, I’ve got an even better vision model that’s proprietary that I’ll license to you,’” said Fuller.
That suggests such markets, if they emerge, lie further in the future. That shouldn’t be too surprising in these early days of machine learning, because in many cases we haven’t yet reached the point of constantly reinventing wheels. But people are thinking about this now, and in some cases putting markets together. How they work is likely to evolve as best practices become evident, though it’s likely to be a few years before we’re at that point.