If you’ve used ChatGPT or Perplexity search, you know that the ability to search the web and include citations dramatically improves AI-powered chatbots. Results are best when they include timely information, and that is what a web search provides, reducing so-called hallucinations (i.e., when generative AI outputs incorrect information).
That’s why the French startup Linkup is building an API that allows developers to access web content from distinct, trusted sources and delivers the results to a large language model (LLM) to enrich its answers. Many AI developers call this workflow retrieval-augmented generation (or RAG).
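To make the RAG workflow concrete, here is a minimal Python sketch. It is not Linkup’s API: the `Document` shape, the toy word-overlap retriever, and the example.com corpus are all assumptions made for illustration. A real pipeline would call a search API for the retrieval step and send the resulting prompt to an LLM.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """One piece of retrieved web content (hypothetical shape)."""
    title: str
    url: str
    text: str


def retrieve(query: str, corpus: list[Document], top_k: int = 2) -> list[Document]:
    """Toy retriever: rank documents by how many query words they share."""
    words = set(query.lower().split())

    def score(doc: Document) -> int:
        return len(words & set(doc.text.lower().split()))

    ranked = sorted(corpus, key=score, reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, documents: list[Document]) -> str:
    """Place the retrieved passages before the question so the LLM can cite them."""
    sources = "\n".join(
        f"[{i}] {doc.title} ({doc.url}): {doc.text}"
        for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


# Placeholder corpus; in practice this content would come from a search API.
corpus = [
    Document("Funding news", "https://example.com/a", "the startup raised a seed round"),
    Document("Weather", "https://example.com/b", "rain is expected tomorrow"),
]

question = "How much did the startup raise?"
prompt = build_prompt(question, retrieve(question, corpus, top_k=1))
print(prompt)
```

The augmentation step is just string assembly: the model itself is unchanged and simply receives fresher context along with numbered sources it can cite.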
More importantly, the future of web scraping by bots is uncertain. Where there is no pre-existing financial agreement between content publishers and the entities scraping their pages, these bots are lifting content from the open web without payment. Many people are unhappy with this arrangement, which is why regulatory scrutiny around AI training is increasing.
There are now also high-profile legal disputes in the works, such as the ongoing lawsuit between OpenAI, the maker of ChatGPT, and The New York Times, so the situation around web scraping may change in the near future. That’s why OpenAI has signed multi-year content licensing agreements with major publishers like AP, Axel Springer, Condé Nast, El País, Financial Times, Le Monde, and others.
“We started the company at a time when OpenAI was making deals with news sources…for training or inference purposes, to augment answers from OpenAI’s models and products,” Philip Mizrahi, co-founder and CEO of Linkup, told TechCrunch. “Well, this is something great, because we finally have AI companies that pay for their resources,” he said, explaining what prompted the founders to create a company to connect AI developers with content providers for, hopefully, their mutual benefit.
Right now, content publishers face a difficult decision about how to respond to GenAI’s thirst for data. They can block web scrapers using the (non-legally binding) robots.txt metadata file, which tells crawlers, including those collecting AI training data, which pages they may access. They can sue AI companies that they believe have infringed their copyrights. Alternatively, they can allow bots to index their content freely (i.e., YOLO?). Or they may be able to license content to AI developers to get some compensation for their intellectual property.
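For readers unfamiliar with robots.txt, here is a small sketch of how a well-behaved crawler would interpret it, using Python’s standard `urllib.robotparser` module. The robots.txt content and the example.com URL are made up for illustration; `GPTBot` is used as an example of an AI crawler user agent, and `may_fetch` is a hypothetical helper name. Compliance is voluntary, which is why the file is not legally binding.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_fetch("GPTBot", "https://example.com/article"))        # False
print(may_fetch("SomeOtherBot", "https://example.com/article"))  # True
```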
But there are thousands of AI companies (or technology companies using AI) that don’t have the scale and reach of OpenAI. Meanwhile, the great thing about the web is that it has a long tail of content publishers. That means a small content publisher usually does not have the financial resources to file a lawsuit. It also means that switching from a scraping model to a licensing model will be difficult across millions of websites.
That’s why Linkup is not just a technical solution. It’s also a marketplace, an intermediary between content publishers and companies that want to augment their LLM answers with web content.
Linkup signs content licensing deals with publishers and integrates with their content management system (CMS) so it can fetch content from publishers without any scraping. Linkup then pays content partners based on the number of times their content is accessed by Linkup customers.
“We’re really targeting applications that apply AI to their own products,” Mizrahi said. “So, a typical use case is I’m building an AI application using a model from Mistral or OpenAI. I’m building my own pipeline, but I need to enrich that pipeline with external information.”
As a side note, while ChatGPT can browse the web, GPT models cannot. OpenAI provides a widely popular application (ChatGPT) and LLMs that developers can use through the GPT API. But web search is a feature of ChatGPT, not of the underlying models.
“An example I like is one of our clients… built an internal app for their salespeople,” Mizrahi also told us. “On the one hand, they have put in all the advantages of their own products. Thanks to us, they receive fresh and high-quality information about their potential opportunities and feed it into a Mistral LLM. The Mistral LLM will create a kind of sales pitch for the sales representatives, which they will have in front of them when they make calls with potential customers.”
Initially, Linkup decided to focus on corporate and business information. In addition to news sites, the startup works with knowledge databases – think Statista, Xerfi, or other resources in the same vein.
It’s not the only startup working to bring premium content to LLMs through behind-the-scenes licensing contracts. The most obvious competitor is ScalePost, a startup that works with Perplexity to speed up licensing deals with publishers.
Linkup raised a €3 million seed round ($3.2 million at current exchange rates) a few months ago from Axeleo Capital, Motier Ventures, Seedcamp, and a hundred angel investors. There are about 10 people working at the startup right now, and it plans to hire 10 more over the next year.