Converge Bio’s ‘Everything Store’ for Biotech LLMs Brings in $5.5M Seed

AI is finding its way into every corner of biotech and pharmaceutical research, but as in other industries, implementing it is not quite as easy as one would like. Converge Bio has built a tool that helps companies make biology-focused LLMs actually work, from “enriching” their data to explaining their answers. The company has raised $5.5 million in a seed round to expand its product range.

“A model is just a model. That’s not enough,” said Dov Gertz, CEO and co-founder. “A pipeline must be created so companies can actually use the model in their R&D process. The market is highly fragmented, but pharmaceutical and biotechnology companies want to consume this technology in a standardized way and in one place. We want to be that place.”

If you’re not a machine learning engineer working in drug discovery, this may not be a familiar problem. But fundamentally, there are powerful foundation models, large language models trained not on books and the internet but on huge databases of DNA, protein structures, and genomics.

These are powerful and versatile models, but like the LLMs behind products such as ChatGPT and Cursor, they require a lot of work to get into a form that people can actually use day in and day out. This work is particularly difficult in specialized fields such as microbiology or immunology. Taking a “raw” LLM trained on billions of protein sequences and turning it into something that lab techs can use as part of their regular research is a non-trivial problem.

As an example, Gertz pointed to antibody research. There are LLMs trained on antibody biology, but they are very general. Converge Bio offers a series of improvements that can be applied securely and using your company’s own IP.

From left: Ido Wiener, Chief Scientific Officer, Converge Bio; Dov Gertz, CEO; Oded Kalev, CTO. Image credits: Omar Hacohen/Converge Bio

The first is “data enrichment,” which augments the antibody LLM with relevant data such as antibody-antigen interactions and protein-protein interactions. Then, loaded with this more specific knowledge, the model can be fine-tuned to the specific antigen the team is looking to target, for which they may have proprietary data.

“Now we have an application: the input is a sequence, and the output is a binding affinity,” Gertz said. The platform then provides another important layer: explainability. Researchers can drill down into the output to discover not only that “this sequence works better than this one” but also which part of the sequence, down to the amino acid or base pair level, appears to make it work better.
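To make the idea of position-level explanation concrete, here is a minimal sketch of one generic technique, an in-silico alanine scan, which is not necessarily what Converge Bio uses. The `toy_affinity` function is a hypothetical stand-in for a fine-tuned model (sequence in, score out), and its weights are invented purely for illustration.

```python
from typing import Callable, List


def toy_affinity(seq: str) -> float:
    """Hypothetical stand-in for a fine-tuned model: sequence in, score out."""
    # Assumption for illustration only: aromatic residues raise the score.
    weights = {"W": 2.0, "Y": 1.5, "F": 1.0}
    return sum(weights.get(aa, 0.1) for aa in seq)


def alanine_scan(seq: str, predict: Callable[[str], float]) -> List[float]:
    """Return the drop in predicted score when each position becomes alanine."""
    baseline = predict(seq)
    drops = []
    for i in range(len(seq)):
        mutant = seq[:i] + "A" + seq[i + 1:]
        drops.append(baseline - predict(mutant))
    return drops


seq = "GYWAS"  # made-up five-residue sequence
attributions = alanine_scan(seq, toy_affinity)
# Index of the position whose mutation hurts the predicted score the most.
top = max(range(len(seq)), key=lambda i: attributions[i])
print(top, seq[top])
```

A larger drop at a position suggests the model leans on that residue more heavily, which is the kind of signal a domain expert could then check against what they know about the protein.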

Finally, it generates new sequences that provide improved results, again with explainability. Gertz noted that interpretability has surprised them with its popularity among customers, which makes sense, because it lets experts apply their domain expertise (e.g., protein interactions) to this newer, more obscure area of bioinformatics and machine learning.

Image credits: Converge Bio

Converge uses many of the free and open source foundation models currently available, but is also creating its own. It already has a proprietary process for the explainability portion, Gertz said. The “curriculum” for data enrichment is entirely theirs as well, and it is not a trivial process. He noted that training methodologies are among the few closely guarded secrets of the most successful AI companies.

This is part of the moat they hope to build, along with the reality of it. As Gertz puts it: “This is probably the biggest biotech opportunity in five decades.”

However, many, perhaps most, biotech companies lack a dedicated solution for doing LLM-related work in their field, and are actively working in areas where generic solutions do not apply.

“The idea is to be a repository of everything gen AI in biotech, and then use that as a wedge to deliver more over time,” Gertz said. “The behavior in pharma and bio is that once they have relationships with a vendor that they trust, they want to use it for other use cases, whether that’s antibody design or vaccine design. That’s why I think that’s the best situation at this moment in the market.”

Investors seem to agree, putting in $5.5 million in a seed round led by TLV Partners.

The company will use the money to hire and to acquire customers, as startups often do at this stage, but it will also publish a scientific paper on designing antibodies (using its own systems, of course) and train a “suitable foundation model.”
