Wednesday, March 5, 2025

Contextual AI’s new AI model crushes GPT-4o in accuracy: here’s why it matters




Contextual AI unveiled its grounded language model (GLM) today, claiming it delivers the highest factual accuracy in the industry by outperforming leading AI systems from Google, Anthropic and OpenAI on a key benchmark for truthfulness.

The startup, founded by the pioneers of retrieval-augmented generation (RAG) technology, reported that its GLM achieved an 88% factuality score on the FACTS benchmark, compared with 84.6% for Google’s Gemini 2.0 Flash, 79.4% for Anthropic’s Claude 3.5 Sonnet and 78.8% for OpenAI’s GPT-4o.

While large language models have transformed enterprise software, factual inaccuracies, often called hallucinations, remain a critical obstacle to enterprise adoption. Contextual AI aims to solve this by creating a model specifically optimized for enterprise RAG applications where accuracy is paramount.

“We knew that part of the solution would be a technique called RAG, retrieval-augmented generation,” said Douwe Kiela, CEO and cofounder of Contextual AI, in an exclusive interview with VentureBeat. “And we knew that because RAG is originally my idea. What this company is about is really about doing RAG the right way, to sort of the next level of doing RAG.”

The company’s focus differs significantly from general-purpose models like ChatGPT or Claude, which are designed to handle everything from creative writing to technical documentation. Contextual AI instead targets high-stakes enterprise environments where factual precision outweighs creative flexibility.

“If you have a RAG problem and you’re in an enterprise setting in a highly regulated industry, you have no tolerance whatsoever for hallucination,” explained Kiela. “The same general-purpose language model that’s useful for the marketing department is not what you want in an enterprise setting where you’re much more sensitive to errors.”

A benchmark comparison showing Contextual AI’s new grounded language model (GLM) outperforming competitors from Google, Anthropic and OpenAI on factual accuracy tests. The company claims its specialized approach reduces AI hallucinations in enterprise settings. (Credit: Contextual AI)

How Contextual AI makes ‘groundedness’ the new gold standard for enterprise language models

The concept of “groundedness” (ensuring AI responses stick strictly to information explicitly provided in the context) has emerged as a critical requirement for enterprise AI systems. In regulated industries like finance, healthcare and telecommunications, companies need AI that either delivers accurate information or explicitly acknowledges when it doesn’t know something.

Kiela offered an example of how this strict groundedness works: “If you give a recipe or a formula to a standard language model, and somewhere in it, you say, ‘but this is only true for most cases,’ most language models are still just going to give you the recipe assuming it’s true. But our language model says, ‘Actually, it only says that this is true for most cases.’ It’s capturing this extra little bit of nuance.”

The ability to say “I don’t know” is a crucial one for enterprise settings. “Which is really a very powerful feature, if you think about it in an enterprise setting,” Kiela added.
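Contextual AI’s GLM bakes this behavior into the model itself, and its internals are proprietary. But the behavior Kiela describes can be roughly approximated around any generic chat model with a strict system prompt and an explicit refusal fallback. A minimal sketch under those assumptions (the prompt wording and the `call_llm` callable are illustrative stand-ins, not the company’s implementation):

```python
# Sketch: approximating "groundedness" around a generic chat model.
# The real GLM trains this behavior in; here we only wrap a prompt.

GROUNDED_SYSTEM_PROMPT = (
    "Answer ONLY using the provided context. Preserve any caveats the "
    "context contains (e.g. 'this is only true for most cases'). If the "
    "context does not contain the answer, reply exactly: I don't know."
)

def grounded_answer(question: str, context: str, call_llm) -> str:
    """call_llm(system, user) -> str is a stand-in for any chat-model API."""
    user = f"Context:\n{context}\n\nQuestion: {question}"
    answer = call_llm(GROUNDED_SYSTEM_PROMPT, user).strip()
    # If the model returns nothing usable, refuse explicitly rather than guess.
    return answer or "I don't know."
```

The refusal fallback mirrors the “I don’t know” behavior the article highlights: an empty or unusable generation becomes an explicit admission instead of a fabricated answer.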

Contextual AI’s RAG 2.0: A more integrated approach to processing company information

Contextual AI’s platform is built on what it calls “RAG 2.0,” an approach that moves beyond simply connecting off-the-shelf components.

“A typical RAG system uses a frozen off-the-shelf model for embeddings, a vector database for retrieval, and a black-box language model for generation, stitched together through prompting or an orchestration framework,” according to a company statement. “This leads to a ‘Frankenstein’s monster’ of generative AI: the individual components technically work, but the whole is far from optimal.”
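For contrast, the stitched-together baseline the statement criticizes typically looks like the following in code: separately built embed, retrieve and generate stages, glued by a prompt. This is a toy sketch, with cosine similarity standing in for a vector database and the `embed` and `generate` callables as placeholders for a frozen embedding model and a black-box LLM; it is not any vendor’s actual API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def naive_rag(question, docs, embed, generate, k=3):
    """Frozen embedder -> nearest-neighbor retrieval -> black-box LLM.
    Each stage is built separately, which is the 'Frankenstein' critique:
    nothing is optimized end to end."""
    q_vec = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(embed(d), q_vec), reverse=True)
    context = "\n\n".join(ranked[:k])
    return generate(f"Context:\n{context}\n\nQuestion: {question}")
```

Each component works in isolation, but retrieval errors flow unchecked into generation, which is exactly the failure mode RAG 2.0’s joint optimization targets.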

Instead, Contextual AI jointly optimizes all components of the system. “We have this mixture-of-retrievers component, which is really a way to do intelligent retrieval,” Kiela explained. “It looks at the question, and then it thinks, essentially, like most of the latest generation of models, it thinks, [and] first it plans a strategy for doing a retrieval.”

This entire system works in coordination with what Kiela calls “the best re-ranker in the world,” which helps prioritize the most relevant information before sending it to the grounded language model.
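Contextual AI has not published how its re-ranker works, but re-ranking in general is a second, more expensive relevance pass over the retriever’s candidates before they enter the model’s context window. A generic sketch (the `score` callable is a stand-in for something like a cross-encoder model, not the company’s component):

```python
def rerank(question, passages, score, top_k=5):
    """Re-score retrieved passages with a finer-grained relevance function
    and keep only the best ones for the generator's context window."""
    scored = [(score(question, p), p) for p in passages]
    # Stable sort: among equal scores, retrieval order is preserved.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]
```

The split matters because first-stage retrieval optimizes for recall over millions of documents, while the re-ranker can afford a slower, more precise model on just a handful of candidates.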

Beyond plain text: Contextual AI now reads charts and connects to databases

While the newly announced GLM focuses on text generation, Contextual AI’s platform has recently added support for multimodal content including charts, diagrams and structured data from popular platforms like BigQuery, Snowflake, Redshift and Postgres.

“The most challenging problems in enterprises are at the intersection of unstructured and structured data,” Kiela noted. “What I’m mostly excited about is really this intersection of structured and unstructured data. Most of the really exciting problems in large enterprises are smack bang at the intersection of structured and unstructured, where you have some database records, some transactions, maybe some policy documents, maybe a bunch of other things.”
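In practice, the intersection Kiela describes usually means assembling one grounding context from both a database query result and retrieved document text before generation. A hypothetical sketch using SQLite as a lightweight stand-in for warehouses like Snowflake or BigQuery (the schema-free query and section labels are invented for illustration):

```python
import sqlite3

def build_mixed_context(db_path, sql, doc_chunks):
    """Combine structured rows (transactions, records) with unstructured
    text (e.g. policy documents) into a single grounding context string."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    structured = "\n".join(str(row) for row in rows)
    unstructured = "\n\n".join(doc_chunks)
    return f"Structured data:\n{structured}\n\nDocuments:\n{unstructured}"
```

The resulting string would then be passed to the grounded model as its context, so an answer about, say, a refund can cite both the transaction record and the policy clause that governs it.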

The platform already supports a variety of complex visualizations, including circuit diagrams in the semiconductor industry, according to Kiela.

Contextual AI’s future plans: Creating more reliable tools for everyday business

Contextual AI plans to release its specialized re-ranker component shortly after the GLM launch, followed by expanded document-understanding capabilities. The company also has experimental features for more agentic capabilities in development.

Founded in 2023 by Kiela and Amanpreet Singh, who previously worked on Meta’s Fundamental AI Research (FAIR) team and at Hugging Face, Contextual AI has secured customers including HSBC, Qualcomm and the Economist. The company positions itself as helping enterprises finally realize concrete returns on their AI investments.

“This is really an opportunity for companies who are maybe under pressure to start delivering ROI from AI to start looking at more specialized solutions that actually solve their problems,” Kiela said. “And part of that really is having a grounded language model that’s maybe a bit more boring than a standard language model, but it’s really good at making sure that it’s grounded in the context and that you can really trust it to do its job.”

