Qodo, an AI-driven code quality platform formerly known as Codium, has announced the release of Qodo-Embed-1-1.5B, a new open-source code embedding model that delivers state-of-the-art performance while being significantly smaller and more efficient than competing solutions.
Designed to enhance code search, retrieval and understanding, the 1.5-billion-parameter model achieves top-tier results on industry benchmarks, outperforming larger models from OpenAI and Salesforce.
For enterprise development teams managing vast and complex codebases, Qodo’s innovation represents a leap forward in AI-driven software engineering workflows. By enabling more accurate and efficient code retrieval, Qodo-Embed-1-1.5B addresses a critical challenge in AI-assisted development: context awareness in large-scale software systems.
Why code embedding models matter for enterprise AI
AI-powered coding solutions have traditionally focused on code generation, with large language models (LLMs) gaining attention for their ability to write new code.
However, as Itamar Friedman, CEO and cofounder of Qodo, explained in a video call interview earlier this week: “Enterprise software can have tens of millions, if not hundreds of millions, of lines of code. Code generation alone isn’t enough; you need to ensure the code is high-quality, works correctly and integrates with the rest of the system.”
Code embedding models play a crucial role in AI-assisted development by allowing systems to search and retrieve relevant code snippets efficiently. This is particularly important for large organizations where software projects span millions of lines of code across multiple teams, repositories and programming languages.
“Context is king for anything right now related to building software with models,” Friedman said. “Specifically, for fetching the right context from a really large codebase, you have to go through some search mechanism.”
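To make the idea of a "search mechanism" concrete, here is a minimal sketch of embedding-based retrieval: each snippet is mapped to a vector, and the query is matched against the snippets by cosine similarity. The `toy_embed` function, the `search` helper and the three-snippet corpus are all hypothetical stand-ins invented for illustration. A real system would call a trained code embedding model instead of counting tokens, and nothing here reflects how Qodo's model works internally.

```python
import math
import re
from collections import Counter


def toy_embed(code: str) -> Counter:
    """Stand-in embedding: a bag of lowercase alphabetic tokens.

    A real code embedding model returns a dense vector; this sparse
    token count is only a placeholder so the example runs offline.
    """
    return Counter(re.findall(r"[a-z]+", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, corpus: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the top_k snippets most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(
        corpus,
        key=lambda name: cosine(q, toy_embed(corpus[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical three-snippet "codebase" used only for this example.
corpus = {
    "parse_config": "def parse_config(path): return json.load(open(path))",
    "send_email": "def send_email(to, body): smtp.sendmail(sender, to, body)",
    "load_user": "def load_user(user_id): return db.query(User).get(user_id)",
}

print(search("load json config from path", corpus, top_k=1))
```

The token-count vectors only capture shared vocabulary. A learned model aims to place snippets near one another when they do similar things, even if they share few tokens.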
Qodo-Embed-1-1.5B offers performance and efficiency
Qodo-Embed-1-1.5B stands out for its balance of efficiency and accuracy. While many state-of-the-art models rely on billions of parameters (OpenAI’s text-embedding-3-large has 7 billion, for instance), Qodo’s model achieves superior results with just 1.5 billion parameters.
On the Code Information Retrieval Benchmark (CoIR), an industry-standard test for code retrieval across multiple languages and tasks, Qodo-Embed-1-1.5B scored 70.06, outperforming Salesforce’s SFR-Embedding-2_R (67.41) and OpenAI’s text-embedding-3-large (65.17).

This level of performance is critical for enterprises seeking cost-effective AI solutions. With the ability to run on low-cost GPUs, the model makes advanced code retrieval accessible to a wider range of development teams, reducing infrastructure costs while improving software quality and productivity.
Addressing the complexity, nuance and specificity of different code snippets
One of the biggest challenges in AI-powered software development is that similar-looking code can have vastly different functions. Friedman illustrates this with a simple but impactful example:
“One of the biggest challenges in embedding code is that two nearly identical functions, like ‘withdraw’ and ‘deposit,’ may differ only by a plus or minus sign. They need to be close in vector space but also clearly distinct.”
A key issue in embedding models is ensuring that functionally distinct code is not incorrectly grouped together, which could cause major software errors. “You need an embedding model that understands code well enough to fetch the right context without bringing in similar but incorrect functions, which could cause serious issues.”
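The withdraw/deposit problem can be shown in a few lines. In this sketch the two function bodies are hypothetical, written only to mirror Friedman's example, and `difflib.SequenceMatcher` is used as a crude proxy for "looks similar". It is not an embedding model, but it shows how surface similarity can be high while behavior is opposite.

```python
from difflib import SequenceMatcher

# Two hypothetical functions that differ only in name and one operator.
DEPOSIT_SRC = '''
def deposit(balance, amount):
    return balance + amount
'''

WITHDRAW_SRC = '''
def withdraw(balance, amount):
    return balance - amount
'''

# Character-level similarity of the two source strings (0.0 to 1.0).
surface_similarity = SequenceMatcher(None, DEPOSIT_SRC, WITHDRAW_SRC).ratio()

# Execute both definitions so their behavior can be compared directly.
namespace = {}
exec(DEPOSIT_SRC, namespace)
exec(WITHDRAW_SRC, namespace)

print(f"surface similarity: {surface_similarity:.2f}")
print("deposit(100, 30)  =", namespace["deposit"](100, 30))   # 130
print("withdraw(100, 30) =", namespace["withdraw"](100, 30))  # 70
```

A retrieval system that ranks purely on surface similarity would treat these two snippets as near-duplicates. That is the failure mode Friedman describes.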
To solve this, Qodo developed a unique training approach, combining high-quality synthetic data with real-world code samples. The model was trained to recognize nuanced differences in functionally similar code, ensuring that when a developer searches for relevant code, the system retrieves the right results, not just similar-looking ones.
Friedman notes that this training process was refined in collaboration with Nvidia and AWS, both of which are writing technical blogs about Qodo’s methodology. “We collected a unique dataset that simulates the delicate properties of software development and fine-tuned a model to recognize those nuances. That’s why our model outperforms generic embedding models for code.”
Multi-programming language support and plans for future expansion
The Qodo-Embed-1-1.5B model has been optimized for the 10 most commonly used programming languages, including Python, JavaScript and Java, with additional support for a long tail of other languages and frameworks.
Future iterations of the model will expand on this foundation, offering deeper integration with enterprise development tools and additional language support.
“Many embedding models struggle to differentiate between programming languages, sometimes mixing up snippets from different languages,” Friedman said. “We’ve specifically trained our model to prevent that, focusing on the top 10 languages used in enterprise development.”
Enterprise deployment options and availability
Qodo is making its new model widely accessible through multiple channels.
The 1.5B-parameter version is available on Hugging Face under the OpenRAIL++-M license, allowing developers to integrate it into their workflows freely. Enterprises needing additional capabilities can access larger versions under commercial licensing.
For companies seeking a fully managed solution, Qodo offers an enterprise-grade platform that automates embedding updates as codebases evolve. This addresses a key challenge in AI-driven development: ensuring that search and retrieval models remain accurate as code changes over time.
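The article does not describe how Qodo's platform keeps embeddings current, so the following is only a generic sketch of one common approach: fingerprint each file's contents and re-embed only the files whose fingerprint changed. The `refresh_index` helper, the index layout and the `fake_embed` placeholder are all assumptions made for illustration.

```python
import hashlib


def content_hash(source: str) -> str:
    """Fingerprint a file's contents so changes can be detected cheaply."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def fake_embed(source: str) -> list[float]:
    """Placeholder for a real embedding call; returns a tiny dummy vector."""
    return [float(len(source)), float(source.count("\n"))]


def refresh_index(index: dict, files: dict[str, str]) -> list[str]:
    """Re-embed only new or changed files and drop deleted ones.

    `index` maps path -> {"hash": str, "vector": list[float]} and is
    updated in place. Returns the sorted paths that were (re-)embedded.
    """
    updated = []
    for path, source in files.items():
        digest = content_hash(source)
        entry = index.get(path)
        if entry is None or entry["hash"] != digest:
            index[path] = {"hash": digest, "vector": fake_embed(source)}
            updated.append(path)
    # Remove entries for files that no longer exist in the codebase.
    for path in set(index) - set(files):
        del index[path]
    return sorted(updated)


index = {}
v1 = {"a.py": "x = 1\n", "b.py": "y = 2\n"}
print(refresh_index(index, v1))  # ['a.py', 'b.py']: first pass embeds everything

v2 = {"a.py": "x = 1\n", "b.py": "y = 3\n", "c.py": "z = 0\n"}
print(refresh_index(index, v2))  # ['b.py', 'c.py']: only changed and new files
```

Hashing whole files keeps the example short. A production system would more likely track changes per function or per chunk, so that a one-line edit does not re-embed an entire large file.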
Friedman sees this as a natural step in Qodo’s mission. “We’re releasing Qodo Embed One as the first step. Our goal is to continually improve across three dimensions: accuracy, support for more languages, and better handling of specific frameworks and libraries.”
Beyond Hugging Face, the model will also be available through Nvidia’s NIM platform and AWS SageMaker JumpStart, making it even easier for enterprises to deploy and integrate it into their existing development environments.
The future of AI in enterprise software dev
AI-powered coding tools are rapidly evolving, but the focus is shifting beyond code generation toward code understanding, retrieval and quality assurance. As enterprises move to integrate AI deeper into their software engineering processes, tools like Qodo-Embed-1-1.5B will play a crucial role in making AI systems more reliable, efficient and cost-effective.
“If you’re a developer in a Fortune 15,000 company, you don’t just use Copilot or Cursor. You have workflows and internal projects that require deep understanding of large codebases. That’s where a high-quality code embedding model becomes essential,” Friedman said.
Qodo’s latest model is a step toward a future where AI isn’t just assisting developers with writing code; it’s helping them understand, manage and optimize it across complex, large-scale software ecosystems.
For enterprise teams looking to leverage AI for more intelligent code search, retrieval and quality control, Qodo’s new embedding model offers a compelling, high-performance alternative to larger, more resource-intensive solutions.