This feature allows engineering teams to reduce the risk of intellectual property infringement when using popular LLMs
TEL AVIV, Israel, Dec. 17, 2024 (GLOBE NEWSWIRE) -- Tabnine, the first-ever AI coding assistant, today introduced “Code Provenance and Attribution,” a feature that allows companies to benefit from using popular LLMs at scale for software development tasks, while minimizing the likelihood of restrictively licensed code being injected into their codebase.
Large language models (LLMs) from Anthropic, OpenAI, and others have been trained on large catalogs of content and code from publicly visible sources, most of which are not openly licensed. Because LLMs are also likely to generate content that matches previously published material, using vendor-provided models is likely to result in intellectual property or copyright infringement.
With Provenance and Attribution, Tabnine checks code generated using chat or AI agents against publicly visible code on GitHub, flags any matches, and provides the source repository and license type. This information makes it easier for engineering teams to review AI-generated code and decide whether the license for that code meets their specific standards and requirements.
With its new Provenance and Attribution solution, Tabnine will facilitate the work of development teams, as well as their legal and compliance teams, who wish to take advantage of a wide variety of powerful models.
“Models trained on larger databases that don’t come from permissively licensed open source code can deliver superior performance, but companies that use them run the risk of running into intellectual property and copyright violations,” said Peter Guagenti, President of Tabnine. “Our Code Provenance and Attribution solution addresses this issue, increasing productivity without sacrificing compliance. Experienced engineering teams want to know the source and license of generative AI results, and this feature allows them to do just that.”
As copyright law for the use of AI-generated content is still uncertain, Tabnine’s proactive stance aims to significantly reduce the risk of intellectual property infringement when companies use models such as Anthropic’s Claude, OpenAI’s GPT-4o, and Cohere’s Command R+ for software development.
Tabnine’s license-friendly model, Tabnine Protected 2, which is trained exclusively on permissively licensed code, remains a safe bet. Many companies believe that even using an LLM trained on unlicensed software can be risky, so Tabnine will continue to promote and develop this unique model. The new Provenance and Attribution solution provides additional support for legal and compliance teams who are willing to use a wider variety of models, as long as those models do not inject unlicensed code.
The Code Provenance and Attribution solution supports the full range of software development activities within Tabnine, including code generation, code remediation, test case generation, Jira issue implementation, and more. By reading code like a human, Tabnine flags not only results that are exact matches to open source code on GitHub, but also results that are functional or implementation matches.
Tabnine hopes to soon add features that allow users to identify specific repositories, such as those used by its competitors, and have Tabnine review the generated code against those repositories. Additionally, Tabnine plans to add a censorship option, allowing Tabnine administrators to remove matching code before it is presented to the developer.
Code Provenance and Attribution is available in private preview to any Tabnine enterprise customer and works with all available models, including Anthropic, OpenAI, Cohere, Llama, Mistral, and Tabnine models. To learn more about Code Provenance and Attribution, click here.
About Tabnine
Tabnine helps development teams of all sizes use AI to accelerate and improve the software development lifecycle. As the first AI coding assistant, Tabnine has been used by millions of developers worldwide to improve code quality and developer well-being through generative AI. Unlike other coding assistants, Tabnine is the AI that puts you in control. It’s highly customized for your engineering team, it’s private and secure (and runs smoothly in your controlled environments), it doesn’t store or train on your company’s code or user data, and it features models trained exclusively on open source code and permissive licenses to eliminate IP infringement risks. To learn more, visit tabnine.com or follow us on LinkedIn.