What open-source AI models should your enterprise use? Endor Labs analyzes them all




AI development is akin to the early wild west days of open source — models are being built on top of each other, cobbled together with different elements from different places. 

And, much like with open-source software, this presents problems when it comes to visibility and security: How can developers know that the foundational elements of pre-built models are trustworthy, secure and reliable?

To provide more of a nuts-and-bolts picture of AI models, software supply chain security company Endor Labs is today releasing Endor Labs Scores for AI Models. The new platform scores the more than 900,000 open-source AI models currently available on Hugging Face, one of the world’s most popular AI hubs. 

“Definitely we’re at the beginning, the early stages,” George Apostolopoulos, founding engineer at Endor Labs, told VentureBeat. “There’s a huge challenge when it comes to the black box of models; it’s risky to download binary code from the internet.”

Scoring on four critical factors

Endor Labs’ new platform uses 50 out-of-the-box metrics that score models on Hugging Face based on security, activity, quality and popularity. Developers don’t need intimate knowledge of specific models — they can prompt the platform with questions such as “What models can classify sentiments?” “What are Meta’s most popular models?” or “What is a popular voice model?”

Courtesy Endor Labs.

The platform then tells developers how popular and secure models are and how recently they were created and updated. 
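Endor Labs hasn’t published its exact formula, but the general idea of a composite score can be illustrated with a toy example. The sketch below is purely hypothetical: the metric names, weights and numbers are invented for illustration and are not Endor Labs’ methodology.

```python
# Hypothetical illustration of scoring a model across four categories.
# The metrics, weights and values below are invented for this example;
# they are NOT Endor Labs' actual scoring methodology.

CATEGORY_WEIGHTS = {"security": 0.4, "activity": 0.2, "quality": 0.2, "popularity": 0.2}

def score_model(metrics: dict) -> dict:
    """Average per-category signals (each normalized to 0-10), then weight them into one score."""
    category_scores = {}
    for category, weight in CATEGORY_WEIGHTS.items():
        values = metrics.get(category, [])
        category_scores[category] = sum(values) / len(values) if values else 0.0
    overall = sum(CATEGORY_WEIGHTS[c] * s for c, s in category_scores.items())
    return {"categories": category_scores, "overall": round(overall, 2)}

# Example: a model with strong popularity but weaker security signals.
example = {
    "security":   [3.0, 5.0],   # e.g., unsafe weight format, unverified publisher
    "activity":   [7.0],        # e.g., recently updated
    "quality":    [6.0, 8.0],   # e.g., documentation, evaluation results present
    "popularity": [9.0, 9.5],   # e.g., downloads, likes
}
print(score_model(example))
```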

Apostolopoulos called security in AI models “complex and interesting.” There are numerous vulnerabilities and risks, and models are susceptible to malicious code injection, typosquatting and compromised user credentials anywhere along the line. 

“It’s only a matter of time as these things become more widespread, we will see attackers all over the place,” said Apostolopoulos. “There are so many attack vectors, it’s difficult to gain confidence. It’s important to have visibility.”

Endor — which specializes in securing open-source dependencies — developed the four scoring categories based on Hugging Face data and literature on known attacks. The company has deployed LLMs to parse, organize and analyze that data, and its new platform automatically and continuously scans for model updates or alterations.

Apostolopoulos said additional factors will be taken into account as Endor collects more data. The company also plans to eventually expand beyond Hugging Face to other platforms, including commercial providers such as OpenAI.

“We will have a bigger story about the governance of AI, which is becoming important as more people start deploying it,” said Apostolopoulos. 

AI on a similar path to open-source development — but it’s much more complicated

There are many parallels between the development of AI and the development of open-source software (OSS), Apostolopoulos pointed out. Both have a multitude of options — as well as numerous risks. With OSS, software packages can introduce indirect dependencies that hide vulnerabilities. 

Similarly, the vast majority of models on Hugging Face are based on Llama or other open-source options. “These AI models are pretty much dependencies,” said Apostolopoulos.

AI models are typically built on, or are essentially extensions of, other models, with developers fine-tuning to their specific use cases. This creates what he described as a “complex dependency graph” that is difficult to both manage and secure.
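One partial window into that graph is the lineage that models themselves declare: many Hugging Face model cards list the repository a fine-tune was derived from. The sketch below walks that declared chain through Hugging Face’s public models API; it assumes the card metadata exposes a base_model field, which is optional and self-reported, and the repository name in the example is hypothetical.

```python
# Minimal sketch: follow a model's declared lineage on Hugging Face.
# Assumes the model card's YAML metadata exposes a "base_model" field;
# many fine-tunes declare one, but it is optional and self-reported.
import requests

HF_API = "https://huggingface.co/api/models/{}"

def base_model_of(repo_id: str) -> str | None:
    resp = requests.get(HF_API.format(repo_id), timeout=10)
    resp.raise_for_status()
    card = resp.json().get("cardData") or {}
    base = card.get("base_model")
    # base_model may be a single repo id or a list of them
    if isinstance(base, list):
        return base[0] if base else None
    return base

def lineage(repo_id: str, max_depth: int = 5) -> list[str]:
    """Follow declared base_model links until a model lists no parent."""
    chain = [repo_id]
    while len(chain) <= max_depth:
        parent = base_model_of(chain[-1])
        if not parent or parent in chain:
            break
        chain.append(parent)
    return chain

# Example (hypothetical repo id): print the chain from a fine-tune down toward its foundation model.
print(lineage("someuser/some-finetuned-model"))
```

Even this only captures what publishers choose to declare; the weights themselves carry no verifiable provenance, which is the gap Apostolopoulos describes.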

“At the bottom somewhere, five layers deep, there is this foundation model,” said Apostolopoulos. Getting clarity and transparency can be difficult, and the data that is available can be convoluted and “quite painful” for people to read and understand. It’s hard to determine what exactly is contained in model weights, and there are no cryptographic ways to ensure that a model is what it claims to be, is trustworthy as advertised and doesn’t produce toxic content.

“Basic testing is not something that can be done lightly or easily,” said Apostolopoulos. “The reality is there is very little and very fragmented information.”

While it’s convenient to download open source, it’s also “extremely dangerous,” as malicious actors can easily compromise it, he said. 

For instance, common storage formats for model weights can allow arbitrary code execution (that is, an attacker who gains access can run any commands or code they please). This is particularly dangerous for models saved in the older serialization formats used by PyTorch, TensorFlow and Keras, Apostolopoulos explained. Also, deploying models may require downloading other code that is malicious or vulnerable (or that attempts to import dependencies that are). And installation scripts or repositories, as well as links to them, can be malicious.
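PyTorch’s legacy checkpoint format is a case in point: it is pickle-based, and unpickling an untrusted file can execute arbitrary code the moment it is loaded. Below is a minimal defensive-loading sketch, assuming the publisher ships safetensors weights (a tensor-only format that cannot run code on load), with a restricted fallback for pickle checkpoints; the file name is hypothetical.

```python
# Minimal sketch of defensive weight loading. Prefers .safetensors files,
# which contain only tensors and have no code-execution path on load.
# For legacy pickle checkpoints, torch.load's weights_only=True (PyTorch >= 1.13)
# restricts what unpickling may construct, but avoiding untrusted pickle
# files altogether remains the safer default.
import torch
from safetensors.torch import load_file

def load_weights(path: str) -> dict:
    if path.endswith(".safetensors"):
        return load_file(path)  # tensor-only format, no arbitrary code execution
    # Legacy pickle-based checkpoint: refuse full unpickling of untrusted files.
    return torch.load(path, map_location="cpu", weights_only=True)

state_dict = load_weights("model.safetensors")  # hypothetical local file
```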

Beyond security, there are numerous licensing obstacles, too: As with open-source software, models are governed by licenses, but AI introduces new complications because models are trained on datasets that carry their own licenses. Today’s organizations must be aware of the intellectual property (IP) used by models as well as copyright terms, Apostolopoulos emphasized.

“One important aspect is how similar and different these LLMs are from traditional open source dependencies,” he said. While they both pull in outside sources, LLMs are more powerful, larger and made up of binary data. 

Open-source dependencies get “updates and updates and updates,” while AI models are “fairly static” — when they’re updated, “you most likely won’t touch them again,” said Apostolopoulos. 

“LLMs are just a bunch of numbers,” he said. “They’re much more complex to evaluate.” 


