Oct 7, 2025

Model Behavior: Open Source AI


Open source AI in 2025 is seeing remarkable breakthroughs in model development and deployment. From DeepSeek’s impressive R1 reasoning models to growing innovation across monitoring, observability, and dev tools, the focus has shifted to building models that are compact, cost-efficient, and production-ready.


We mapped the landscape back in Aug’24, which feels like an age ago. This is what the market looked like then:

  • Open source models (Llama, Mistral, Qwen) were closing the gap on frontier closed models, in terms of MMLU and Elo benchmarks.

  • There was a surge in startup activity in 2024, with 150+ players innovating across model training, finetuning, and monitoring.

  • More than 40% of the startups funded in 2024 were Series A+, indicating a strong focus on growth-stage investments.

  • Model training and dev tools were the most heavily funded segments, accounting for 60% of the total funding in the sector.

  • Alibaba’s Qwen family had the highest Huggingface downloads in June’24, surpassing Llama and Mistral.

In 2025, the story has changed. Open source models have fallen further behind proprietary models on complex reasoning benchmarks like MMLU-Pro and Humanity’s Last Exam. However, lower cost and high customizability have become a strong value proposition, driving real adoption.

Competition is intensifying in open source AI infra, with massive fundraises, targeted M&A deals, and a race to win enterprise adoption.

Here’s a closer look at the latest developments shaping Open Source AI in 2025.


Open Source AI Ecosystem: A DeepSeek reckoning

Open source AI has moved beyond initial hype into a steadily growing developer ecosystem.
Dev traction on GitHub and Huggingface shows sustained momentum. Compact instruction-tuned models like Llama and Qwen are heavily downloaded developer darlings.


DeepSeek-AI’s R1 marks a breakthrough in scalable AI reasoning.
Trained with reinforcement learning instead of costly supervised data, R1 models solve novel, multi-step problems rather than just following instructions. This opens a more scalable path to “thinking” models, while putting pressure on incumbents to innovate faster in reasoning-centric AI.

Until now, such advances have come from closed labs like OpenAI and Anthropic. DeepSeek proves open players can push the frontier too.


The race for enterprise AI adoption is heating up, but the pie is split down the middle as orgs adopt a hybrid approach.

Over 50% of organizations report using open source AI tools across the stack, often alongside closed systems for security and scale. IBM's study of over 2,400 IT decision makers confirms this: 51% of businesses using open source tools saw positive ROI, compared to just 41% of those that weren't using them.


“Open washing” is a rising concern.

Models branded as “open” are often not fully open: they may withhold key elements such as pre-training data, finetuning steps, or model weights, or ship under restrictive licenses.

Market Map: New entrants are innovating across monitoring and observability use cases, as tech giants go on an M&A spree

Monitoring and observability are the hottest segments.
Dev tooling and infra startups have flooded the AI market, but now all eyes are on observability. As organizations move models into production, the need for continuous evaluation, debugging, and governance has driven the accelerated formation of AIOps and infrastructure startups.

Players like Arize AI and Fiddler AI, once early movers in ML monitoring, are now expanding into full-stack observability, shaping the category into mission-critical infra.

Enablement tools and platforms are driving “productization” of AI.
The focus has shifted from model development to model operationalization. Investors are pouring capital into frameworks, middleware, and dev tools such as Baseten and LangChain that turn open models into usable products.

Tech giants are doubling down on AI infra through strategic M&A.
Anthropic acqui-hired Humanloop for prompt tooling, Rubrik bought Predibase to bring enterprise AI into mainstream IT stacks, and Nvidia acquired Gretel to power training with synthetic data, signaling a race to own key layers of the AI stack.

CoreWeave bought OpenPipe (LLM optimization) and Soda acquired NannyML (model monitoring), underscoring rapid consolidation in AI monitoring as enterprise bets shift from model R&D to reliable infra.

Funding Landscape: Mega rounds and billion-dollar valuations dominate 2025

Model training and dev tools accounted for 60% of the total open source AI funding in 2024. The next wave of funding continues the trend, with VCs pouring dollars into finetuning, monitoring, and observability.


Open source AI funding accelerated in 2025, with mega rounds defining the dealmaking landscape.

The sector raised $3.5B+ across multiple funding rounds, with late-stage deals capturing the majority of investment activity.


Series A is active and diversified, with E2B (AI infra), LlamaIndex (data orchestration), and Ultralytics (computer vision) securing sizable rounds, indicating strong early momentum across infra, dev tooling, and model optimization.


Investor interest in AI observability surged at the seed stage, with newcomers like PromptLayer (prompt optimization and LLM monitoring) and Confident AI (LLM evaluation and monitoring platform) securing early funding.


Growth-stage rounds saw the biggest capital raises, led by Series C rounds (Fal.ai with $197M and Arize AI with $131M), showing investor appetite for scaling proven infra players.

Leading investors and incumbents, including Nvidia, Meta, Microsoft, Alphabet, Y Combinator, Accel, Coatue, Andreessen Horowitz, and Sequoia Capital, have been highly active in the space.


Nvidia continues its run of investments, including the recent acquisition of the synthetic data platform Gretel and major bets on startups like Mistral AI, MindsDB, and Cohere. The chip giant also backs closed source players like xAI, Scale AI, and Lambda, as well as the much-discussed $100B OpenAI partnership.

Model Performance: Open source models are progressing, but still trailing the frontier

The benchmark gap between open and closed models was narrowing rapidly in our Aug’24 coverage, with Meta Llama and Mistral scoring almost the same as GPT-4o on MMLU. However, the gap has widened again in 2025.


Closed source models continue to dominate the leaderboard, with Gemini-2.5-Pro, Grok-4, and GPT-5 holding the top spots.

Open source players like Z.ai’s GLM-4.5 and DeepSeek’s deepseek-v3.1 have made strong gains, but are yet to achieve parity.

New open source challengers, such as Tencent, BigScience, and other emerging labs, are also pushing frontier performance beyond Big Tech’s dominance.


Closed source shows a 2x accuracy advantage on Humanity's Last Exam (HLE), though calibration error remains high across the board.
HLE features 2,500 complex reasoning questions across math, logic, coding, and cross-domain tasks, evaluating the ability of LLMs to handle sophisticated, graduate-level reasoning. It reports Accuracy (% of correct answers) and Calibration Error (how far stated confidence diverges from actual correctness; lower is better).

  • Closed source models maintain the accuracy edge. Grok 4 and GPT-5 lead at 25% accuracy versus top open source models DeepSeek (14%) and Qwen (12%).

  • Calibration remains a weak spot for both camps. DeepSeek and Qwen variants report 73-78% calibration error, versus 50% for GPT-5 and 72% for Gemini 2.5 Pro.
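For readers who want to compute the two HLE-style metrics on their own eval runs, here is a minimal sketch of accuracy and a binned expected calibration error. This is an illustrative implementation, not HLE's official scoring code, and the sample confidences below are invented:

```python
# Sketch of HLE-style metrics: accuracy and binned expected calibration error.
# Inputs: per-question model confidences in [0, 1] and binary correctness flags.

def accuracy(correct):
    return sum(correct) / len(correct)

def expected_calibration_error(confidences, correct, n_bins=10):
    """Size-weighted average of |avg confidence - avg accuracy| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0 into last bin
        bins[idx].append((conf, ok))
    ece, n = 0.0, len(correct)
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        avg_acc = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - avg_acc)
    return ece

# A toy overconfident model: high stated confidence, mostly wrong answers.
confs = [0.9, 0.95, 0.85, 0.9]
right = [1, 0, 0, 0]
print(accuracy(right))                              # 0.25
print(expected_calibration_error(confs, right))     # 0.65
```

The toy model illustrates the pattern the benchmark punishes: when stated confidence far exceeds correctness, calibration error balloons even though the accuracy calculation is unaffected.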


Model & Tool Popularity: Developer activity shows heavy concentration at the top, with Qwen, DeepSeek, and others gaining ground


LLMs on the OpenRouter leaderboard are ranked by total token usage aggregated from developer activity, reflecting real-world popularity and utility.

Adoption is highly concentrated among closed source models.
The top 5 models account for approximately 12.9T of 15.6T total tokens (83% share). Grok Code Fast 1 alone commands 4.5T tokens, nearly 73% more than the second-place Claude Sonnet 4 (2.6T).

Open source models have achieved meaningful adoption, though substantially below proprietary models.
DeepSeek V3.1 (free) ranks #6 with 748B tokens, outperforming GPT-4.1 Mini (657B) and newer releases like Claude Sonnet 4.5 (298B). Combined DeepSeek variants total 1.35T tokens, representing roughly 9% of tracked usage.


Meta and Alibaba models lead Huggingface downloads in 2025.

Llama variants captured the top spots, a sign of Meta’s open source strategy driving massive developer uptake.

Alibaba’s Qwen family retained a strong footing against Meta’s dominance.
Multiple Qwen variants from 0.6B to 32B parameters ranked high on the chart, reflecting sustained dev adoption across its model sizes. Meanwhile, former leaders Falcon and Bloom fell sharply, signaling a shift towards efficiency and deployment flexibility over model size.


Anthropic, Supabase, and LangChain captured the most GitHub stars in Q2’25

GitHub stars act as a proxy for dev mindshare, showing where the attention is flowing in open source AI.

Core models and tools like huggingface/transformers (3.6K new stars) remain vital even as developers flock to newer frameworks and apps.
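Quarterly star-growth figures like the one above boil down to deltas between two star-count snapshots. A hedged sketch of that ranking, with placeholder repo names and invented numbers:

```python
# Rank repos by new GitHub stars over a quarter, given two snapshots of
# total star counts. All names and numbers are illustrative placeholders,
# not actual Q2'25 data.
start = {"org/repo-a": 120_000, "org/repo-b": 45_000, "org/repo-c": 9_000}
end   = {"org/repo-a": 123_600, "org/repo-b": 49_500, "org/repo-c": 10_200}

# New stars per repo over the period.
growth = {repo: end[repo] - start[repo] for repo in start}

# Sort by growth, descending, to get the quarterly "mindshare" leaderboard.
ranked = sorted(growth.items(), key=lambda kv: kv[1], reverse=True)
for repo, new_stars in ranked:
    print(f"{repo}: +{new_stars / 1000:.1f}K stars")
```

Note that raw growth favors already-popular repos; dividing the delta by the starting count instead would surface fast-rising newcomers.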

Key Takeaways
  1. The frontier is still proprietary, but open source offers a compelling value proposition
    Proprietary models like Gemini 2.5 Pro, Grok 4, and GPT-5 dominate leaderboards, especially on reasoning benchmarks like Humanity’s Last Exam. However, open source challengers like Llama, DeepSeek, and Qwen are available to use, download, and iterate on at ~1/20th the landed cost.

  2. Mega rounds dominated open source AI fundraising in 2025
    The sector raised $3.5B+, with mega deals like Supabase ($100M), Mistral (€1.7B), and Baseten ($150M). Series A deals span infra (E2B, LlamaIndex) and computer vision (Ultralytics), while seed funding flowed to AI observability (PromptLayer, Confident AI). Nvidia is the most prominent strategic investor, making acquisitions (Gretel) and backing the likes of Mistral, Cohere, and OpenAI.

  3. Monitoring and observability are the hottest segments
    As orgs move models into production and require ongoing evaluation, debugging, and governance, major infra players like Arize AI and Fiddler are evolving model monitoring into full-stack observability. In parallel, startups like Baseten and LangChain are leading the push to operationalize open models and make them production-ready.

  4. Enterprise is taking a hybrid approach to AI adoption
    Over 50% of orgs now use both open source and closed systems across their AI stack, balancing flexibility with security and scale. They're deploying frontier models for reasoning-intensive tasks and small language models for specialized and on-premise use cases. IBM's study of 2,400+ IT leaders confirms the value: 51% using open source report positive ROI versus 41% using closed systems only.

Which open source AI players are breaking out in 2025? We mapped 80+ high-potential open source AI startups gaining momentum across GitHub, Hugging Face, and other alternative datasets.

Check out their funding, founder, and growth details here.

Enjoyed the analysis? Get the next one delivered to your inbox.

Subscribe to
Synaptic Research

Get sharp insights on startups, founders, and signals.

Synaptic is a leading data intelligence platform for private investors.

Copyright © 2025 Synaptic. All rights reserved.
