Holo3 pushes computer-use AI toward the open frontier
H Company’s Holo3 pairs a benchmark-leading gated flagship with an Apache 2.0 smaller model, giving developers a real open-weight foothold in computer-use AI.

Holo3 matters less because H Company posted a big score and more because the open side of computer-use AI just got a lot less theoretical.
H Company launched Holo3 with the kind of headline that makes this beat fun and slightly exhausting: 78.85% on OSWorld-Verified, a new claimed high score on a serious computer-use benchmark. In AI, that is usually when a chart appears, everyone squints, and the press release starts doing chest exercises.
The interesting part is not the chart. H Company launched a family of computer-use models with a split personality. The flagship Holo3-122B-A10B is the score-chasing model from the launch page: paid-tier only and research-only for licensing. The smaller Holo3-35B-A3B is the one on Hugging Face under Apache 2.0, with free-tier API access and a stated self-hosting path. Open builders are not getting the absolute top model, but they are getting something more useful than a demo: an open-weight foothold unusually close to the frontier.
"Holo3" sounds like one neat product. It is not. It is a family photo where one sibling is allowed to leave the house and the other has a velvet rope around it.
That matters operationally. If you are deciding whether Holo3 is a benchmark headline or a thing you can actually deploy, the answer depends entirely on which sibling in that family portrait you mean.

What H Company actually released with Holo3
According to H Company’s launch page and model API docs, the Holo3 line currently breaks into two public tiers.
The first is Holo3-35B-A3B, a sparse mixture-of-experts (MoE) model with 35 billion total parameters and 3 billion active per token. The model card says it is based on Qwen3.5-35B-A3B, tuned for GUI agents, released under Apache 2.0, and available for self-hosting; H Company lists it at $0.25 per million input tokens and $1.80 per million output tokens, with a 10-RPM free tier.
The second is Holo3-122B-A10B, the larger flagship with 122 billion total parameters and 10 billion active parameters. That is the model H Company says reached 78.85% on OSWorld-Verified. It is paid-API only, priced at $0.40 per million input tokens and $3.00 per million output tokens, and marked research-only for licensing.
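Taken at face value, those listed rates make the tier math easy to sanity-check. A quick sketch in Python (prices copied from the rates above; the token counts are invented for illustration, and real GUI-agent traffic tends to be input-heavy because of screenshots and action history):

```python
# Per-request cost comparison for the two Holo3 API tiers, using the
# listed prices in USD per million tokens. Token counts below are
# invented for illustration; real agent requests skew input-heavy.
PRICES = {
    "holo3-35b-a3b": {"input": 0.25, "output": 1.80},
    "holo3-122b-a10b": {"input": 0.40, "output": 3.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A screenshot-heavy agent step: 20k input tokens, 500 output tokens.
small = request_cost("holo3-35b-a3b", 20_000, 500)    # ≈ $0.0059
large = request_cost("holo3-122b-a10b", 20_000, 500)  # ≈ $0.0095
```

At these rates the flagship costs roughly 60 percent more per step on this made-up workload, the kind of delta that compounds fast across multi-step agent runs.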
That split is the launch. This is not a clean "we opened our frontier model" story. It is a "we opened the smaller model and kept the top shelf gated" story. Different claim, still meaningful, and very much in line with the incentive math behind open-weight model inference economics.
What is open in Holo3, and what definitely is not
The open part is real. H Company published the 35B-A3B weights on Hugging Face under Apache 2.0. Its API FAQ also says the model can be self-hosted and loaded with Transformers. In a category where "open" often means "here is a gated demo and one benchmark table," that counts.
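For teams who want to try that self-hosting path, the load is the standard Transformers recipe. A minimal sketch, with the caveat that the repository id below is an assumption (check H Company's actual Hugging Face listing) and that 35 billion total parameters still need serious VRAM even with only 3 billion active per token:

```python
def holo3_load_kwargs(model_id: str = "Hcompany/Holo3-35B-A3B"):
    # The repo id is assumed for illustration; verify it on Hugging Face.
    # Kept as a separate function so the recipe can be inspected without
    # downloading 35B parameters of weights.
    return dict(
        pretrained_model_name_or_path=model_id,
        torch_dtype="auto",   # use the checkpoint's native dtype
        device_map="auto",    # shard across available GPUs
    )

def load_holo3():
    # Requires `pip install transformers accelerate` plus access to the weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    kwargs = holo3_load_kwargs()
    tokenizer = AutoTokenizer.from_pretrained(kwargs["pretrained_model_name_or_path"])
    model = AutoModelForCausalLM.from_pretrained(**kwargs)
    return tokenizer, model
```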
The benchmark-leading 122B-A10B model is not open, not commercially available under an open license, and not described as self-hostable. H Company’s training stack is closed too: the launch page leans on an "agentic learning flywheel," a proprietary "Synthetic Environment Factory," and an internal "H Corporate Benchmark" of 486 enterprise-style tasks. Useful internal assets, maybe. Community-verifiable building blocks, no.
So Holo3 is not a fully open lab notebook. It is a hybrid strategy: open enough to win developers, closed enough to keep the crown jewels inside. We saw a different flavor of that tradeoff in Ai2's MolmoWeb launch.
Why the OSWorld-Verified claim matters, and why it is not the whole story
This is the part where the benchmark deserves both respect and a chaperone.
OSWorld is one of the better public reference points in computer use because it evaluates agents across web tasks, desktop apps, file operations, and cross-app workflows in a real computer environment. The project describes 369 tasks and execution-based evaluation rather than pure vibes, and in mid-2025 the maintainers upgraded the benchmark into OSWorld-Verified, with fixes, refreshed results, and hosted verified trajectories.
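Execution-based is the load-bearing phrase there. Rather than grading the agent's transcript, an OSWorld-style harness runs the agent in an environment and then checks the resulting state with a per-task verifier. A toy sketch of that scoring loop (the task format and dict-backed fake filesystem are invented for illustration):

```python
# Toy illustration of execution-based evaluation: a task counts as passed
# only if a checker verifies the resulting environment state, not whether
# the agent's output "looks right". The task format here is invented.
def run_benchmark(agent, tasks):
    passed = 0
    for task in tasks:
        env = task["setup"]()                # fresh environment state per task
        agent(env, task["instruction"])      # agent mutates the environment
        if task["check"](env):               # verify final state, not transcript
            passed += 1
    return passed / len(tasks)

# Minimal example: one "rename a file" task against a fake filesystem.
tasks = [{
    "setup": lambda: {"files": {"report_draft.txt": "q3 numbers"}},
    "instruction": "rename report_draft.txt to report_final.txt",
    "check": lambda env: "report_final.txt" in env["files"],
}]

def toy_agent(env, instruction):
    # A hard-coded agent that happens to solve this one task.
    env["files"]["report_final.txt"] = env["files"].pop("report_draft.txt")

score = run_benchmark(toy_agent, tasks)  # → 1.0
```

Real OSWorld checkers inspect actual OS and application state, which is what makes execution-based scores harder to game than transcript grading.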
That makes the claim more interesting than a lab inventing its own obstacle course and pinning a medal on itself. The GitHub docs say verified leaderboard entries run under unified settings and are checked by the maintainers. That is adult supervision. This market could use more of it.
Still, H Company’s Holo3 scores should be read for what they are right now: claims from H Company’s launch materials and model card, pending outside validation. A strong benchmark run is not the same thing as watching a model survive messy production work, flaky page loads, permission prompts, or whatever cursed internal dashboard your finance team has been hauling around since 2017.

That is why this story is bigger than one percentage point. Benchmarks tell you about ceiling, grounding, and control quality. They do not tell you whether deployment is sane. For that you still need runtime boundaries, approval logic, and a security model like the one in AI agent sandbox beats raw filesystem access.
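In concrete terms, "approval logic" can be as simple as a policy gate between the model's proposed action and the machine. A minimal sketch (the action vocabulary and risk list are invented; real deployments want allowlists, audit logs, and much tighter defaults):

```python
# Minimal approval gate for agent-proposed actions. The action types
# and risk policy below are invented for illustration, not taken from
# H Company's documentation.
RISKY = {"delete_file", "run_shell", "submit_form"}

def gate(action: dict, approve) -> bool:
    """Allow safe actions automatically; route risky ones to an approver."""
    if action["type"] in RISKY:
        return approve(action)   # e.g. a human-in-the-loop prompt or queue
    return True                  # clicks, scrolls, reads pass through

def execute(actions, approve, run):
    """Run each proposed action only if the gate clears it."""
    executed = []
    for action in actions:
        if gate(action, approve):
            run(action)
            executed.append(action["type"])
    return executed
```

The point is structural: the model proposes, the gate disposes, and nothing risky executes without an explicit approval path.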
Why Holo3 changes the open-versus-closed computer-use race
The closed players still have the easier commercial pitch. Products like the ones behind Claude Code's remote computer-use push and the broader browser-agent race sell convenience first: managed runtime, polished user surface, and fewer things for the customer to assemble.
Holo3 matters because it gives the open side a stronger answer than "we have a framework and a dream." An Apache 2.0 model at this performance level lets smaller teams test GUI agents with fewer licensing headaches, fine-tune around their own workflows, and self-host the smaller model so screenshots, traces, and action history stay where they want them.
That does not mean open computer-use AI has won. The flagship score still belongs to the closed model, the best training assets are still private, and even the open 35B model still needs permissions, recovery, evaluation, and a plan for not turning the agent into an enthusiastic intern with root access.
Before launches like this, the frontier computer-use story belonged mostly to proprietary systems and tightly managed product surfaces. After Holo3, the open side has something closer to a contender. Not the whole crown. A hand on it. In this category, the gap between "open in theory" and "open enough to build with" is enormous. Holo3 narrows it.
Public source trail
These links anchor the package to the underlying reporting trail. They are not a substitute for judgment, but they do show where the reporting starts.
- Official launch post with the benchmark claim, model-family framing, training flywheel description, and positioning against proprietary models.
- Confirms the Apache 2.0 license, 35B total and 3B active architecture, Qwen3.5 base, and the 77.8 OSWorld-Verified claim for the open model.
- Provides the public pricing, free-tier access, self-hosting note for the 35B model, and the paid-tier plus research-only status of the 122B flagship.
- Defines the benchmark environment, task count, and the OSWorld-Verified update that tightened the benchmark surface.
- Documents the verified benchmark process, unified evaluation language, and public trajectory-hosting details used to explain why OSWorld-Verified is a meaningful reference point.
- Useful as a thin early pickup showing that public recap coverage is still sparse and mostly repeats the launch framing.

About the author
Lena Ortiz
Lena tracks the economics and mechanics behind AI systems, from serving architecture and open-weight deployment to developer tooling, platform shifts, product decisions, and the operational tradeoffs that shape what teams actually run. Her reporting is aimed at builders and operators deciding what to trust, adopt, and maintain.
- Apr 2, 2026
- Berlin
Archive signal
Reporting lens: Operating leverage beats ideological posturing. Signature: If the cost curve moves, the product strategy moves with it.
Article details
- Category
- Open Source AI
- Last updated
- April 2, 2026
- Public sources
- 6 linked source notes
Byline

Covers the economics, tooling, and operating realities that shape how AI gets built, shipped, and run.



