Skip to main content

Magentic-UI + Fara-7B: A Local-First Computer-Use Agent You Can Actually Run

Nitin Kumar Singh
Author
Nitin Kumar Singh
I build enterprise AI solutions and cloud-native systems. I write about architecture patterns, AI agents, Azure, and modern development practices — with full source code.
Magentic-UI + Fara-7B: A Local-First Computer-Use Agent You Can Actually Run

If you’ve been waiting for a computer-use agent you can run on your own machine — sandboxed, human-in-the-loop, and open source — Microsoft’s Magentic-UI is the repo to clone this week. It now leads with MagenticLite (“Big tasks. Small models.”), a redesign from Microsoft Research AI Frontiers that pairs a small-model orchestrator with a dedicated browser-use model, and keeps you in control the whole way.

What it is
#

Magentic-UI is a research prototype for agentic web and computer tasks — form filling, price research, bookings, file organization — that you drive alongside the agent rather than handing it a blank cheque. MagenticLite splits the work across two models:

  • MagenticBrain — the orchestrator, designed to run well on small models.
  • Fara — a specialized browser-use model (Fara-7B) that actually drives the page.

Browser sessions run inside a lightweight VM sandbox called Quicksand, “so the agent can’t reach the rest of your machine without your say-so.” MIT-licensed, uv pip install "magentic_ui>=0.2.0", Python 3.12.

Why it’s worth exploring
#

  • Local-first, bring-your-own-endpoint. It talks to any OpenAI-compatible endpoint, so you can run the whole loop against a model on your own hardware — no frontier API required.
  • Human-in-the-loop by design. You co-plan the task, then co-task — stepping in to steer, approve, or take over at any point. It stops and checks in before critical actions instead of barreling ahead.
  • Real isolation. The Quicksand sandbox contains the browser so a misbehaving agent (or a prompt-injected page) can’t touch your filesystem unprompted.
  • Small-model economics. “Big tasks. Small models.” is the whole point — capable agentic behaviour without frontier-scale compute or cost.
  • A reference you can read. MIT license and a real UI make it a working blueprint if you’re building your own governed agent front-end.

Fara-7B, the interesting part
#

Fara-7B is Microsoft’s first agentic small model built for computer use:

  • 7B parameters, based on Qwen2.5-VL-7B, supervised fine-tuned on ~145K synthetic task trajectories generated through a Magentic-One multi-agent pipeline.
  • Works like a human — it reads the screen and acts with mouse and keyboard across multi-step tasks, rather than calling bespoke tool APIs.
  • Efficient. Microsoft reports it as state-of-the-art among small computer-use models and competitive with much larger ones; independent write-ups clock it at ~16 steps per task on average versus ~41 for UI-TARS-1.5-7B.
  • Runs where you want. MIT-licensed on Hugging Face and Microsoft Foundry, with quantised GGUF builds for LM Studio and Ollama, a fara-cli, and vLLM serving on a single 24 GB GPU.

The local-first setup, concretely
#

Point Magentic-UI’s model config at a local OpenAI-compatible endpoint — e.g. http://localhost:5000/v1 — served by vLLM, Ollama, or LM Studio running Fara-7B. The browser pane, the model, and your data all stay on your machine; the agent loop costs you GPU time, not per-token cloud spend. For anyone in a regulated shop who can’t send screens and page contents to a hosted API, that’s the difference between “interesting demo” and “something I can actually pilot.”

Bottom line
#

A local, sandboxed, human-in-the-loop computer-use agent with an MIT license is a rare combination. Clone magentic-ui, serve Fara-7B locally, and you have a governed browser agent running end-to-end on your own hardware — and a reference architecture worth studying before you build your own.