Private Beta · v0.7.1

Your Code.
Your Machine.
Your AI.

AI-powered coding assistant that runs 100% on your machine. No cloud APIs. No subscriptions. No data exfiltration. Just a single binary, local models, and your VS Code.

$ localdev setup
✓ Detected: Windows x86_64 · NVIDIA RTX 3060
✓ Ollama running · v0.5.4
✓ Model: qwen2.5-coder:7b (fits GPU)
✓ Config created · Daemon tested
✓ VS Code extension installed

✓ Setup complete — ready to code with AI
Every line of code you send to the cloud is a line of code someone else can read. We built an alternative.

0 bytes
Data Sent to Cloud
1 binary
No Docker · No Python
15 min
Zero to Working
$0 /mo
No Subscriptions
Comparison
How We Stack Up
LocalDev isn't competing on model intelligence — cloud models are smarter today. We're competing on everything else that matters.
Feature | LocalDev Studio | GitHub Copilot | Cursor | Sourcegraph Cody
Data Privacy | 100% local | Cloud | Cloud | Optional local
Works Offline | Full offline | No | No | Partial
Air-Gap Compatible | Yes | No | No | No
Cost | $0 forever | $19/mo | $20/mo | $9/mo
Model Choice | Any Ollama model | GPT-4o only | Limited set | Limited set
Chat Sidebar | Yes | Yes | Yes | Yes
Inline Completions | Yes (FIM) | Yes | Yes | Yes
Context Engine | Git + Deps + Errors | Open files only | Codebase indexing | Code graph
MCP Servers | Built-in (FS, Terminal) | No | No | No
Setup Time | 1 command | Extension + login | New IDE install | Extension + login
Vendor Lock-in | None | Microsoft | Cursor Inc | Sourcegraph
Downtime Risk | Zero — runs locally | Cloud outages | Cloud outages | Cloud outages
BCP Compatible | Survives vendor failure | Vendor-dependent | Vendor-dependent | Vendor-dependent
Fine-Tuning | QLoRA on your code | Not available | Not available | Not available
Scales to Cluster | Laptop to Mac Studio | Per-seat pricing | Per-seat pricing | Per-seat pricing
Data Sovereignty
Where Does Your Code Go?
LocalDev Studio: IDE → local daemon → local model. Everything stays on your machine.
GitHub Copilot: IDE → Microsoft → OpenAI servers → response. Your code is sent to a third party.
Cursor: IDE → Cursor proxy → cloud model → response. Your code is sent to a third party.
Features
Built for Real Development
Not a toy. Not a wrapper. A complete AI development infrastructure designed for professional workflows.

One-Command Setup

localdev setup detects your OS, GPU, and VRAM. Automatically selects the right model (1.5B→32B), installs Ollama, creates config, adds to PATH, installs the VS Code extension, and runs an inference test. 8 steps, fully automated, works on Windows, macOS, and Linux.

🧠

Smart Context Engine

BM25-ranked file search, git diff injection, dependency graph traversal, and compiler error forwarding. The model sees what matters — not your entire codebase. Context window budget is optimized per-task: code generation gets more code context, explanations get more documentation.
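As a minimal sketch of the BM25 idea (not LocalDev's actual implementation), here's a ranker that scores candidate files against a query; the sample files are invented:

```python
import math
from collections import Counter

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank documents (name -> text) against a query with BM25."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized.values()) / n  # average doc length
    df = Counter()  # document frequency per term
    for toks in tokenized.values():
        for term in set(toks):
            df[term] += 1
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)

files = {
    "auth.py": "def login(user, pw): verify password hash session",
    "db.py": "def connect(): open database connection pool",
    "readme.md": "project overview and setup instructions",
}
print(bm25_rank("password login session", files)[0])  # → auth.py
```

Only the top-ranked files make it into the prompt, which is how the model can "see what matters" without ingesting the whole repository.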

🔌

MCP Protocol (Built-in)

Model Context Protocol servers for filesystem access and terminal execution — built into the daemon. The AI can read files, list directories, and run commands through a standardized JSON-RPC 2.0 interface. Custom MCP servers plug in via config.
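MCP messages are framed as JSON-RPC 2.0. A minimal sketch of such a message; the `read_file` tool name and its argument shape are illustrative, not LocalDev's exact tool surface:

```python
import json

def mcp_request(req_id, method, params):
    """Build a JSON-RPC 2.0 request envelope, as MCP messages are framed."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params})

# Ask a filesystem MCP server to invoke a tool; the tool name and
# argument shape here are illustrative.
msg = mcp_request(1, "tools/call",
                  {"name": "read_file", "arguments": {"path": "src/main.py"}})
parsed = json.loads(msg)
print(parsed["method"])  # → tools/call
```

Because the framing is standard JSON-RPC, a custom MCP server only has to speak this envelope to plug into the daemon.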

🎯

Intelligent Model Routing

Different tasks route to different model configurations. Code generation uses higher temperature and more context tokens. Explanations use lower temperature for accuracy. You can assign different Ollama models to different task types — use a fast 3B for completions, a thorough 7B for reviews.
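A toy sketch of the routing idea, assuming a hypothetical task-to-settings table (the model names, temperatures, and context budgets are illustrative, not LocalDev's shipped defaults):

```python
# Hypothetical routing table; model names, temperatures, and context
# budgets are illustrative, not LocalDev's shipped defaults.
ROUTES = {
    "completion": {"model": "qwen2.5-coder:3b", "temperature": 0.2, "ctx": 2048},
    "generate":   {"model": "qwen2.5-coder:7b", "temperature": 0.7, "ctx": 8192},
    "explain":    {"model": "qwen2.5-coder:7b", "temperature": 0.1, "ctx": 4096},
    "review":     {"model": "qwen2.5-coder:7b", "temperature": 0.0, "ctx": 8192},
}

def route(task):
    """Return settings for a task type, falling back to 'generate'."""
    return ROUTES.get(task, ROUTES["generate"])

print(route("completion")["model"])  # → qwen2.5-coder:3b
```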

🖥️

VS Code Extension — Not Just a Chat Window

Chat sidebar with markdown rendering, syntax-highlighted code blocks (Copy + Insert buttons), elapsed timer for CPU inference, and auto-reconnect. Inline ghost-text completions with fill-in-the-middle inference. Code actions: Explain, Fix, Verify (cross-model review), Generate Tests. CodeLens integration shows actions above every function. Status bar shows connection state, model info, and generation speed.
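Fill-in-the-middle works by wrapping the text before and after the cursor in sentinel tokens. The sketch below uses Qwen 2.5 Coder-style sentinels; other model families use different token names:

```python
def fim_prompt(prefix, suffix):
    """Wrap code before/after the cursor in FIM sentinel tokens
    (Qwen 2.5 Coder style; other model families use different tokens)."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

before = "def add(a, b):\n    return "
after = "\n\nprint(add(2, 3))\n"
prompt = fim_prompt(before, after)
# The model generates only the missing middle (here, likely "a + b"),
# which the extension renders as ghost text at the cursor.
print(prompt)
```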

🔒

True Air-Gap Operation

After the initial model download, zero network traffic is required. The daemon, Ollama, and models all run on localhost. No telemetry, no usage tracking, no phone-home. Verified: block all outbound traffic at the firewall and LocalDev works identically.

📐

Cross-Model Verification

The "Verify" action sends your code to a second model configuration for independent review. Catches bugs, logic errors, and edge cases that the generating model missed. Like having a second pair of eyes — but both pairs are local.

Architecture
How It All Connects
Your Machine — Air-Gap Safe
· VS Code Extension v0.7.1: Chat · Completions · Actions
· Terminal: CLI · Interactive Chat
· LocalDev Daemon (HTTP, port 9741): Router · Context Engine · MCP Host · Diagnostics
· Ollama Inference Engine: GPU / CPU compute · Qwen 2.5 7B / 32B local weights
· MCP Servers: Filesystem · Terminal · Custom
· Your Code: Git · Deps · Errors
Hardware
Runs on What You Already Have
The setup wizard auto-detects your hardware and selects the optimal model. No manual configuration required.
🖥️

NVIDIA GPU (6GB+)

RTX 3060 and above. 7B model runs fully on GPU.

~20 tokens/sec
🍎

Apple Silicon

M1/M2/M3/M4. Unified memory makes 7B models fly.

~25 tokens/sec
🐧

Linux + CUDA/ROCm

NVIDIA or AMD GPUs. Works with any Ollama-supported hardware.

~20 tokens/sec
💻

CPU Only

Any modern x86_64 with 8GB+ RAM. Slower but identical quality.

~3 tokens/sec
Resilience & Scale
Zero Downtime. Infinite Uptime.
Cloud AI tools go down. Ours doesn't. When your AI runs on your machine, the only outage is a power outage.
LocalDev Studio: 100% available (runs on your machine, always up)
GitHub Copilot: ~97.5% available (cloud dependency, outage incidents)
Cursor: ~96.8% available (cloud dependency, outage incidents)
97.5% uptime = ~18 hours of downtime per month. When you're debugging production at 2 AM, that's when cloud goes down.
🛡️

Business Continuity Ready

Cloud AI is a single point of failure. Provider goes down? Your team stops. Provider gets acquired? Terms change. Provider gets sanctioned? Access revoked. LocalDev eliminates all three risks. Your AI toolchain survives vendor bankruptcy, geopolitical sanctions, pricing changes, and API deprecations. It's the only AI coding tool you can write into a business continuity plan.

⬆️

Zero Downtime, Zero Excuses

No API rate limits. No "service degraded" banners. No "usage cap reached, try again tomorrow." When the model runs on your hardware, 3 AM on a Sunday works exactly like 10 AM on a Tuesday. Your AI availability is bounded by your power supply, not someone else's infrastructure budget.

From a Single Laptop to a Mac Studio Cluster

Same binary. Same config. Scale by adding hardware, not subscriptions.

💻

Solo Developer

1 laptop
CPU or GPU
7B model

🖥️

Small Team

Shared workstation
RTX 4090 / A4000
7B–32B models

🍎

Mac Studio Cluster

M2/M4 Ultra nodes
192GB unified memory
70B+ models

🏢

Enterprise

On-prem GPU servers
Multi-model routing
Team-wide inference

Apple Mac Studio with M2 Ultra (192GB) runs 70B models at ~15 tokens/sec.
A cluster of 3 Mac Studios gives your team GPT-4-class inference with zero cloud dependency.
🔬

Fine-Tuning for Your Codebase

Generic models give generic suggestions. Fine-tune on your own codebase to get completions that know your naming conventions, your architecture patterns, your internal APIs. LocalDev supports QLoRA-based fine-tuning workflows — adapt a 7B model to your codebase with as little as a single consumer GPU and a few hours of training time. The fine-tuned model stays on your infrastructure, trained on your proprietary code, producing suggestions that feel like they came from a senior developer who's read every PR.

Your Codebase: Git history · PRs · Docs
→ Data Pipeline: Extract · Clean · Format
→ QLoRA Training: 4-bit · Single GPU · Hours
→ Your Model: Custom weights · Local
→ Better Code: Your patterns · Your APIs
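As a rough sketch of the "Extract · Clean · Format" stage (hypothetical, not LocalDev's actual pipeline), one common recipe turns each commit into an instruction/output pair:

```python
import json

def to_training_example(commit_msg, diff):
    """Turn one commit into an instruction-tuning pair: the commit message
    becomes the instruction, the diff the target output. A real pipeline
    would also dedupe, filter by size, and strip secrets."""
    return {"instruction": commit_msg.strip(), "output": diff.strip()}

example = to_training_example(
    "Add retry with exponential backoff to the HTTP client",
    """
+    for attempt in range(max_retries):
+        try:
+            return session.get(url, timeout=10)
+        except requests.ConnectionError:
+            time.sleep(2 ** attempt)
""",
)
print(json.dumps(example, indent=2))
```

Thousands of such pairs, extracted from your own history, are what teach the adapter your naming conventions and internal APIs.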
🌐

Open Source Model Improvements

We don't just consume open-source models — we improve them. Our fine-tuning techniques, evaluation benchmarks, and inference optimizations are contributed back to the community. Better base models mean better LocalDev for everyone. When Qwen, DeepSeek, or CodeLlama improve, you benefit automatically — just ollama pull the latest and your tooling gets smarter overnight.

🔄

Model Hot-Swap

New model released? Pull it and switch in 60 seconds. No waiting for vendor integration, no feature-flag rollouts, no "coming soon." The open-source model ecosystem moves fast — Ollama gives you access to every model the day it drops. Mistral, Qwen, DeepSeek, CodeLlama, StarCoder — swap freely, keep the one that works best for your stack.

Getting Started
One Command. That's It.
Terminal
$ localdev setup

╔══════════════════════════════════════╗
║ LocalDev Studio Setup ║
╚══════════════════════════════════════╝

[1/8] Detecting system
OS: Windows · Arch: x86_64 · GPU: NVIDIA RTX 3060 (12288 MB)

[2/8] Checking PATH
✓ Added to user PATH

[3/8] Checking Ollama
✓ Ollama v0.5.4 running

[4/8] Selecting model
Selected: qwen2.5-coder:7b-instruct-q4_K_M (fits GPU)

[5/8] Pulling model
✓ Already installed

[6/8] Creating configuration
✓ Created .localdev.yml

[7/8] Testing inference
✓ Model responded: "Hello!"

[8/8] VS Code extension
✓ Installed

✓ Setup complete!
Run: localdev serve → then open VS Code
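The `.localdev.yml` created in step 6 might look roughly like the sketch below. Every key is an illustrative assumption, not the actual schema; the daemon port and Ollama endpoint follow values mentioned elsewhere on this page:

```yaml
# Hypothetical .localdev.yml — key names are illustrative, not the real schema
daemon:
  port: 9741              # matches the daemon port shown in the architecture
model:
  default: qwen2.5-coder:7b-instruct-q4_K_M
ollama:
  host: 127.0.0.1:11434   # Ollama's default local endpoint
mcp:
  servers: [filesystem, terminal]
```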
Why Now
The Timing is Right

7B Models Are Good Enough

Qwen 2.5 Coder 7B scores 80%+ on HumanEval. For autocomplete, explain, and code review — local models handle 90% of daily developer needs. The gap with cloud models shrinks every quarter.

Regulation Is Coming

EU AI Act. SEC cybersecurity rules. NIST AI RMF. Organizations in defense, finance, healthcare, and government increasingly cannot send code to third-party AI providers. LocalDev is compliant by architecture.

Consumer GPUs Are Powerful

An RTX 3060 (12GB, ~$300 used) runs 7B models at 20 tokens/sec. Apple M-series unified memory handles 7B natively. The hardware barrier is gone — the software just wasn't ready. Until now.

MCP Changes Everything

Anthropic's Model Context Protocol creates a standard for tool integration. LocalDev implements MCP natively — your local models can access filesystem, terminal, and custom servers through the same protocol that powers cloud AI tools.

Target Users
Built For Teams Who Can't Compromise

Defense & Government

Air-gapped networks, ITAR compliance, classified codebases. LocalDev operates with zero network traffic after setup. NIST SP 800-171 compatible by design.

Financial Services

SOC 2 Type II, SEC Rule 10b-5, FINRA data protection. Trading systems, risk models, and proprietary algorithms never touch third-party servers.

Healthcare & Biotech

HIPAA, FDA 21 CFR Part 11. Code that processes PHI or operates medical devices stays local. Full audit trail, zero data exfiltration risk.

Solo Developers

No $20/month subscription, no usage limits, no API key management. Download, setup, code. Works the same whether you're online or on an airplane.

Private Beta

Ready to Own Your AI?

Tell us about your setup. We'll send you the binary, a setup guide, and direct support during the beta.

We'll respond within 24 hours · No spam, ever · Your data stays in our inbox


v0.7.1 · Limited beta slots