Getting Started with LocalDev Studio

Setup Guide · v0.7.1 · All Platforms

What is LocalDev Studio?

LocalDev Studio is an AI-powered coding assistant that runs entirely on your machine. No cloud APIs, no subscriptions, no data leaves your laptop. It's a single binary that connects to Ollama for local AI inference and integrates with VS Code.

100% Local

Your code never leaves your machine. No API keys, no internet required for operation. Fully air-gap compatible.

Single Binary

One Go executable. No Python, no Node.js, no Docker. Download and run — works on Windows, macOS, and Linux.

VS Code Extension

Chat sidebar, inline completions (ghost text), code actions — Explain, Fix, Verify, Generate — all powered by local models.

Architecture

[Architecture diagram: VS Code extension and terminal CLI → localhost HTTP → LocalDev daemon (port 9741; router, context, MCP) → Ollama inference engine running a local Qwen 2.5 model, plus MCP servers for filesystem and terminal. Caption: "Your Machine — Nothing Leaves"]

Everything runs locally. The daemon is the central hub — VS Code and the CLI both talk to it over localhost HTTP. The daemon routes requests to Ollama, which runs the model on your GPU or CPU.

Prerequisites

Required

Requirement   Details
RAM           8 GB minimum, 16 GB recommended
Disk          ~5 GB (binary + model)
OS            Windows 10+, macOS 12+, Linux

Optional

Component         Benefit
NVIDIA GPU 6GB+   10-20x faster inference
Apple Silicon     Excellent with unified memory
VS Code           For IDE integration

No GPU? No problem. LocalDev works on CPU — responses take 30-90 seconds instead of 2-5 seconds, but the quality is identical. The VS Code extension shows an elapsed timer so you always know it's working.

Installation — One Command

The setup wizard handles everything: detecting your hardware, installing Ollama, choosing the right model, creating config, and wiring up VS Code.

1. Detect System
2. Add to PATH
3. Check Ollama
4. Select Model
5. Pull Model
6. Create Config
7. Test Inference
8. VS Code
Windows
1. Download the binary

Download localdev.exe from the releases page and save it anywhere (e.g. C:\tools\).

2. Install Ollama

If you don't already have Ollama, install it first:

PS> winget install Ollama.Ollama

Or download from https://ollama.com/download/windows

3. Run the setup wizard

Open PowerShell, navigate to where you saved the binary, and run:

PS> .\localdev.exe setup

The wizard detects your GPU, selects the right model, pulls it, creates config, adds itself to PATH, and installs the VS Code extension. Answer 2-3 questions and you're done.

4. Open a new terminal

After setup adds localdev to your PATH, open a new PowerShell window for the change to take effect.

Windows Defender. On first run, SmartScreen may show "Windows protected your PC." Click More info → Run anyway. This is normal for unsigned binaries.

macOS

1. Download and prepare

$ curl -L -o localdev https://github.com/neuralweaves/localdev/releases/latest/download/localdev-darwin-arm64
$ chmod +x localdev

Use localdev-darwin-amd64 for Intel Macs.

2. Install Ollama

$ brew install ollama

Or download from https://ollama.com/download/mac

3. Run the setup wizard

$ ./localdev setup

The wizard detects your Apple Silicon, selects an optimized model, and configures everything. It will offer to add localdev to your .zshrc.

Apple Silicon performance. M1/M2/M3 Macs run 7B models at 15-30 tokens/second using unified memory. The experience is very responsive.

Linux

1. Download and prepare

$ curl -L -o localdev https://github.com/neuralweaves/localdev/releases/latest/download/localdev-linux-amd64
$ chmod +x localdev

2. Install Ollama

$ curl -fsSL https://ollama.com/install.sh | sh

3. Run the setup wizard

$ ./localdev setup

The wizard detects NVIDIA (nvidia-smi) or AMD (ROCm) GPUs, selects the right model, and offers to add to .bashrc or symlink to /usr/local/bin.

NVIDIA drivers. Ensure you have CUDA-compatible drivers installed. Run nvidia-smi to verify. AMD ROCm is also supported.

Your First 5 Minutes

After setup completes, you have two ways to use LocalDev:

Option A: Terminal Chat

# Start the daemon (keep this terminal open)
$ localdev serve
✓ Daemon listening on 127.0.0.1:9741
✓ Ollama connected (1 models available)

# In another terminal — ask a question
$ localdev chat "write a hello world HTTP server in Go"

# Or start an interactive chat session
$ localdev chat

Option B: VS Code

1. Start the daemon: localdev serve

2. Open VS Code. Look for $(sparkle) LocalDev in the bottom-right status bar — it should show connected.

3. Open the LocalDev Studio panel in the right sidebar. Type a message and hit Send.

4. Try code actions: select some code, right-click, and choose LocalDev: Explain or LocalDev: Fix.

CPU inference note. If you're on CPU, the first request takes 30-90 seconds (model loading into RAM). The chat panel shows an elapsed timer — "Generating... 12s... 45s" — so you always know it's working. Subsequent requests are faster.

Model Recommendations

Hardware                    Recommended Model                     Speed    Quality
NVIDIA GPU 20GB+            qwen2.5-coder:32b-instruct-q4_K_M     Fast     Best
NVIDIA GPU 6-20GB           qwen2.5-coder:7b-instruct-q4_K_M      Fast     Good
Apple Silicon (16GB+)       qwen2.5-coder:7b-instruct-q4_K_M      Fast     Good
Apple Silicon (8GB)         qwen2.5-coder:3b-instruct-q4_K_M      Fast     Decent
NVIDIA GPU 2-6GB            qwen2.5-coder:1.5b-instruct-q4_K_M    Fast     Decent
CPU only (8GB+ RAM)         qwen2.5-coder:7b-instruct-q4_K_M      ~3 t/s   Good
CPU only (speed priority)   qwen2.5-coder:1.5b-instruct-q4_K_M    ~8 t/s   Decent

The setup wizard auto-selects the right model for your hardware. You can override with localdev setup --model-size=small|medium|large.
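
The wizard's selection logic is not published, but the table above implies a straightforward mapping from detected hardware to a model tag. A hypothetical sketch of that mapping in Python (the function name and hardware labels are made up for illustration; only the Ollama tags come from the table):

```python
# Illustrative only: a mapping from detected hardware to the recommended
# Ollama tags in the table above. The wizard's real detection logic is
# not published.

def pick_model(hardware: str, memory_gb: float) -> str:
    """Return a recommended Ollama tag for the given hardware.
    `memory_gb` is VRAM for GPUs, system RAM otherwise."""
    if hardware == "nvidia":
        if memory_gb >= 20:
            return "qwen2.5-coder:32b-instruct-q4_K_M"
        if memory_gb >= 6:
            return "qwen2.5-coder:7b-instruct-q4_K_M"
        return "qwen2.5-coder:1.5b-instruct-q4_K_M"
    if hardware == "apple":
        if memory_gb >= 16:
            return "qwen2.5-coder:7b-instruct-q4_K_M"
        return "qwen2.5-coder:3b-instruct-q4_K_M"
    # CPU-only: the "balanced" 7B tier; pass --model-size=small for 1.5B
    return "qwen2.5-coder:7b-instruct-q4_K_M"

print(pick_model("nvidia", 24))  # large GPU gets the 32B model
print(pick_model("cpu", 8))      # CPU defaults to the 7B model
```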

VS Code Features

Chat Sidebar

Ask questions about your code, request explanations, generate functions. Responses render with syntax-highlighted code blocks that have Copy and Insert buttons.

Inline Completions

Ghost text suggestions as you type, powered by fill-in-the-middle inference. Accepts with Tab. Debounced to avoid hammering during fast typing.
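
The debounce idea: each keystroke cancels the pending request and restarts a timer, so inference only fires after a quiet period (300 ms by default, see localdev.completion.debounceMs). The extension itself is TypeScript and not shown here; this is a minimal stand-alone sketch of the technique in Python:

```python
import threading

class Debouncer:
    """Run `fn` only after `delay_s` seconds pass with no new trigger() call."""
    def __init__(self, delay_s: float, fn):
        self.delay_s = delay_s
        self.fn = fn
        self._timer = None

    def trigger(self, *args):
        # A new keystroke cancels the pending call and restarts the countdown.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.delay_s, self.fn, args)
        self._timer.start()

# Usage: simulate rapid typing; only the final prefix reaches the "model".
results = []
d = Debouncer(0.05, lambda text: results.append(text))
for prefix in ["de", "def ", "def ma"]:
    d.trigger(prefix)
d._timer.join()   # wait for the surviving timer to fire
print(results)    # only the last prefix triggered a completion request
```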

Code Actions

Select code → right-click: Explain (what does this do?), Fix (fix bugs/errors), Verify (cross-model code review), Generate Tests.

Status Bar

$(sparkle) LocalDev = connected. $(circle-slash) LocalDev = daemon not running. Click to see available models. Auto-reconnects every 15 seconds.
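
The auto-reconnect behavior amounts to polling the daemon on a fixed interval and flipping the icon once a probe succeeds. A stand-alone sketch (the extension's real implementation is not shown; `probe` stands in for whatever health check it performs over HTTP):

```python
import time

def watch_daemon(probe, interval_s: float = 15, max_checks: int = 4):
    """Poll `probe()` every `interval_s` seconds; return the attempt number
    on which the daemon became reachable, or None if it never did.
    Illustrative stand-in for the extension's 15-second reconnect loop."""
    for attempt in range(1, max_checks + 1):
        if probe():
            return attempt
        time.sleep(interval_s)
    return None

# Simulate a daemon that comes up on the third poll:
state = {"n": 0}
def fake_probe():
    state["n"] += 1
    return state["n"] >= 3

print(watch_daemon(fake_probe, interval_s=0.01))  # -> 3
```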

Configuration

The setup wizard creates .localdev.yml in your project directory. Key settings you might want to tune:

# .localdev.yml — generated by localdev setup
daemon:
  host: 127.0.0.1
  port: 9741
ollama:
  host: http://127.0.0.1:11434
  timeout: 300             # seconds — increase for CPU
models:
  balanced:
    name: qwen2.5-coder
    ollama_tag: qwen2.5-coder:7b-instruct-q4_K_M
    tier: balanced
    tasks: [codegen, explain, testgen, review, docgen]
context:
  max_tokens: 2048         # context window for code
  include_git: true        # include git diff in context
  include_deps: true       # include imported files
  include_errors: true     # include compiler errors
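
On CPU-only machines, two of these settings are worth adjusting. An illustrative fragment (the keys match the generated config above; the values are suggestions, not defaults):

```yaml
# CPU-only tuning — illustrative values, not wizard defaults:
ollama:
  timeout: 600        # give slow CPU inference more headroom
context:
  max_tokens: 1024    # a smaller context speeds up prompt processing on CPU
```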

VS Code Settings

Setting                            Default    Description
localdev.daemon.host               127.0.0.1  Daemon hostname
localdev.daemon.port               9741       Daemon port
localdev.daemon.chatTimeoutMs      300000     Chat timeout (5 min) — increase for CPU
localdev.daemon.completeTimeoutMs  60000      Inline completion timeout (1 min)
localdev.completion.enabled        true       Enable/disable ghost text
localdev.completion.debounceMs     300        Delay before triggering completion
localdev.codeLens.enabled          true       Show CodeLens actions above functions

Troubleshooting

Daemon won't start

Check if Ollama is running: visit http://localhost:11434 in your browser. If it returns "Ollama is running", the issue is with the daemon. Check the output of localdev serve for error messages.
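
To check both services without a browser, a small stdlib-only Python sketch works too (the port numbers are the defaults from this guide):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 11434 = Ollama, 9741 = LocalDev daemon (defaults from this guide)
for name, port in [("ollama", 11434), ("localdev daemon", 9741)]:
    status = "up" if port_open("127.0.0.1", port) else "NOT running"
    print(f"{name:16s} port {port}: {status}")
```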

VS Code shows $(circle-slash) disconnected

Make sure the daemon is running (localdev serve) in a separate terminal. The extension auto-reconnects every 15 seconds. Check Output → LocalDev Studio for connection logs.

Chat responses are empty

Open VS Code Developer Tools (Ctrl+Shift+I) and check the Console for JavaScript errors. If you see errors, reinstall the extension: code --install-extension localdev-studio-0.7.1.vsix

Very slow responses on CPU

This is expected. CPU inference on a 7B model runs at ~3 tokens/second. Options: switch to the 1.5B model (localdev setup --model-size=small) for ~8 t/s, or use a machine with a dedicated GPU.
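
The trade-off is easy to estimate: generation time is roughly response length divided by throughput. A quick sketch using the speeds quoted above (the 300-token response length is illustrative):

```python
def eta_seconds(response_tokens: int, tokens_per_second: float) -> float:
    """Rough generation time, ignoring prompt processing and model load."""
    return response_tokens / tokens_per_second

# Compare the two CPU tiers at the throughputs listed in this guide:
for model, tps in [("7B on CPU", 3), ("1.5B on CPU", 8)]:
    print(f"{model}: ~{eta_seconds(300, tps):.0f}s for a 300-token answer")
```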

"localdev" is not recognized

The binary isn't on your PATH. Either re-run localdev setup from the binary's directory, or add its location manually. On Windows: setx PATH "%PATH%;C:\path\to\folder" then open a new terminal.

CLI Reference

Command                  Description
localdev setup           One-time setup wizard (detect hardware, install model, configure)
localdev serve           Start the daemon (HTTP API + Ollama bridge)
localdev chat            Interactive terminal chat session
localdev chat "prompt"   Single-shot chat (answer and exit)
localdev init            Create .localdev.yml in current directory
localdev models          List configured and available models
localdev bench           Benchmark model inference speed

Setup flags. --skip-vscode skips VS Code extension install. --model-size=small|medium|large forces a specific model tier. --force re-runs setup even if already configured.

Providing Feedback

This is a beta release. We'd love to hear what works, what breaks, and what's missing. File issues at the GitHub repository or reach out directly.

v0.7.1 · 8-step setup wizard · 0 bytes sent to the cloud