Getting Started with LocalDev Studio

Setup Guide · v0.7.1 · All Platforms

What is LocalDev Studio?

LocalDev Studio is an AI-powered coding assistant that runs entirely on your machine. No cloud APIs, no subscriptions, no data leaves your laptop. It's a single binary that connects to Ollama for local AI inference and integrates with VS Code.

100% Local

Your code never leaves your machine. No API keys, no internet required for operation. Fully air-gap compatible.

Single Binary

One Go executable. No Python, no Node.js, no Docker. Download and run — works on Windows, macOS, and Linux.

VS Code Extension

Chat sidebar, inline completions (ghost text), code actions — Explain, Fix, Verify, Generate — all powered by local models.

Architecture

[Architecture diagram: VS Code extension and terminal CLI → localhost HTTP → LocalDev daemon (port 9741; router, context, MCP) → Ollama inference engine running a local Qwen 2.5 model, plus MCP servers for filesystem and terminal. Caption: "Your Machine — Nothing Leaves"]

Everything runs locally. The daemon is the central hub — VS Code and the CLI both talk to it over localhost HTTP. The daemon routes requests to Ollama, which runs the model on your GPU or CPU.

Prerequisites

Required

Requirement   Details
RAM           8 GB minimum, 16 GB recommended
Disk          ~5 GB (binary + model)
OS            Windows 10+, macOS 12+, Linux

Optional

Component         Benefit
NVIDIA GPU 6GB+   10-20x faster inference
Apple Silicon     Excellent with unified memory
VS Code           For IDE integration

No GPU? No problem. LocalDev works on CPU — responses take 30-90 seconds instead of 2-5 seconds, but the quality is identical. The VS Code extension shows an elapsed timer so you always know it's working.

Installation — One Command

The setup wizard handles everything: detecting your hardware, installing Ollama, choosing the right model, creating config, and wiring up VS Code.

1. Detect System
2. Add to PATH
3. Check Ollama
4. Select Model
5. Pull Model
6. Create Config
7. Test Inference
8. VS Code
Windows
1. Download the binary

Download localdev.exe from the releases page and save it anywhere (e.g. C:\tools\).

2. Install Ollama

If you don't already have Ollama, install it first:

PS> winget install Ollama.Ollama

Or download from https://ollama.com/download/windows

3. Run the setup wizard

Open PowerShell, navigate to where you saved the binary, and run:

PS> .\localdev.exe setup

The wizard detects your GPU, selects the right model, pulls it, creates config, adds itself to PATH, and installs the VS Code extension. Answer 2-3 questions and you're done.

4. Open a new terminal

After setup adds localdev to your PATH, open a new PowerShell window for the change to take effect.

Windows Defender. On first run, SmartScreen may show "Windows protected your PC." Click More info → Run anyway. This is normal for unsigned binaries.

macOS

1. Download and prepare

$ curl -L -o localdev https://github.com/neuralweaves/localdev/releases/latest/download/localdev-darwin-arm64
$ chmod +x localdev

Use localdev-darwin-amd64 for Intel Macs.

2. Install Ollama

$ brew install ollama

Or download from https://ollama.com/download/mac

3. Run the setup wizard

$ ./localdev setup

The wizard detects your Apple Silicon, selects an optimized model, and configures everything. It will offer to add localdev to your .zshrc.

Apple Silicon performance. M1/M2/M3 Macs run 7B models at 15-30 tokens/second using unified memory. The experience is very responsive.

Linux

1. Download and prepare

$ curl -L -o localdev https://github.com/neuralweaves/localdev/releases/latest/download/localdev-linux-amd64
$ chmod +x localdev

2. Install Ollama

$ curl -fsSL https://ollama.com/install.sh | sh

3. Run the setup wizard

$ ./localdev setup

The wizard detects NVIDIA (nvidia-smi) or AMD (ROCm) GPUs, selects the right model, and offers to add to .bashrc or symlink to /usr/local/bin.

NVIDIA drivers. Ensure you have CUDA-compatible drivers installed. Run nvidia-smi to verify. AMD ROCm is also supported.

Your First 5 Minutes

After setup completes, you have two ways to use LocalDev:

Option A: Terminal Chat

# Start the daemon (keep this terminal open)
$ localdev serve
✓ Daemon listening on 127.0.0.1:9741
✓ Ollama connected (1 models available)

# In another terminal — ask a question
$ localdev chat "write a hello world HTTP server in Go"

# Or start an interactive chat session
$ localdev chat

Option B: VS Code

1. Start the daemon: localdev serve

2. Open VS Code. Look for $(sparkle) LocalDev in the bottom-right status bar — it should show connected.

3. Open the LocalDev Studio panel in the right sidebar. Type a message and hit Send.

4. Try code actions: select some code, right-click, and choose LocalDev: Explain or LocalDev: Fix.

CPU inference note. If you're on CPU, the first request takes 30-90 seconds (model loading into RAM). The chat panel shows an elapsed timer — "Generating... 12s... 45s" — so you always know it's working. Subsequent requests are faster.

Model Recommendations

Hardware                    Recommended Model                     Speed    Quality
NVIDIA GPU 20GB+            qwen2.5-coder:32b-instruct-q4_K_M     Fast     Best
NVIDIA GPU 6-20GB           qwen2.5-coder:7b-instruct-q4_K_M      Fast     Good
Apple Silicon (16GB+)       qwen2.5-coder:7b-instruct-q4_K_M      Fast     Good
Apple Silicon (8GB)         qwen2.5-coder:3b-instruct-q4_K_M      Fast     Decent
NVIDIA GPU 2-6GB            qwen2.5-coder:1.5b-instruct-q4_K_M    Fast     Decent
CPU only (8GB+ RAM)         qwen2.5-coder:7b-instruct-q4_K_M      ~3 t/s   Good
CPU only (speed priority)   qwen2.5-coder:1.5b-instruct-q4_K_M    ~8 t/s   Decent

The setup wizard auto-selects the right model for your hardware. You can override with localdev setup --model-size=small|medium|large.
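
The wizard's selection logic is not published, but the table above implies a straightforward mapping from detected hardware to a model tag. A hypothetical sketch of that mapping in Python (the function name and hardware labels are made up for illustration; only the Ollama tags come from the table):

```python
# Illustrative only: a mapping from detected hardware to the recommended
# Ollama tags in the table above. The wizard's real detection logic is
# not published.

def pick_model(hardware: str, memory_gb: float) -> str:
    """Return a recommended Ollama tag for the given hardware.
    `memory_gb` is VRAM for GPUs, system RAM otherwise."""
    if hardware == "nvidia":
        if memory_gb >= 20:
            return "qwen2.5-coder:32b-instruct-q4_K_M"
        if memory_gb >= 6:
            return "qwen2.5-coder:7b-instruct-q4_K_M"
        return "qwen2.5-coder:1.5b-instruct-q4_K_M"
    if hardware == "apple":
        if memory_gb >= 16:
            return "qwen2.5-coder:7b-instruct-q4_K_M"
        return "qwen2.5-coder:3b-instruct-q4_K_M"
    # CPU-only: the "balanced" 7B tier; pass --model-size=small for 1.5B
    return "qwen2.5-coder:7b-instruct-q4_K_M"

print(pick_model("nvidia", 24))  # large GPU gets the 32B model
print(pick_model("cpu", 8))      # CPU defaults to the 7B model
```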

VS Code Features

Chat Sidebar

Ask questions about your code, request explanations, generate functions. Responses render with syntax-highlighted code blocks that have Copy and Insert buttons.

Inline Completions

Ghost text suggestions as you type, powered by fill-in-the-middle inference. Accepts with Tab. Debounced to avoid hammering during fast typing.
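
The debounce idea: each keystroke cancels the pending request and restarts a timer, so inference only fires after a quiet period (300 ms by default, see localdev.completion.debounceMs). The extension itself is TypeScript and not shown here; this is a minimal stand-alone sketch of the technique in Python:

```python
import threading

class Debouncer:
    """Run `fn` only after `delay_s` seconds pass with no new trigger() call."""
    def __init__(self, delay_s: float, fn):
        self.delay_s = delay_s
        self.fn = fn
        self._timer = None

    def trigger(self, *args):
        # A new keystroke cancels the pending call and restarts the countdown.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.delay_s, self.fn, args)
        self._timer.start()

# Usage: simulate rapid typing; only the final prefix reaches the "model".
results = []
d = Debouncer(0.05, lambda text: results.append(text))
for prefix in ["de", "def ", "def ma"]:
    d.trigger(prefix)
d._timer.join()   # wait for the surviving timer to fire
print(results)    # only the last prefix triggered a completion request
```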

Code Actions

Select code → right-click: Explain (what does this do?), Fix (fix bugs/errors), Verify (cross-model code review), Generate Tests.

Status Bar

$(sparkle) LocalDev = connected. $(circle-slash) LocalDev = daemon not running. Click to see available models. Auto-reconnects every 15 seconds.
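
The auto-reconnect behavior amounts to polling the daemon on a fixed interval and flipping the icon once a probe succeeds. A stand-alone sketch (the extension's real implementation is not shown; `probe` stands in for whatever health check it performs over HTTP):

```python
import time

def watch_daemon(probe, interval_s: float = 15, max_checks: int = 4):
    """Poll `probe()` every `interval_s` seconds; return the attempt number
    on which the daemon became reachable, or None if it never did.
    Illustrative stand-in for the extension's 15-second reconnect loop."""
    for attempt in range(1, max_checks + 1):
        if probe():
            return attempt
        time.sleep(interval_s)
    return None

# Simulate a daemon that comes up on the third poll:
state = {"n": 0}
def fake_probe():
    state["n"] += 1
    return state["n"] >= 3

print(watch_daemon(fake_probe, interval_s=0.01))  # -> 3
```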

Configuration

The setup wizard creates .localdev.yml in your project directory. Key settings you might want to tune:

# .localdev.yml — generated by localdev setup
daemon:
  host: 127.0.0.1
  port: 9741
ollama:
  host: http://127.0.0.1:11434
  timeout: 300             # seconds — increase for CPU
models:
  balanced:
    name: qwen2.5-coder
    ollama_tag: qwen2.5-coder:7b-instruct-q4_K_M
    tier: balanced
    tasks: [codegen, explain, testgen, review, docgen]
context:
  max_tokens: 2048         # context window for code
  include_git: true        # include git diff in context
  include_deps: true       # include imported files
  include_errors: true     # include compiler errors
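
On CPU-only machines, two of these settings are worth adjusting. An illustrative fragment (the keys match the generated config above; the values are suggestions, not defaults):

```yaml
# CPU-only tuning — illustrative values, not wizard defaults:
ollama:
  timeout: 600        # give slow CPU inference more headroom
context:
  max_tokens: 1024    # a smaller context speeds up prompt processing on CPU
```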

VS Code Settings

Setting                            Default    Description
localdev.daemon.host               127.0.0.1  Daemon hostname
localdev.daemon.port               9741       Daemon port
localdev.daemon.chatTimeoutMs      300000     Chat timeout (5 min) — increase for CPU
localdev.daemon.completeTimeoutMs  60000      Inline completion timeout (1 min)
localdev.completion.enabled        true       Enable/disable ghost text
localdev.completion.debounceMs     300        Delay before triggering completion
localdev.codeLens.enabled          true       Show CodeLens actions above functions

Troubleshooting

Daemon won't start

Check if Ollama is running: visit http://localhost:11434 in your browser. If it returns "Ollama is running", the issue is with the daemon. Check the output of localdev serve for error messages.
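
To check both services without a browser, a small stdlib-only Python sketch works too (the port numbers are the defaults from this guide):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 11434 = Ollama, 9741 = LocalDev daemon (defaults from this guide)
for name, port in [("ollama", 11434), ("localdev daemon", 9741)]:
    status = "up" if port_open("127.0.0.1", port) else "NOT running"
    print(f"{name:16s} port {port}: {status}")
```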

VS Code shows $(circle-slash) disconnected

Make sure the daemon is running (localdev serve) in a separate terminal. The extension auto-reconnects every 15 seconds. Check Output → LocalDev Studio for connection logs.

Chat responses are empty

Open VS Code Developer Tools (Ctrl+Shift+I) and check the Console for JavaScript errors. If you see errors, reinstall the extension: code --install-extension localdev-studio-0.7.1.vsix

Very slow responses on CPU

This is expected. CPU inference on a 7B model runs at ~3 tokens/second. Options: switch to the 1.5B model (localdev setup --model-size=small) for ~8 t/s, or use a machine with a dedicated GPU.
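
The trade-off is easy to estimate: generation time is roughly response length divided by throughput. A quick sketch using the speeds quoted above (the 300-token response length is illustrative):

```python
def eta_seconds(response_tokens: int, tokens_per_second: float) -> float:
    """Rough generation time, ignoring prompt processing and model load."""
    return response_tokens / tokens_per_second

# Compare the two CPU tiers at the throughputs listed in this guide:
for model, tps in [("7B on CPU", 3), ("1.5B on CPU", 8)]:
    print(f"{model}: ~{eta_seconds(300, tps):.0f}s for a 300-token answer")
```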

"localdev" is not recognized

The binary isn't on your PATH. Either re-run localdev setup from the binary's directory, or add its location manually. On Windows: setx PATH "%PATH%;C:\path\to\folder" then open a new terminal.

CLI Reference

Command                  Description
localdev setup           One-time setup wizard (detect hardware, install model, configure)
localdev serve           Start the daemon (HTTP API + Ollama bridge)
localdev chat            Interactive terminal chat session
localdev chat "prompt"   Single-shot chat (answer and exit)
localdev init            Create .localdev.yml in current directory
localdev models          List configured and available models
localdev bench           Benchmark model inference speed

Setup flags. --skip-vscode skips VS Code extension install. --model-size=small|medium|large forces a specific model tier. --force re-runs setup even if already configured.

Providing Feedback

This is a beta release. We'd love to hear what works, what breaks, and what's missing. File issues at the GitHub repository or reach out directly.

v0.7.1 · 8-step setup wizard · 0 bytes sent to the cloud