System Architecture

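SEEV is built on a strict local-first constraint. Every layer in the stack executes on your device, and the network is used only when you invoke an optional feature such as web search or an external provider.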

Layered System Architecture

Presentation Layer
User-facing interface components rendered in WKWebView
Chat Interface · Sidebar Navigation · Composer Bar · Control Center · Voice Input UI · Theme Engine

Application Layer
Core business logic, content processing, and data management
Markdown Renderer · Memory System · RAG Pipeline · Chat Manager · Web Search Engine · Context Assembler · File Processor

AI Runtime Layer
On-device model inference through the bundled local runtime
Local Runtime · LFM 2.5 1.2B Flash · LFM2-VL-3B · WhisperKit · Tokenizer

Storage Layer
All persistence is local; nothing leaves your device
IndexedDB · localStorage · Cache Storage · Local File System

Legend: Interface · Logic · AI Inference · Persistence

AI Models and Runtime

SEEV uses a bundled fast local model, an optional deeper reasoning model, and Smart Hybrid routing across them, with external providers remaining optional.

LFM 2.5 — 1.2B Flash

Fastest bundled local responses

Parameters: 1.2B · Default preset: Flash · Runtime: Local · Availability: Instant

LFM2-VL-3B — Deep Reasoning

Optional local download for deeper reasoning and stronger attachment understanding

Parameters: 3B · Multimodal input: Uploads · Use mode: Routed · Install mode: Optional

Smart Hybrid — Routed

Uses LFM 2.5 1.2B for speed and LFM2-VL-3B for reasoning or uploads

Fast path: 1.2B · Reasoning path: 3B · Routing: Auto · Default availability: 1.2B now
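
The routing rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not SEEV's actual router: the type names, the `wantsDeepReasoning` flag, and the model identifiers are hypothetical, and the fallback to the 1.2B model when the 3B model is not installed is inferred from the "1.2B now" default availability.

```typescript
// Hypothetical sketch of Smart Hybrid routing; all names are illustrative.
type ModelId = "lfm-2.5-1.2b" | "lfm2-vl-3b";

interface RouteRequest {
  prompt: string;
  attachmentCount: number;     // uploaded files or images attached to the prompt
  wantsDeepReasoning: boolean; // assumed to be set by the UI or a heuristic
}

interface RuntimeState {
  deepModelInstalled: boolean; // the 3B model is an optional download
}

function routeModel(req: RouteRequest, state: RuntimeState): ModelId {
  const needsDeepModel = req.attachmentCount > 0 || req.wantsDeepReasoning;
  // Use the 3B path only when it is both needed and installed;
  // otherwise stay on the bundled 1.2B fast path.
  if (needsDeepModel && state.deepModelInstalled) {
    return "lfm2-vl-3b";
  }
  return "lfm-2.5-1.2b";
}
```

Keeping the decision in one pure function makes the fallback explicit: a request with uploads still gets an answer from the bundled model if the optional download has not happened yet.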

Model Inference Pipeline

The complete journey of a prompt through SEEV's on-device AI engine, from input to streamed output. Each response is shaped by the current local model lineup, saved memory, uploaded document text, and conversation history.

1. User Input: Prompt entered via text or voice
2. Context Assembly: System prompt, saved memory, and local document context merged
3. Tokenization: Text converted to token IDs by model tokenizer
4. Local Model Runtime: Forward pass through the selected local model runtime
5. Autoregressive Decode: Token-by-token generation with sampling
6. Markdown Render: Response parsed and rendered with syntax highlighting
7. Local Persist: Conversation saved to IndexedDB

Legend: Pre/Post Processing · Local AI inference · Optional network only when invoked
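
To make the Context Assembly step more concrete, here is a minimal sketch of how the pieces might be merged before tokenization. Everything in it is an assumption for illustration: the `ContextParts` shape, the section labels, the four-characters-per-token estimate, and the policy of dropping the oldest conversation turns first are not taken from SEEV itself.

```typescript
// Hypothetical sketch of context assembly; names and policy are illustrative.
interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

interface ContextParts {
  systemPrompt: string;
  memories: string[];       // saved memory entries
  documentChunks: string[]; // text extracted from uploaded documents
  history: ChatTurn[];      // earlier turns, oldest first
  userInput: string;
}

// Rough size estimate (assumed ~4 characters per token); a real
// pipeline would count tokens with the model's own tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function assembleContext(parts: ContextParts, maxTokens: number): string {
  const fixed: string[] = [parts.systemPrompt];
  if (parts.memories.length > 0) {
    fixed.push("Saved memory:\n" + parts.memories.join("\n"));
  }
  if (parts.documentChunks.length > 0) {
    fixed.push("Document context:\n" + parts.documentChunks.join("\n"));
  }
  const tail = "user: " + parts.userInput;

  let budget = maxTokens - estimateTokens(fixed.join("\n\n")) - estimateTokens(tail);

  // Walk the history from newest to oldest and keep turns while they fit.
  const kept: string[] = [];
  for (const turn of [...parts.history].reverse()) {
    const line = turn.role + ": " + turn.content;
    const cost = estimateTokens(line);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(line); // restore chronological order
  }
  return [...fixed, ...kept, tail].join("\n\n");
}
```

The system prompt, memory, document text, and the new prompt are always included in this sketch; only older conversation turns are trimmed when the budget runs out.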

Why a Bundled Runtime

SEEV packages the interface and local inference stack together so the app can run private on-device workflows without depending on a remote model service. The result is a local-first setup: the bundled 1.2B model is available as soon as the app is installed, and you can expand it by installing the optional 3B model.

  • Local execution — core inference stays on your machine
  • App-bundled stack — UI and local backend ship together
  • No account setup — bundled local use works without cloud onboarding
  • Optional expansion — install the 3B reasoning model only when you want it

Technology Stack

Every layer of SEEV is built with proven, high-performance technologies.

SwiftUI Shell

Native macOS application shell providing lightweight window management, menu bar integration, and system-level controls.

WKWebView Engine

High-performance web rendering engine for the app interface, with modern CSS and JavaScript APIs inside the macOS shell.

Local Model Paths

SEEV ships with a bundled fast model, supports an optional local 3B reasoning install, and routes between them with Smart Hybrid behavior.

WhisperKit (Local)

WhisperKit runs on-device for speech-to-text transcription. Audio is processed locally and never transmitted.

Local Data Store

Workspace state, conversations, memories, and settings stay on-device in localStorage and IndexedDB, matching the app's private-first design.

Background Tasks

Parsing, search, and response rendering are handled without blocking the interface, keeping the workspace responsive while the app works locally.

Marked + Highlight.js

Rich markdown rendering with syntax highlighting and copyable code blocks.

Generation Presets

Curated inference configurations for different use cases, adjustable from the Control Center.

Precision

Low temperature, high accuracy. Ideal for factual queries, code generation, and technical tasks.

temp: 0.3 · top-p: 0.85

Creative

Higher temperature for brainstorming, writing, and exploratory conversation.

temp: 0.9 · top-p: 0.95

Balanced

A middle ground for general conversations balancing accuracy and fluency.

temp: 0.6 · top-p: 0.90

Quick

Faster, shorter responses. Suitable for rapid lookups and brief Q&A.

temp: 0.4 · max: 512 tokens
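
As a rough illustration of what these settings do during the Autoregressive Decode step, the sketch below applies temperature scaling and top-p (nucleus) filtering to a vector of logits and draws one token. The preset values are copied from the cards above; the `Preset` shape, the `sampleToken` function, and the choice to treat a missing top-p as 1.0 (no filtering) for the Quick preset are assumptions, not SEEV's actual implementation.

```typescript
// Hypothetical sketch of preset-driven sampling; names are illustrative.
interface Preset {
  temperature: number;
  topP?: number;      // nucleus threshold; undefined is treated as 1.0 here
  maxTokens?: number; // response length cap, enforced by the decode loop
}

type PresetName = "precision" | "creative" | "balanced" | "quick";

const PRESETS: Record<PresetName, Preset> = {
  precision: { temperature: 0.3, topP: 0.85 },
  creative: { temperature: 0.9, topP: 0.95 },
  balanced: { temperature: 0.6, topP: 0.9 },
  quick: { temperature: 0.4, maxTokens: 512 },
};

function sampleToken(
  logits: number[],
  preset: Preset,
  rand: () => number = Math.random,
): number {
  if (logits.length === 0) throw new Error("logits must not be empty");

  // Temperature-scaled softmax (subtract the max for numerical stability).
  const scaled = logits.map((l) => l / preset.temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  const probs = exps.map((e, id) => ({ id, p: e / total }));

  // Top-p filter: keep the most likely tokens until their mass reaches topP.
  probs.sort((a, b) => b.p - a.p);
  const topP = preset.topP ?? 1;
  const nucleus: { id: number; p: number }[] = [];
  let mass = 0;
  for (const entry of probs) {
    nucleus.push(entry);
    mass += entry.p;
    if (mass >= topP) break;
  }

  // Draw from the nucleus, scaling the random number by the kept mass.
  let r = rand() * mass;
  let chosen = -1;
  for (const entry of nucleus) {
    chosen = entry.id;
    r -= entry.p;
    if (r <= 0) break;
  }
  return chosen;
}
```

With logits such as `[2.0, 1.0, 0.1]`, the Precision preset concentrates almost all probability on the first token, so the nucleus contains only that token, while the Creative preset keeps all three candidates in play. Passing `rand` as a parameter keeps the function deterministic under test.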

Performance Specifications

Engineered for efficient local inference on modern Mac hardware.

Local cache footprint: ~1.2 GB · Fast bundled model: 1.2B · Deep reasoning model: 3B · Local runtime: WASM

Advanced Technology. On Your Terms.

Experience the next generation of local AI. Download SEEV and put the power of private intelligence in your hands.

Installation Instructions: After downloading the DMG, move SEEV.app to your Applications folder. Before opening the app for the first time, you must open Terminal and run sudo xattr -cr /Applications/SEEV.app, then enter your password when prompted. This clears the app's extended attributes, including the macOS quarantine flag, so the application can launch.