
Local LLM Chat

A private AI chatbot, summarizer, and rewrite assistant. It uses WebGPU via WebLLM when available, with a lightweight Xenova summarizer as a fallback for any browser. Runs 100% offline after the model is cached.

Hello! I am a private AI. My brain is being loaded directly into your browser's GPU. I cannot connect to the internet, and nothing you type here leaves your device.

Preparing Gemma 2 (2B) - Excellent Reasoning

Checking WebGPU access...

Downloading once, then cached locally
Runs privately in this browser

Large language models can take several minutes to load the first time. After the download finishes, the model stays cached in this browser.

This tool runs entirely in your browser. No data is sent anywhere.